Implementation of Open Archival Information System

Masaryk University
Faculty of Informatics
Master's Thesis
Šimon Hochla
Brno, Fall 2017

This is where a copy of the official signed thesis assignment and a copy of the Statement of an Author is located in the printed version of the document.

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Šimon Hochla

Advisor: doc. RNDr. Tomáš Pitner, Ph.D.

Acknowledgements

I would like to express my gratitude to my supervisor doc. Tomáš Pitner for his responsiveness and valuable suggestions. I would also like to thank my colleagues from InQool a.s. for their assistance with technologies and help in solving practical issues. Last but not least, this thesis would not have been possible without the continual support of my family.

Abstract

Digitization of information preserved in the archives of memory institutions is a highly topical subject. The final destination of the digitized content is an electronic archive that stores the data securely and respects the guidelines for storing archival information. The OAIS reference model is a well-accepted standard that serves as a starting point when building such an archive. This thesis is dedicated to the analysis of the Ingest functional entity of the OAIS model. In the theoretical part, it contains a design of a part of the archival solution ARCLib, developed for the needs of memory institutions in the Czech Republic. In the practical part, the thesis provides a reference implementation by defining prototypes that capture the key functionality of Ingest.

Keywords

OAIS, archival system, digital libraries, ISO 14721:2012, Ingest, ARCLib, Java

Contents

1 Introduction
2 OAIS, Ingest functional entity
Origin of OAIS
Structure of OAIS archive
OAIS environment
OAIS functional model
OAIS data model
Structure of AIP
Content Information
Preservation description information
Packaging information
Descriptive information
Ingest functional entity
Receive Submission
Quality Assurance
Generate AIP function
Generate Descriptive Information
Coordinate Updates
Analysis
ARCLib project
Functional requirements for Ingest
Receive Submission
Quality Assurance
Generate AIP
Generate Descriptive Information
Coordinate Updates
Technical requirements
Non-functional requirements
Metadata formats
Design
Architecture of system
Domain model
Management of Ingest processes
4.3.1 Batch management
Implementation of prototypes
Technologies
Apache Maven
Spring Framework
Spring Boot
Camunda BPM
JBoss Hibernate
Apache ActiveMQ Artemis
Management of Ingest processes
Validation of SIP
Metadata extraction and ARCLib AIP XML generation
SIP Profile
Generation of elements in ARCLib AIP XML
Task scheduling
Antivirus control
Fixity check
File format analysis
Conclusion
A Source code of prototypes
B Validation profile
C Task scheduling
Bibliography

List of Figures

2.1 OAIS functional model, adapted from [4]
2.2 Structure of AIP package, adapted from [4]
2.3 Functions of Ingest entity, reproduced from [1]
3.1 SIP package to AIP package transformation
4.1 Architecture of the ARCLib system
Domain model of the ARCLib system
SIP package processing state diagram
Batch processing state diagram
Batch processing sequence diagram
Task scheduling class diagram
Fixity check class diagram

1 Introduction

Information technologies have revolutionized the way we access information. The Internet has truly made the world smaller by making information that was previously difficult to access easily available to the masses. However, there are some exceptions, and one of them is information stored deep inside archives that preserve data in its physical form. Libraries, museums, archives and other memory institutions have realized that, to keep pace with the digitized world, they need to make their holdings available to a wider range of users than the people physically visiting their archives. Consequently, an ongoing process of digitization has started in all of these places. The final destination of the digitized content is an electronic archive in which the data is securely preserved and available to future generations.

This thesis is dedicated to the analysis of a reference model for building such an archive, the OAIS model, and provides a prototype implementation of a part of this model. The practical part of this thesis is used in the ARCLib solution owned by the Library of the Czech Academy of Sciences. The solution is partially developed by InQool a.s., an industry partner of Masaryk University and the company where I worked on the practical part.

Designing an archival system is not an easy or straightforward process. Organizations responsible for building an archive need to guarantee two kinds of security: physical security and logical security. Physical security means that the information is safely stored and protected from external threats. Logical security is a more abstract term; it consists of measures and activities which ensure that the information remains usable and understandable over a long period of time. The OAIS standard provides a model of an archive that ensures both kinds of security by defining a functional model and a data model.

The functional model consists of several functional entities, or modules, ranging from saving the data to the archival system, through management and administration, to data access. This thesis elaborates thoroughly on one of these modules, the module called Ingest.

The aim of this work is to analyze the OAIS standard with a focus on its Ingest functional entity and to elaborate a list of guidelines, or demonstrations of design decisions, for the practical implementation of Ingest. This is done in the theoretical part. In the practical part, the goal is to define a reference implementation of the key use cases of Ingest by elaborating implementation prototypes.

The second chapter discusses the OAIS model and its Ingest entity. The third chapter is devoted to the solution called ARCLib. It includes the functional and non-functional requirements for the implementation specified by the owner of ARCLib, the Library of the Czech Academy of Sciences. The fourth chapter contains a proposed architecture of the system that implements OAIS Ingest and describes the designed domain model. The last chapter describes the implementation of the prototypes, discusses the technologies used, and mentions the problems encountered and how they were solved.

2 OAIS, Ingest functional entity

2.1 Origin of OAIS

OAIS stands for Open Archival Information System. It is the name of a reference model [1] that currently serves as a well-accepted starting point for organizations interested in building long term preservation systems [2]. It was created in response to the need for standardization that arose when, in the process of digitization, many archives turned out to have shared interests. OAIS is a conceptual framework that aims for a high degree of flexibility and abstraction; it does not define any specific implementation. The first edition approved by the International Organization for Standardization (ISO) dates back to January 2002 as ISO Standard 14721; in 2012 it underwent revisions and updates and was released as ISO Standard 14721:2012 [3]. The second edition revised what had been introduced in the first version rather than bringing any major updates [4]. The term open in OAIS means that the negotiation during the development process was open to any interested parties.

Long term preservation of data is not as simple as storing the data in a secure place. Without information about how the data should be interpreted, the data can become impossible to understand when retrieved after a long period of time. The context of the data is as important as the data itself. Because of that, not only the data needs to be stored, but also the respective metadata, with all the information necessary for making the original information useful. A major purpose of the OAIS model is to delimit what exactly needs to be stored.

If an archive wants to be declared OAIS compliant, it must meet the six responsibilities defined in the OAIS standard [1, p. 38]. There is no strict definition of what is and what is not an OAIS archive because of the high level of abstraction that comes with the flexibility of the model. Most archives that declare themselves OAIS archives do so without any external audit. If nothing else, the reference model establishes a common language used by the people cooperating on building archives for long term preservation.

2.2 Structure of OAIS archive

The reference model elaborates three perspectives of an OAIS archive. The first one is the OAIS environment, which describes the surroundings of the archive and the external entities operating with the system. The second one is the OAIS functional model, which specifies the elementary functional components of the OAIS archive and how they cooperate. The last one is the OAIS data model, which covers the data packages stored in the archive. It describes their structure, their contents and the meaning of all the necessary metadata.

OAIS environment

An OAIS archive operates in an environment consisting of three types of external entities: Producers, Consumers and Management [1, p. 28].

Producers are the originators of the data that is transferred to the archive. This transfer is governed by a Submission Agreement that specifies the conditions the data needs to fulfill before it is saved to the archive.

Consumers are the entities accessing the materials stored in the archive. The reference model defines a special type of Consumers, the Designated Community (DC). They are the primary users intended to read the information. The concept of the DC is one of the most important parts of the OAIS model. Making the information available and understandable for the DC is the major responsibility of an OAIS archive. The scope of the DC determines the contents of the archive: the larger the scope, the more metadata needs to be stored with the data to make the data understandable.

Management is responsible for establishing and revising the policy of the OAIS activities. These are usually the owners of the archive. They are distinct from the administrators responsible for the daily routine management of the archive, who operate from inside the OAIS system.

OAIS functional model

The functional model of OAIS [1, p. 44] shows the internal components of an OAIS system and the relationships between them.

Figure 2.1: OAIS functional model, adapted from [4]

Ingest - accepts information packages from Producers and prepares them for storage in Archival Storage. This includes package transformations, validations, quality checks and the addition of new necessary information.

Data management - provides access to and control over the metadata stored in a database and used when looking up information packages. Moreover, it manages the internal administrative data of the OAIS, such as performance data or access statistics.

Administration - is responsible for the daily management of an OAIS archive. This includes the interactions with Producers, Consumers and Management.

Access - enables locating the archived content and delivering it to the Consumer in a suitable format. It provides interfaces for lookup of and access to the data as well as access control mechanisms.

Archival Storage - is responsible for transferring information packages to persistent storage. More specifically, it provides functions for storage, maintenance and retrieval of information. Besides that, it supports error checking mechanisms and disaster recovery. There is no direct interface to Archival Storage from outside the OAIS archive. Information packages are saved through the Ingest entity and can be accessed only through the Access entity.

Preservation planning - monitors the environment of an OAIS archive, checking for changes in the expectations of the Designated Community and for the evolution of access and storage technologies, to ensure that the data stored in the archive remains accessible to and understandable by the Designated Community.

OAIS data model

OAIS defines the concept of information packages, the basic archival units used by the storage and access functions. The data model specifies the types and structure of these information packages. An information package consists of the object that is the focus of preservation and all the metadata that help the object to be securely stored, accessible, searchable, understandable and presented correctly. From the point of view of the package lifecycle, there are three types of information packages: SIP, AIP and DIP.

SIP - The Submission Information Package is the package in the state in which the Producer submits it to the OAIS archive. This is the initial state, and the package contains only a minimum of metadata. The exact structure of the SIP must be negotiated beforehand between the Producer and the Management of the OAIS archive.

AIP - The Archival Information Package is the version of the package that is stored in the OAIS archive. In comparison with the SIP, it has a complete set of metadata associated with the stored object. The new metadata are generated during the Ingest process. Logically, the object and its metadata act in the system as a single object; physically, they can be stored separately and bound together later.

DIP - The third and final form of the information package is the Dissemination Information Package. This is the version delivered to the Consumer as a result of the access functions. It can consist of one or more AIP packages transformed to a form acceptable to the Consumer. A DIP does not have to include all the metadata preserved in an AIP package.

2.3 Structure of AIP

The OAIS model defines the structure of an AIP package, illustrated in Figure 2.2.

Figure 2.2: Structure of AIP package, adapted from [4]

Content Information

The building of an AIP package begins with the Content Data Object. This is the object that is the target of preservation, and it can take the form of either a digital object or a physical object. To make it understandable for the Designated Community, it comes together with the Representation Information. Representation Information describes the appropriate presentation of the Content Data Object, the required hardware and software equipment, and also the semantics of the object, that is, how the object should be interpreted correctly. Taken together, the Content Data Object and the Representation Information make up the Content Information.

2.3.2 Preservation description information

The next mandatory part of an AIP is the Preservation Description Information. It contains the additional metadata required for the long term preservation of the information package and supports the OAIS processes. It consists of five parts:

- Reference information uniquely identifies the Content Information within the scope of the OAIS system and possibly contains identifiers of the Content Information used by external systems that operate outside the archive.
- Context information describes the relationships of the Content Information to other Content Information objects.
- Provenance information contains the history of the Content Information; it describes its origin and the alterations made during its lifetime.
- Fixity information consists of the validation keys for the integrity¹ checks of the Content Information.
- Access rights information contains the rights and restrictions related to the Content Information, both for preservation in the archive and for access by the Consumer.

Packaging information

Packaging information binds the Content Information and the Preservation Description Information into a single logical package. It can take the form of a set of file paths or of a detailed packaging schema.

Descriptive information

Descriptive Information contains the fields used in the search query at the moment of AIP retrieval. It can take the form of indexes or of any data supporting the finding aids.

¹ Integrity is the assurance that information has not been modified by unauthorized subjects.
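The role of Fixity information can be illustrated with a small sketch. It assumes that the validation key is a SHA-256 digest recorded when the package is ingested; OAIS itself does not prescribe an algorithm, and the class and method names below are hypothetical, not taken from ARCLib.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper illustrating a fixity check; not part of OAIS or ARCLib. */
final class FixitySketch {

    /** Returns the SHA-256 digest of the content as a lowercase hex string. */
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True when the content still matches the digest recorded at ingest time. */
    static boolean isIntact(byte[] content, String recordedDigest) {
        return sha256Hex(content).equalsIgnoreCase(recordedDigest);
    }

    public static void main(String[] args) {
        byte[] original = "archived content".getBytes(StandardCharsets.UTF_8);
        String recorded = sha256Hex(original); // assumed to be stored as Fixity information
        byte[] altered = "archived c0ntent".getBytes(StandardCharsets.UTF_8);
        System.out.println(isIntact(original, recorded)); // true
        System.out.println(isIntact(altered, recorded));  // false
    }
}
```

The digest is computed once and stored with the package; any later mismatch signals that the Content Information has changed since it was archived.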

2.4 Ingest functional entity

The Ingest functional entity provides the following functions [1, p. 48]:

Figure 2.3: Functions of Ingest entity, reproduced from [1]

Receive Submission

The Receive Submission function represents the infrastructure for the transfer of a SIP from the Producer to the archival system. The function returns a confirmation to the Producer after a successful SIP receipt, or a request for resubmission in case of failure.

Quality Assurance

The Quality Assurance function validates that the SIP and all its associated files have been successfully transferred to the temporary storage area. For digital objects this can include integrity checks.

Generate AIP function

The Generate AIP function transforms one or more SIPs into one or more AIPs according to the rules given by the archival system, so that the resulting AIP meets the standards of the archive. The transformation consists of file format conversions, the addition of new Representation Information, or changes to the structure of the Content Information of the SIPs.

Generate Descriptive Information

The Generate Descriptive Information function supplements the generated AIP with the Descriptive Information. The Descriptive Information is extracted from the contents of the SIP and complemented with information collected from other sources.

Coordinate Updates

The Coordinate Updates function transfers the AIP to Archival Storage and the Descriptive Information to Data Management. A successful transfer of the AIP to Archival Storage is followed by adding the respective storage identification to the Descriptive Information. When only the Descriptive Information is updated, the communication with Archival Storage can be omitted.
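The two paths of Coordinate Updates described above can be sketched as follows. This is a minimal illustration, assuming in-memory maps as stand-ins for Archival Storage and Data Management; the class, the method names and the `storageId` key are hypothetical and not defined by OAIS.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of Coordinate Updates; the maps stand in for real OAIS entities. */
final class CoordinateUpdatesSketch {

    private final Map<String, byte[]> archivalStorage = new HashMap<>();
    private final Map<String, Map<String, String>> dataManagement = new HashMap<>();

    /** Stores a new AIP first, then records its storage id in the Descriptive Information. */
    String storeNewAip(String aipId, byte[] aip, Map<String, String> descriptiveInfo) {
        String storageId = UUID.randomUUID().toString();
        archivalStorage.put(storageId, aip);
        Map<String, String> record = new HashMap<>(descriptiveInfo);
        record.put("storageId", storageId); // added only after the transfer succeeded
        dataManagement.put(aipId, record);
        return storageId;
    }

    /** Updates only the Descriptive Information; Archival Storage is not contacted. */
    void updateDescriptiveInfo(String aipId, Map<String, String> changes) {
        Map<String, String> record = dataManagement.get(aipId);
        if (record == null) {
            throw new IllegalArgumentException("Unknown AIP: " + aipId);
        }
        record.putAll(changes);
    }

    Map<String, String> descriptiveInfo(String aipId) {
        return dataManagement.get(aipId);
    }

    int storedAipCount() {
        return archivalStorage.size();
    }
}
```

Keeping the storage identifier inside the Descriptive Information is what later lets a search result be resolved to the stored package.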

3 Analysis

3.1 ARCLib project

The ARCLib project [5] is a project of the Digital Library of the Czech Academy of Sciences whose aim is to create an open-source system for the preservation of structured documents and their metadata for the needs of Czech memory institutions. The project was motivated by the current situation in the Czech Republic, where no complete archival solutions are available except for custom tailored commercial solutions for big institutions like the National Library of the Czech Republic. The target users of ARCLib are regional libraries, scientific institutions, museums, galleries and archives. The system is designed in accordance with ČSN ISO (OAIS) and will implement all the functional modules of OAIS. The project is expected to be deployed in 2020 in pilot operation at the Czech Academy of Sciences and the National Museum Library [5].

The project analysis was elaborated by a team of experts with domain knowledge of long term preservation systems. In the initial phase, the team considered adopting one of the existing open source solutions and adjusting it to the needs of Czech memory institutions. Unfortunately, none of these solutions implements all parts of the OAIS model. The open source system Archivematica [6] is still under heavy development, and at present it only provides functions for the creation of and access to information packages. Similarly, the system Roda, used in the National digital archive of Portugal, lacks the Preservation planning module, and its community is not very active [7]. Subsequently, it was decided to use Archivematica solely for the creation of SIP packages and to supply the other missing functions with ARCLib. Moreover, for interoperability with other systems, ARCLib will support the SIP packages created in the system ProArc [8], used in the National Library of the Czech Republic.

The InQool company is responsible for the implementation analysis and design of the ARCLib system on the basis of the project analysis.

The project analysis [9] was finished earlier this year and is not planned to change during the implementation process. As this thesis is devoted to the OAIS Ingest functional entity, in this chapter I summarize only the project requirements related to Ingest.

3.2 Functional requirements for Ingest

The project analysis consists of a list of system requirements. For better comprehension, I have grouped the requirements according to the functions of the Ingest entity described in section 2.4.

Receive Submission

In this function, ARCLib receives SIP packages, saves them to the temporary storage and scans their contents for viruses with an antivirus check. The input packages are created by an external application like Archivematica or ProArc. Conversion of unprocessed input data to the SIP format is not supported. If the producer owns only the scanned data and metadata, it is necessary to use an external application for the SIP package creation. Prior to the SIP ingestion², the producer needs to provide the system with a corresponding Validation profile and Ingest workflow profile, unless they have already been defined.

Validation profile - a human configurable structured file that specifies the process of SIP validation performed in the Quality Assurance function

Ingest workflow profile - a human configurable structured file that controls the process of ingestion itself

² The term ingestion refers to the act of receiving a SIP from the Producer, followed by the processes performed in the Ingest entity.

Quality Assurance

Quality Assurance consists of two phases. The first one is performed in production systems such as ProArc or Archivematica and involves a complex identification of all formats and extraction of metadata. ARCLib receives the SIP packages in an already prepared state. The second phase complements the first one and is performed by ARCLib. It consists of:

- validation of the SIP contents and structure according to the agreement saved in the Validation profile
- validation of the SIP completeness by checking the fixity
- checks of the existence and completeness of all descriptive and preservation metadata
- control of the technical metadata extracted in the first phase for individual objects

Generate AIP

Instead of transforming the SIP package into an AIP package, ARCLib generates the resulting AIP package as two objects: the original SIP package and an associated ARCLib AIP XML. ARCLib AIP XML is a metadata structure specific to ARCLib. It contains information from two sources: one part is extracted from the SIP package, the other part is generated by the system. From the SIP, ARCLib extracts some descriptive, technical and administrative metadata. The part generated by the system includes information about the Producer, the Ingest workflow profile and Validation profile used, audit information and the result of the file format identification.

Figure 3.1: SIP package to AIP package transformation

ARCLib AIP XML

The schema of ARCLib AIP XML is based on the METS data standard, which serves as a building block for the metadata definition and makes it possible to incorporate other metadata standards such as Dublin Core, PREMIS or MODS. These formats are described in section 3.5. Moreover, ARCLib AIP XML defines its own metadata structure, based on PREMIS, for storing the part of the provenance metadata that cannot be captured by the standard schemas.

Table 3.1: Structure of ARCLib AIP XML

Section | Description
Root element and header | identifiers, originator and producer of the package, information about its creation and possible modification, values generated by ARCLib and extracted from SIP
Descriptive metadata | compulsory nested metadata record in format Dublin Core, values extracted from the SIP, possibly extended by an additional record defined in the scope of ARCLib
Administrative metadata: Technical metadata | technical parameters of the object, partly created by ARCLib, partly extracted from the SIP, related to the whole package and particular files
Administrative metadata: Provenance metadata | events in the history of the object and checks in the process of Ingest, partially generated by ARCLib, partially extracted from the SIP, metadata standard PREMIS, related to the whole package and to the particular files
Administrative metadata: Rights information | rights related to the digital object, extracted from the SIP, not compulsory
Administrative metadata: Additional bibliographic data | extracted from the SIP, not compulsory, possibly in MODS format
Links to files | links to files stored in the AIP, generated by ARCLib from the SIP contents
Structural map of AIP | structural map of AIP contents, generated by ARCLib from the SIP contents

Another structure that plays a key role in the AIP package generation in ARCLib is the SIP profile.

SIP profile

The SIP profile is a structured file that contains rules for mapping the compulsory and non-compulsory metadata contained in the original SIP to the structure of ARCLib AIP XML. The profile is specific to a set of packages based on the type of SIP, the production system and additional parameters coming from the agreement between the producer of the SIP and the management of the ARCLib system (Submission agreement). It is necessary to make the SIP profiles available to all users of the system and at the same time keep them archived both in the archival system and externally.

Generate Descriptive Information

ARCLib extracts the descriptive metadata from the metadata of the SIP package and saves them to the ARCLib AIP XML and to a database.

Coordinate Updates

ARCLib transfers the resulting AIP package to Archival storage and stores the location of the AIP in the database. The associated ARCLib AIP XML can be saved either together with the AIP or separately and associated later on demand. ARCLib supports versioning of AIP. There are two options for versioning:

1. Versioning of just the ARCLib AIP XML. In this option the package content stays the same and only the ARCLib AIP XML is updated.
2. Versioning of all the AIP contents. Here the system updates the ARCLib AIP XML as well as the package content.

The original version of the AIP always remains stored in the system, and the new version is connected to the original version using metadata.
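The two versioning options can be sketched as an append-only list of versions, each pointing back to its predecessor. This is only an illustration under stated assumptions: the classes, the fields and the use of plain strings for the package content and the ARCLib AIP XML are hypothetical and do not reflect the ARCLib domain model.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of AIP versioning; older versions are never modified or removed. */
final class AipVersioningSketch {

    /** One immutable version of an AIP. */
    static final class Version {
        final int number;
        final String packageContent;
        final String aipXml;
        final Integer previousNumber; // link to the preceding version, null for the original

        Version(int number, String packageContent, String aipXml, Integer previousNumber) {
            this.number = number;
            this.packageContent = packageContent;
            this.aipXml = aipXml;
            this.previousNumber = previousNumber;
        }
    }

    private final List<Version> versions = new ArrayList<>();

    AipVersioningSketch(String packageContent, String aipXml) {
        versions.add(new Version(1, packageContent, aipXml, null));
    }

    Version latest() {
        return versions.get(versions.size() - 1);
    }

    Version get(int number) {
        return versions.get(number - 1);
    }

    /** Option 1: only the ARCLib AIP XML changes, the package content is carried over. */
    Version newXmlVersion(String newAipXml) {
        return append(latest().packageContent, newAipXml);
    }

    /** Option 2: both the package content and the ARCLib AIP XML change. */
    Version newFullVersion(String newPackageContent, String newAipXml) {
        return append(newPackageContent, newAipXml);
    }

    private Version append(String packageContent, String aipXml) {
        Version previous = latest();
        Version next = new Version(previous.number + 1, packageContent, aipXml, previous.number);
        versions.add(next);
        return next;
    }
}
```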

3.3 Technical requirements

Processing of a SIP package is controlled by a BPM process³:

- A BPM process consists of a sequence of steps.
- Execution of the BPM process is driven by the Ingest workflow profile.
- Steps are realized by scripts; scripts can call external tools running on the OS.
- Execution of the BPM process is recorded in a log and in the database; part of the log becomes part of the AIP.
- At the end of the BPM process execution, the producer can be informed of the result by an e-mail notification.

Management of BPM processes:

- ARCLib processes multiple BPM workflows for multiple SIP packages in parallel.
- BPM tasks waiting to be executed are stored in a queue so that the system resources are not overloaded.
- A new BPM workflow can be activated:
  - by an administrator while the system is running (a manual launch),
  - based on a time schedule (scheduling can be suspended and started again at any time),
  - when a specified location (filesystem, external FTP/NFS location) contains a new unprocessed SIP package.
- Tasks can be executed in batches or individually.
- After a system crash in the middle of an Ingest process, followed by a system start-up, the unfinished Ingest process can be rolled back or run again.

³ A BPM process is an expression of a business process using a diagram that is executable by a machine yet easy to understand for everyone.
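The requirement that workflows run in parallel while waiting tasks are queued can be sketched with a fixed-size thread pool from `java.util.concurrent`. This is a minimal illustration, not the BPM engine used by ARCLib: the class name, the `runWorkflow` method and the 20 ms sleep that stands in for the real workflow steps are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: at most maxParallel workflows run at once, the rest wait in a queue. */
final class IngestPoolSketch {

    final Set<String> processed = ConcurrentHashMap.newKeySet();
    final AtomicInteger peak = new AtomicInteger(); // highest number of workflows seen running at once
    private final AtomicInteger running = new AtomicInteger();

    void processAll(List<String> sipIds, int maxParallel) {
        // The fixed pool keeps submitted tasks in an internal queue until a thread is free.
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String sipId : sipIds) {
                pending.add(pool.submit(() -> runWorkflow(sipId)));
            }
            for (Future<?> future : pending) {
                future.get(); // wait until the workflow has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    private void runWorkflow(String sipId) {
        int now = running.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(20); // placeholder for the real workflow steps
            processed.add(sipId);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            running.decrementAndGet();
        }
    }
}
```

The pool size is the knob that protects system resources: raising it increases throughput until the machine is saturated, while the queue absorbs bursts of incoming SIP packages.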

3.4 Non-functional requirements

- Horizontal scalability⁴ of performance
- Modularity⁵
- Configuration using GUI or XML
- Platform independence (although Linux OS is preferred)
- Robust open source relational database (preferably PostgreSQL)
- Java (OpenJDK)
- Connection with third party tools using an interface
- HTTPS communication

3.5 Metadata formats

Over the past two decades, multiple mainstream metadata standards have been created to facilitate the implementation of the data model of an OAIS archive. ARCLib uses several of these standards in the ARCLib AIP XML, namely METS, PREMIS, MODS, Dublin Core, MIX and ALTO XML. Additionally, ARCLib expects to receive SIP packages in the packaging format BagIt. BagIt serves as an encapsulation of all the SIP package content.

METS

The Metadata Encoding and Transmission Standard (METS) is a metadata standard for encoding descriptive, administrative and structural metadata of objects within a digital library, expressed using the XML language. METS is a container type standard: it enables the use of other standards. The resulting XML makes it possible to describe the hierarchical structure of digital objects and to store their associated metadata. [10]

⁴ Scalability is the capability of a system to handle a growing amount of work.
⁵ Modularity is the ability to interchange the implementation of a particular function without significant changes to the rest of the system.

PREMIS

PREservation Metadata: Implementation Strategies is a schema that has been developed for storing preservation metadata. It typically resides inside a METS document. PREMIS is able to express the technical information of the objects as well as the history of their processing and management. [11]

Dublin Core

Schema for storing Descriptive information. It is characterized by simplicity and is suitable for a basic description of data objects. The Dublin Core element set consists of fifteen properties for use in object description; these elements can be further extended by qualifiers. [12]

MODS

Metadata Object Description Schema is a schema for storing Descriptive information that enables a detailed description of digital objects. The built-in data granularity is higher compared to Dublin Core. [13]

copyrightmd

Schema for expressing copyright law metadata. [14]

MIX

Metadata for Images in XML Schema provides elements required to manage digital image collections. [15]

BagIt

Hierarchical file packaging format for disk-based storage and network transfer of arbitrary digital content. The package consists of the payload and tags documenting the storage and transfer. One of the tag files contains a manifest listing every file in the payload together with its corresponding checksum. [16]
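How a BagIt manifest supports a completeness and fixity check can be sketched as follows. In the BagIt specification, each line of a payload manifest such as `manifest-sha256.txt` holds a checksum followed by whitespace and the path of a payload file. The sketch assumes the SHA-256 algorithm and keeps the payload in memory instead of reading it from disk; the class and method names are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: checks payload files against the lines of a SHA-256 BagIt manifest. */
final class BagManifestSketch {

    /** Returns the paths that are missing from the payload or whose checksum differs. */
    static List<String> findMismatches(List<String> manifestLines, Map<String, byte[]> payload) {
        List<String> mismatches = new ArrayList<>();
        for (String line : manifestLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Each line is "<checksum> <path>"; the path itself may contain spaces.
            String[] parts = trimmed.split("\\s+", 2);
            if (parts.length < 2) {
                throw new IllegalArgumentException("Malformed manifest line: " + line);
            }
            byte[] content = payload.get(parts[1]);
            if (content == null || !sha256Hex(content).equalsIgnoreCase(parts[0])) {
                mismatches.add(parts[1]);
            }
        }
        return mismatches;
    }

    /** Returns the SHA-256 digest of the content as a lowercase hex string. */
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

An empty result means that every file listed in the manifest is present and unchanged; a real implementation would also report payload files that the manifest does not list.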

39 4 Design In the design process the aim was to define concrete strategies for implementing a system following the functional and nonfunctional requirements from the analytic part. For the needs of this thesis it was needed to define not only the functions performed inside of an Ingest process, but also infrastructure for receiving of SIP packages to the system and management and scheduling of Ingest processes. First, I created the system architecture capturing the main functional components of the system and their responsibilities. Second, I defined a domain model describing the data entities figuring in the system. The design process was carefully performed with the respect to the recent trends in the design of information systems and experience with other similar systems previously produced in InQool company. 4.1 Architecture of system One of the system requirements from the analytic part is a high degree of modularity. As ARCLib is a long term project, the requirements for particular functions can evolve over time. Because of that it was necessary to make the architecture as modular as possible so that it was easy to replace any module without affecting the rest of the system. I decided to design the system with the respect to the architecture SOA (Service Oriented Architecture). SOA belongs to the currently most popular design concepts to achieve module interchangeability in large systems. A system respecting its principles consists of several loosely coupled components communicating with each other using messages. Each individual component provides its public set of well documented interfaces. The functions accessible through these interfaces are known as web services. Next system requirement listed in the analysis is a horizontal scalability. The solution ARCLib needs to be applicable in systems of a small scale as well as in large systems with a high amount of incoming SIP packages. Requirements for a high degree of scalability lead to parallel processing. 
The architecture needs to support asynchronous execution wherever possible, so that multiple instances of the same process can be created when needed. It should be possible to execute these instances on a distributed system consisting of separate machines, as the resources of a single machine are necessarily limited.

The performance bottleneck of an OAIS system, especially of its functional entity Ingest, is the transformation of a SIP package to an AIP package. Processing multiple simultaneous requests for SIP ingestion places high demands on the system resources. Because of that, I decided to establish a separate SOA service for the SIP to AIP transformation. This enables several SIP ingestions to run in parallel. Consequently, there must be another SOA service in the system that coordinates the messages sent to perform the particular SIP ingestions.

The SOA principles are implemented in ARCLib using the Java Message Service (JMS). JMS solves the producer-consumer problem [17], which relates to the requirement for asynchronous execution mentioned earlier.

Producer-consumer problem

The producer generates some data and puts it into a queue. The consumer subscribes to receive a message when new data is added to the queue and withdraws the data from the queue. The producer cannot add data when the queue is full, and the consumer cannot withdraw data when the queue is empty. The producer is put to sleep when the queue is full and is awakened when some data is withdrawn. The same technique is applied to the consumer when the queue is empty.

The producer in the terminology of JMS is called JMS provider. In ARCLib it is represented by the system component Coordinator, which coordinates the JMS communication from the user interface to the other system services.
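The blocking behaviour described above can be sketched with a bounded queue from the Java standard library. This is only an illustration of the producer-consumer pattern, not ARCLib code: the class `BoundedQueueSketch`, the method `transfer`, the end-of-work marker and the SIP identifiers are all assumptions made for the example, and ARCLib itself relies on JMS rather than on an in-process queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration; not part of ARCLib.
final class BoundedQueueSketch {

    // Marker telling the consumer that no more data will arrive (an assumption of this sketch).
    private static final String END = "__END__";

    /** Passes the identifiers through a bounded queue and returns them in the order consumed. */
    static List<String> transfer(List<String> sipIds, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        List<String> consumed = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (String id : sipIds) {
                    queue.put(id); // blocks while the queue is full
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String id = queue.take(); // blocks while the queue is empty
                    if (END.equals(id)) {
                        return;
                    }
                    consumed.add(id);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // after join, the consumer's writes are visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(transfer(List.of("sip-1", "sip-2", "sip-3"), 2));
    }
}
```

With a capacity smaller than the number of items, the producer is repeatedly blocked until the consumer frees a slot, which is the sleeping and waking behaviour the text describes.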

Figure 4.1: Architecture of the ARCLib system

Coordinator

Coordinator is a JMS provider, in terms of the JMS standard a component responsible for managing the queue of JMS messages. Coordinator communicates with the other components of the system using short JMS messages that are transformed to a format compatible with the given web services. Messages carrying a larger volume of data, such as the SIP content, are transferred between the particular system components directly, bypassing the Coordinator. Coordinator is stateless: it does not store any data about the request being processed. It can be duplicated if needed, but because of the small size of the incoming messages a single instance in the system is expected to be sufficient.

Worker

Worker is a JMS service that executes the Ingest process for the provided SIP package. Multiple instances of a Worker are expected to run in separate threads in parallel, possibly on different

machines. Worker instances process the queue of requests from the Coordinator. The speed at which the tasks in the queue are processed is regulated by adding Worker instances to, or removing them from, the set of all currently available Worker instances, called the Worker pool. Processing a request means starting a BPM workflow that executes the operations of the Ingest process. During the workflow execution each Worker saves the current state of processing to the database. External data sources are shared data storages that must be accessible to all Worker instances.

Workspace

Workspace is a collective designation for all the storage facilities used during the Ingest process, namely the disk storage and the relational database.

Disk storage - the temporary storage where the SIP package is held for the duration of the Ingest process. It can be a local disk or a disk mapped on a virtual server as a folder.

Relational database - stores system-related data such as information about the suppliers, users, user rights, system settings, definitions of validation profiles, SIP profiles and other profiles, and more.

API

API consists of RESTful¹ interfaces of web services. These interfaces are accessible from outside the ARCLib system and provide the basic entry point to the logic of ARCLib.

1. Representational state transfer (REST) is a popular interface architecture designed for a distributed environment
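The Worker pool described above can be approximated, within a single JVM, by a fixed-size thread pool. The following is a minimal sketch under stated assumptions: `WorkerPoolSketch`, `ingestAll` and the string state names are hypothetical, the concurrent map merely stands in for the shared database, and the real Workers are JMS consumers that may run on different machines.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not part of ARCLib.
final class WorkerPoolSketch {

    /** Processes every SIP with the given number of workers and returns the recorded states. */
    static Map<String, String> ingestAll(List<String> sipIds, int workers) {
        Map<String, String> states = new ConcurrentHashMap<>(); // stand-in for the shared database
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (String id : sipIds) {
            states.put(id, "NEW");
            pool.execute(() -> {
                states.put(id, "PROCESSING"); // each worker saves its current state
                // the BPM workflow for this SIP would run here
                states.put(id, "PROCESSED");
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return states;
    }

    public static void main(String[] args) {
        System.out.println(ingestAll(List.of("sip-1", "sip-2", "sip-3"), 2));
    }
}
```

Changing the `workers` argument corresponds to adding or removing Worker instances: the queue of requests stays the same and only the processing speed changes.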

ARCLib Index

ARCLib Index indexes the metadata extracted from SIP packages during the Ingest process. It makes lookups in the archived data more efficient and enables complex searches.

Web client

Web client is a thin web client used for administration and for searching in the indexed metadata.

External client

External client represents any client entity accessing the web services of ARCLib.

Mail server

Mail server is a server used to notify users about the results of Ingest processes.

4.2 Domain model

When designing an information system it is always necessary to define a domain model that captures the data entities of the system together with their relationships. A domain model is a conceptual model rather than a detailed specification of the database entities: it specifies neither the concrete attributes nor the names of the entities. During the implementation it serves as a starting point for the creation of the data model. This model captures only the entities necessary for the OAIS Ingest process.

Figure 4.2: Domain model of the ARCLib system

Location

An external location that ARCLib can connect to. It stores the SIP packages before their ingestion into the archive. It can be either a local storage or an external storage accessible using a transfer protocol.

Tool

Tool represents any external software used during the Ingest process. It must be accessible in the form of a script and it must be possible to parametrize it from inside ARCLib.

Validation Scheme

XML Schema² used during the SIP validation.

Validation Rule

Validation rule is an elementary test of a property of the SIP package that is used during the SIP validation. The result is either true or false depending on the given input. There are three types of validation rules:

1. Check of a file or a part of a file against a Validation scheme
2. Check of the value or existence of an element or attribute using XPath and regular expressions
3. Check of the existence of a file

Validation Profile

XML file consisting of several instances of Validation rule.

Supplier

Organization using ARCLib that supplies the SIP packages to the system.

Supplier Profile

Association of a Validation Profile and an Ingest Workflow with the Supplier. A Supplier can define several supplier profiles for different types of SIP packages.

SIP

SIP package as it is received on the input to the archive.

2. XML Schema is a formal specification for description of XML documents
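Validation rules of the second and third type listed above can be sketched as predicates, with a validation profile being a list of rules that must all hold. This is an illustrative sketch only: the class and method names, the file name `mets.xml` and the XPath are assumptions, and the SIP is modelled as an in-memory map from file name to XML content instead of a directory on disk. A rule of the first type (schema validation) is omitted for brevity.

```java
import java.io.StringReader;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration; not part of ARCLib.
final class ValidationProfileSketch {

    /** Rule type 3: the named file must be present in the SIP. */
    static Predicate<Map<String, String>> fileExists(String fileName) {
        return sip -> sip.containsKey(fileName);
    }

    /** Rule type 2: the string value selected by the XPath must match the regular expression. */
    static Predicate<Map<String, String>> xpathMatches(String fileName, String xpath, String regex) {
        return sip -> {
            String xml = sip.get(fileName);
            if (xml == null) {
                return false;
            }
            try {
                String value = XPathFactory.newInstance().newXPath()
                        .evaluate(xpath, new InputSource(new StringReader(xml)));
                return Pattern.matches(regex, value);
            } catch (XPathExpressionException e) {
                return false; // unparsable XML or invalid XPath counts as a failed rule here
            }
        };
    }

    /** A profile passes only if every rule evaluates to true. */
    static boolean validate(Map<String, String> sip, List<Predicate<Map<String, String>>> profile) {
        return profile.stream().allMatch(rule -> rule.test(sip));
    }

    public static void main(String[] args) {
        Map<String, String> sip = Map.of("mets.xml", "<mets><id>sip-42</id></mets>");
        List<Predicate<Map<String, String>>> profile = List.of(
                fileExists("mets.xml"),
                xpathMatches("mets.xml", "/mets/id", "sip-\\d+"));
        System.out.println(validate(sip, profile));
    }
}
```

Modelling each rule as a function with a boolean result mirrors the statement that a rule evaluates to either true or false for a given input.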

ARCLib AIP XML

XML file associated with a SIP that stores the metadata about the SIP needed for archival purposes. There can be several versions of ARCLib AIP XML files bound to a single SIP package.

AIP

Archival package consisting of a SIP package and the associated ARCLib AIP XML.

Batch

Set of SIP packages to be ingested into the archive, accompanied by their associated Supplier profile.

Ingest workflow

Description of a BPM workflow of the Ingest process. Ingest workflow consists of a sequence of steps that are performed during the transformation of a SIP package to an AIP package. The Ingest workflow is parametrized using a configuration file. There are three levels of configuration:

1. Global configuration on the system level
2. Configuration on the level of Supplier
3. Configuration as the parameter of Batch

Configuration provided with a batch takes priority over the configuration related to the supplier, and the supplier configuration takes priority over the default configuration defined on the system level.

Timed Job

Timed job is a plan for executing the Ingest process at a given time for the SIPs specified in a Batch.
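The priority of the three Ingest workflow configuration levels (batch over supplier over system) can be illustrated by merging key-value maps from the least to the most specific level. This is a sketch under assumptions: the source only says that a configuration file is used, so the flat key-value representation, the class `IngestConfigSketch` and the keys shown are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of ARCLib.
final class IngestConfigSketch {

    /** Later putAll calls overwrite earlier values, so the most specific level wins. */
    static Map<String, String> resolve(Map<String, String> system,
                                       Map<String, String> supplier,
                                       Map<String, String> batch) {
        Map<String, String> effective = new HashMap<>(system); // default values
        effective.putAll(supplier);                            // supplier overrides system
        effective.putAll(batch);                               // batch overrides supplier
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> system = Map.of("antivirus", "on", "retries", "3");
        Map<String, String> supplier = Map.of("retries", "5");
        Map<String, String> batch = Map.of("antivirus", "off");
        // Effective values: antivirus=off (from batch), retries=5 (from supplier)
        System.out.println(resolve(system, supplier, batch));
    }
}
```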

4.3 Management of Ingest processes

The life cycle of a SIP package, from the moment it enters the system until the creation of the resulting AIP, consists of these steps:

1. The ingestion of a SIP package into the system begins with copying the SIP contents from a Location to the Workspace. This is a temporary place where all the SIP packages are stored before their respective Ingest workflow is triggered. Usually multiple packages are copied to the Workspace before the processing of any of them starts.
2. After all the required SIP packages have been transferred to the Workspace, the Coordinator receives a request containing a Batch. The Batch specifies the set of SIPs for which the Ingest workflow should start.
3. Coordinator sends a JMS message to Worker for every SIP in the Batch. The message is added to a queue of JMS messages and stays there until an instance of Worker is available. As the maximal number of Worker instances is limited, the processing of the whole Batch can take an indeterminate amount of time.
4. The Ingest workflow for a SIP starts at the moment when a free instance of Worker is assigned to the SIP. During the execution of an Ingest workflow any of the steps can fail and cause the whole workflow to end unsuccessfully. In the successful scenario, when the workflow finishes without an unrecoverable failure, the new AIP is saved to the system and the original SIP is removed from the Workspace.

Throughout its ingestion a SIP package transitions between the following states: New, Processing, Processed / Failed. These states and the respective transitions are depicted in the SIP package processing state diagram below.
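The SIP states named above can be written down as a small state machine. The sketch assumes the straightforward reading of the text (New leads to Processing, Processing ends in either Processed or Failed); the enum `SipState` and its methods are hypothetical names, not taken from the ARCLib sources.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration; the allowed transitions are an assumption based on the text.
enum SipState {
    NEW, PROCESSING, PROCESSED, FAILED;

    /** States that may directly follow this one. */
    Set<SipState> successors() {
        switch (this) {
            case NEW:
                return EnumSet.of(PROCESSING);
            case PROCESSING:
                return EnumSet.of(PROCESSED, FAILED);
            default:
                return EnumSet.noneOf(SipState.class); // PROCESSED and FAILED are final
        }
    }

    /** Returns the next state or rejects a transition that is not allowed. */
    SipState moveTo(SipState next) {
        if (!successors().contains(next)) {
            throw new IllegalStateException(this + " -> " + next + " is not allowed");
        }
        return next;
    }
}

final class SipStateDemo {
    public static void main(String[] args) {
        SipState state = SipState.NEW;
        state = state.moveTo(SipState.PROCESSING); // a free Worker picked up the SIP
        state = state.moveTo(SipState.PROCESSED);  // the workflow finished without failure
        System.out.println(state);
    }
}
```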

Figure 4.3: SIP package processing state diagram

Batch management

To achieve a high degree of parallelism, the system enables not only several SIP ingestions running in parallel within a single batch, but also the parallel execution of multiple batches. The Coordinator entity provides the following web services related to batch management:

Start - triggers ingest workflows for the SIP packages in the batch

Suspend - suspends the execution of the batch; ingest workflows that have already started continue until the end, no new ingest workflows are started

Cancel - cancels the execution of the batch; ingest workflows that have already started continue until the end, no new ingest workflows are started

Resume - resumes a previously suspended batch; ingest workflows for unprocessed SIP packages are started automatically

Figure 4.4 shows the states in the life cycle of a Batch. It also models the transitions that are triggered by the web services listed above or by changes of the internal state of the Batch, together with the guard conditions attached to the transitions.

Figure 4.4: Batch processing state diagram

An instance of Batch can transition between the following states:

Processing - the execution of the batch has started and some of the SIP packages belonging to the batch are in the state Processing

Suspended - the batch has been suspended on request of the Supplier; the batch can be resumed again

Processed - the ingest workflows of all SIP packages of the batch have finished, either successfully or unsuccessfully

Canceled - the execution of the batch has been canceled on request of the Supplier or because of multiple failures of SIP ingest workflows; the batch is canceled automatically at the moment when more than half of the SIPs belonging to the batch have finished with a failure; the batch cannot be resumed again
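The Batch states and the automatic cancellation rule above can be sketched as a small class. Because the state diagram itself is not reproduced here, several details are assumptions of this sketch: that Cancel is allowed from both Processing and Suspended, that results of workflows finishing after Suspend are still counted, and all names (`BatchSketch`, `sipFinished`, and so on).

```java
// Hypothetical illustration; transitions not stated in the text are assumptions.
final class BatchSketch {

    enum State { PROCESSING, SUSPENDED, PROCESSED, CANCELED }

    private final int total;
    private int succeeded;
    private int failed;
    private State state = State.PROCESSING; // the Start service puts the batch into Processing

    BatchSketch(int totalSips) {
        if (totalSips <= 0) {
            throw new IllegalArgumentException("a batch needs at least one SIP");
        }
        this.total = totalSips;
    }

    State state() {
        return state;
    }

    void suspend() {
        require(State.PROCESSING);
        state = State.SUSPENDED;
    }

    void resume() {
        require(State.SUSPENDED); // a canceled batch cannot be resumed
        state = State.PROCESSING;
    }

    void cancel() {
        if (state == State.PROCESSED || state == State.CANCELED) {
            throw new IllegalStateException("batch already finished: " + state);
        }
        state = State.CANCELED;
    }

    /** Records the result of one SIP ingest workflow. */
    void sipFinished(boolean success) {
        // Workflows that already started run to the end, so results are always counted.
        if (success) {
            succeeded++;
        } else {
            failed++;
        }
        if (state == State.CANCELED) {
            return;
        }
        if (failed * 2 > total) {
            state = State.CANCELED;  // more than half of the SIPs failed
        } else if (succeeded + failed == total) {
            state = State.PROCESSED; // every workflow finished, with or without success
        }
    }

    private void require(State expected) {
        if (state != expected) {
            throw new IllegalStateException("expected " + expected + " but was " + state);
        }
    }

    public static void main(String[] args) {
        BatchSketch batch = new BatchSketch(4);
        batch.sipFinished(false);
        batch.sipFinished(false);
        batch.sipFinished(false); // third failure of four exceeds one half
        System.out.println(batch.state());
    }
}
```

Note that exactly half of the SIPs failing does not trigger the cancellation, since the text requires more than half.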

5 Implementation of prototypes

Because the scope of this thesis is limited to the OAIS Ingest entity rather than the design of the whole archival system, I decided to demonstrate the implementation on functional prototypes. These prototypes cover the key functionalities required from a system implementing Ingest. The source code of the prototypes together with a description of the environment configuration is available in Appendix A. This chapter describes the technologies used in the implementation, followed by the description of the prototypes themselves.

5.1 Technologies

Apache Maven

For the build automation of the prototypes I used Apache Maven, which is among the most popular build automation technologies for the Java language. Maven makes it possible to describe the build procedure of the code and automatically fetches resources from external repositories, called dependencies, which is particularly useful in large projects.

Spring Framework

Spring Framework is a Java-based open-source application framework for building web and enterprise applications. The goal of Spring is to be a universal solution providing all the features needed by business applications, in contrast to other tools that mostly focus on a single particular area. The main traits of Spring include high configurability and customizability.

Spring Boot

With the increasing number of features that Spring provides, configuring a Spring application became tedious and error-prone. This situation prompted the Spring team [18] to create Spring Boot to address the complexity of configuration.

Spring Boot provides preconfigured templates for many types of projects that serve as a starting point for creating one's own solutions. It also enables easy incorporation of third-party technologies by publishing its own versions of external libraries, accessible in the Spring Boot repository, that are specially adjusted for use in a Spring Boot project. One of the decisions made at the beginning of the development of the prototypes was to use Spring Boot wherever possible.

Camunda BPM

Camunda BPM is an open-source workflow management system that defines and executes business processes in BPMN¹.

JBoss Hibernate

Hibernate is a framework that enables object-relational mapping (ORM). ORM facilitates the transformation of object entities to database objects by automating tasks that would normally have to be done manually by the programmer. It is one of the implementations of the Java Persistence API (JPA)².

Apache ActiveMQ Artemis

Apache ActiveMQ Artemis is an open-source technology for high-performance asynchronous messaging between the components of a distributed system. It is one of the implementations of the Java Message Service (JMS). In the communication Artemis serves as a transfer agent called a message broker. The advantages of using a message broker include high scalability of the communication and the ability to withstand high loads of messages while keeping latencies low.

1. Business Process Model and Notation (BPMN) in the version released in January
2. JPA is an internal specification for the object/relational mapping in the language Java

5.2 Management of Ingest processes

This prototype implements the Management of Ingest processes described in the design chapter. The Coordinator entity is represented by the CoordinatorService class, the Worker entity by the WorkerService class and the Ingest workflow entity by the IngestBpmDelegate class. Figure 5.1 depicts the sequence of messages sent between these classes during the processing of an instance of Batch consisting of multiple SIP packages.

Figure 5.1: Batch processing sequence diagram

All messages are distributed by means of JMS. The JMS communication is implemented with the Apache ActiveMQ Artemis framework. Artemis enables asynchronous execution of the WorkerService instances, as each instance occupies its own thread. During the implementation I ran into a number of problems, all of which were eventually resolved. The most difficult part was the synchronization of JMS communication and database


More information

Slide 1 & 2 Technical issues Slide 3 Technical expertise (continued...)

Slide 1 & 2 Technical issues Slide 3 Technical expertise (continued...) Technical issues 1 Slide 1 & 2 Technical issues There are a wide variety of technical issues related to starting up an IR. I m not a technical expert, so I m going to cover most of these in a fairly superficial

More information

Archivists Toolkit: Description Functional Area

Archivists Toolkit: Description Functional Area : Description Functional Area Outline D1: Overview D2: Resources D2.1: D2.2: D2.3: D2.4: D2.5: D2.6: D2.7: Description Business Rules Required and Optional Tasks Sequences User intentions / Application

More information

Managing Learning Objects in Large Scale Courseware Authoring Studio 1

Managing Learning Objects in Large Scale Courseware Authoring Studio 1 Managing Learning Objects in Large Scale Courseware Authoring Studio 1 Ivo Marinchev, Ivo Hristov Institute of Information Technologies Bulgarian Academy of Sciences, Acad. G. Bonchev Str. Block 29A, Sofia

More information

Archivematica user instructions

Archivematica user instructions Archivematica 0.7.1 user instructions You are free to copy, redistribute or repurpose this work under the terms of the Creative Commons Attribution-Share Alike Canada 2.5 license. 1 Table of Contents 1.

More information
