Introducing Fedora 4. Overview, examples, and features. David Wilcox,

Similar documents
Powering Linked Open Data Applications

Fedora, Islandora, & Samvera: Requirements and Gaps. David Wilcox,

Fedora Commons Update

DuraSpace FAIRness and GDPR

Richard Marciano Alexandra Chassanoff David Pcolar Bing Zhu Chien-Yi Hu. March 24, 2010

Fedora Commons: Taking on the Challenge of the Next Generation of Scholarly Communication

Islandora and Fedora 4; The Atonement v3: The Atonermenter

Data Governance for the Connected Enterprise

Ing. José A. Mejía Villar M.Sc. Computing Center of the Alfred Wegener Institute for Polar and Marine Research

Working with Islandora

Semantic Integration with Apache Jena and Apache Stanbol

Drupal for Virtual Learning And Higher Education

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

David Minor UC San Diego Library Chronopolis Preservation Network

Introduction to Federico 2.0 and Fedora Commons

Incremental Export of Relational Database Contents into RDF Graphs

The RMap Project: Linking the Products of Research and Scholarly Communication Tim DiLauro

One Body, Many Heads for Repository-Powered Library Applications

Comparing Open Source Digital Library Software

COMPUTER AND INFORMATION SCIENCE JENA DB. Group Abhishek Kumar Harshvardhan Singh Abhisek Mohanty Suhas Tumkur Chandrashekhara

Medici for Digital Cultural Heritage Libraries. George Tsouloupas, PhD The LinkSCEEM Project

Study Guide. MarkLogic Professional Certification. Taking a Written Exam. General Preparation. Developer Written Exam Guide

Towards Open Innovation with Open Data Service Platform

Introduction to NoSQL Databases

MarkLogic 8 Overview of Key Features COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.

Apache Marmotta. Multimedia Management

ISO Self-Assessment at the British Library. Caylin Smith Repository

DRS Update. HL Digital Preservation Services & Library Technology Services Created 2/2017, Updated 4/2017

Inventory (input to ECOMP and ONAP Roadmaps)

BPMN Processes for machine-actionable DMPs

Understanding the workplace of the future. Artificial Intelligence series

What s New in DPLA Technology? Mark A. Matienzo Director of Technology, DPLA DPLAFest Indianapolis, IN April 17, 2015

Triple Stores in a Nutshell

A Comparative Analysis of Linked Data Tools to Support Architectural Knowledge

Digital Preservation Efforts at UNLV Libraries

Long-term digital preservation of UNSWorks

Statistics Requirements for Release 5.2

An Archiving System for Managing Evolution in the Data Web

Trials And Tribulations Of Moving Forward With Digital Preservation Workflows And Strategies

The More We Get Together... The Islandora Community

Applying Ontologies in the Dairy Farming Domain for Big Data Analysis

SEAD Data Services. Jim Best Practices in Data Infrastructure Workshop. Cooperative agreement #OCI

Simile Tools Workshop Summary MacKenzie Smith, MIT Libraries

Building on to the Digital Preservation Foundation at Harvard Library. Andrea Goethals ABCD-Library Meeting June 27, 2016

Webinar Annotate data in the EUDAT CDI

RDF: Resource Description Failures and Linked Data Letdowns

BSC Smart Cities Initiative

ATC An OSGI-based Semantic Information Broker for Smart Environments. Paolo Azzoni Research Project Manager

Preserving Digital Content at Scale

DSpace Fedora. Eprints Greenstone. Handle System

HydraDAM2: Extending Fedora 4 and Hydra for Media Preservation

Long Term Digital Preserva2on

From strings to things

An overview of the OAIS and Representation Information

Global Reference Architecture: Overview of National Standards. Michael Jacobson, SEARCH Diane Graski, NCSC Oct. 3, 2013 Arizona ewarrants

University of British Columbia Library. Persistent Digital Collections Implementation Plan. Final project report Summary version

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

ebooks Preservation at Scholars Portal Kate Davis & Grant Hurley Scholars Portal, Ontario Council of University Libraries

Unlocking the full potential of location-based services: Linked Data driven Web APIs

Introduction to Islandora Kim Pham, Digital Projects & Technologies Librarian (UTSC) Kelli Babcock, Digital Initiatives Librarian (UTL)

COLLABORATIVE EUROPEAN DIGITAL ARCHIVE INFRASTRUCTURE

SEPA SPARQL Event Processing Architecture

Open And Linked Data Oracle proposition Subtitle

Preserva'on*Watch. What%to%monitor%and%how%Scout%can%help. KEEP%SOLUTIONS%

Cloud Analytics and Business Intelligence on AWS

The Eclipse Modeling Framework and MDA Status and Opportunities

DBpedia Data Processing and Integration Tasks in UnifiedViews

Core Technology Development Team Meeting

Modern Stream Processing with Apache Flink

WHITEPAPER. MemSQL Enterprise Feature List

(incubating) Introduction. Swapnil Bawaskar.

Research Data Repository Interoperability Primer

James Hardiman Library. Digital Scholarship Enablement Strategy

The Emerging Data Lake IT Strategy

The CASPAR Finding Aids

Conducting a Self-Assessment of a Long-Term Archive for Interdisciplinary Scientific Data as a Trustworthy Digital Repository

From Open Access to Open Standards, (Linked) Data and Collaborations

Building a Data Strategy for a Digital World

Practical Semantic Applications Master Title for Oil and Gas Asset Reporting. Information Integration David Price, TopQuadrant

Can a Consortium Build a Viable Preservation Repository?

Policy-Driven Repository Interoperability: Enabling Integration Patterns for irods and Fedora

Composable Software, Collaborative Development, and the CareWeb Framework. Doug Martin, MD

3) Develop a user interface to facilitate discovery, delivery and rights management of media content

Digital Preservation Standards Using ISO for assessment

Information Workbench

EPIM ReportingHub (ERH) A new powerful knowledge sharing platform. Houston, March 2012 Ph.D. Kari Anne Haaland Thorsen

Linked Data and RDF. COMP60421 Sean Bechhofer

The Semantic Institution: An Agenda for Publishing Authoritative Scholarly Facts. Leslie Carr

Red Hat JBoss Data Virtualization 6.3 Glossary Guide

SciENCV - Putting the Pieces Together VIVO

Introduc)on to the Linked Data Pla4orm (LDP) Hector Correa The Pennsylvania State University

Working with a Preservation Software Vendor - The Kentucky Experience Glen McAninch

The P2 Registry

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

Project Proposal: OSLC4MBSE - OMG SE and OSLC working group as part of the OMG SE DSIG. OSLC for Model-Based Systems Engineering Interoperability

Science-as-a-Service

Azure-persistence MARTIN MUDRA

Records management workflows

A distributed network of digital heritage information

Select all persons who belong to the class Father. SPARQL query PREFIX g: <

Transcription:

Introducing Fedora 4 Overview, examples, and features David Wilcox, DuraSpace @d_wilcox

https://goo.gl/9k9rlk

Learning Outcomes Understand the purpose of a Fedora repository Learn what Fedora can do for you Understand the key capabilities of the software

Fedora Facts Managed by DuraSpace (not-for-profit) Funded by the community Collaboratively developed by the community Supported by 2 full-time staff members (not developers)

Flexible Extensible Durable Object Repository Architecture Concept Implementation Community

Fedora... Stores, preserves, and provides access to digital objects Supports flexible and complex content models for objects Supports complex semantic relationships between objects inside and outside the repository using RDF Supports millions of objects, both large and small Interoperates with other applications and services

Why use Fedora? Fedora is flexible: it can handle both simple and complex use cases Content in Fedora is durable: Fedora supports long-term preservation Fedora powers successful digital repository and DAM applications Fedora is standards-based Fedora is backed by a thriving community

Fedora Front-Ends Fedora is middleware You can build a custom framework, or join a broader community:

Fedora Examples

Institutional Repository https://scholarsphere.psu.edu/

Research Data https://era.library.ualberta.ca/

Manuscripts https://archbishopsregisters.york.ac.uk

Archives and Special Collections http://digitalcollections.barnard.edu

Basic Concepts

Web Resources Everything is a web resource with a URI Resources have properties expressed as RDF triples Resources can contain other resources (containers) or files (binaries)

Page1.jpg Book Example Page 1 Page1.tiff Container Page2.jpg Book 1 Page 2 Page2.tiff Book Collection Page1.jpg Book 2 Page 1 Page1.tiff Page2.jpg Page 2 Page2.tiff Binary

RDF Properties

Core Features

Fedora as an Implementation (up to 4.6.x)

Fedora as an Implementation (4.7+)

Fedora as an API Aligning with standards Focal point for community discussion and consensus on what a Fedora repository does Enables standard client libraries Enables standard server components

Standards Focus on existing standards Fewer customizations to maintain Opportunities to participate in related communities

Core Features and Standards 1. Create/Read/Update/Delete - Linked Data Platform 2. Versioning - Memento 3. Authorization - WebAC 4. Batch Operations 5. Fixity

Versioning Versions can be created on demand via the REST-API A previous version can be restored via the REST-API

Authorization Authorization is optional and pluggable Currently, four authorization implementations exist: None, WebAC, Role-based and XACML

Web Access Control W3C approach for managing authorization using linked data Interoperable with other applications that implement the same approach Implemented in Fedora 4 by community stakeholders

Batch Operations Multiple actions can be bundled together into a single repository event Batch operations can be rolled back or committed Can be used to maintain consistency

Fixity Over time, digital objects can become corrupt Fixity checks help preserve digital objects by verifying their integrity On ingest, Fedora can verify a user-provided checksum against the calculated value A checksum can be recalculated and compared at any time via a REST-API request

Messaging* Fedora emits JMS messages on repository events Message body is RDF, serialized to JSON-LD (default) Messaging allows external integrations to react asynchronously to events

External Services

Two Feature Types 1) Optional, pluggable components Separate projects that can interact with Fedora 4 using a common pattern 2) External components Consume and act off repository messages

External Component Integrations Leverages the well-supported Apache Camel project Camel is middleware for integration with external systems Can handle any asynchronous, event-driven workflow

External Component Architecture Fedora 4 Apache Camel LDP / WebAC / Memento Solr Triplestore (Fuseki, Sesame) Audit Service SPARQL-Query

External - Indexing Index repository content for search Indexing is configurable - could be based on any property Solr and Elastic Search have been tested

External - Triplestore An external triplestore can be used to index the RDF triples of Fedora resources Any triplestore that supports SPARQL-update can be plugged in Fuseki, RDF4J, and BlazeGraph have been tested

External/Pluggable - Audit Service Maintains a history of events for each repository resource Both internal repository events and events from external sources can be recorded Uses the existing event system and an external triplestore Events can be persisted back to Fedora

Performance and Scalability

Test Plans Testing large files, many files, and many containers Tests are performed by community members Have concerns about performance and scale? Join the group!

Metrics A number of scalability tests have been run: Uploaded a 1 TB file via REST API 17 million objects via REST API 3.5 million files via REST API

Roadmap

2016-2017 Support standardized import/export Specify RESTful API Defining Fedora by its RESTful services Aligning those services with standards Specifying RESTful services Creating technology compatibility kit (TCK) Separating service specification from server-side implementation

Useful Resources Fedora 4 documentation Fedora 4 wiki Fedora 4 mailing lists