Preservation Watch. What to monitor and how Scout can help. KEEP SOLUTIONS www.keep-solutions.com

Preservation Watch
What to monitor and how Scout can help
Luis Faria, lfaria@keep.pt
KEEP SOLUTIONS, www.keep-solutions.com
Digital Preservation Advanced Practitioner Course
Glasgow, 15th-19th July 2013

KEEP SOLUTIONS
Company specialized in information management
Digital preservation experts
Open source: RODA, KOHA, DSpace, Moodle, etc.
Scientific research
  SCAPE: large-scale digital preservation environments
  4C: digital preservation cost modeling
http://www.keep-solutions.com
This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137).

Preservation monitoring

Why do we need monitoring?
[Diagram: the repository surrounded by sources of risks and opportunities.]
Format obsolescence
New standards
Emerging technology
Organisation mission
Organisation policies
Bit rot
Resource capability
System availability
Security breach
Producer trends
Consumer trends
Economical limitations
Social and political factors

State of the art
Digital format registries
  Lack of coverage
  Statically defined generic risks
  Lack of structure in risks
  Focus on format obsolescence
Automatic Obsolescence Notification System (AONS)
  Total dependency on format registries
Technology watch reports
  Machine unreadable

Risk assessment
[Chart: survey results on risk assessment. Categories: "Yes, but manual and ad hoc" and "None". Shares shown: 40% and 60%.]

Monitoring
[Chart: survey results on monitoring, on a 0% to 100% scale, split into Automatic, Manual and None for each aspect: bitstream integrity, format obsolescence, ingest, access, organization, format registries, experimentation, consumers, producers, technology.]

What is needed?
We need data!
From anywhere and everywhere
Sharing
Usability & scalability
Structured data
Controlled vocabulary

Scout: a novel approach

[Diagram, built up over several slides: Scout relates Tool and Format entities through a "renders" link. Properties shown: name, version, PRONOM ID, MIME type, license. PRONOM appears as a data source.]

Goals
Collect information from different sources
Enable human input of data
Central knowledge base for digital preservation
Enable users to pose questions
Notify users of significant events and plan validity
Easy support for new sources and questions

Scout: a preservation watch system
Monitors aspects of the world to detect preservation risks and opportunities
Triple store
Adaptors
  Data Connector & Report API
  SCAPE Policy model
  PRONOM
  Web semantic extraction
  Renderability experiments
Web interface
Triggers: templates and SPARQL
Email notifications
Demo: http://scout.scape.keep.pt
[Diagram: Content, Policies, Registries, Web and Human knowledge feed Scout, which issues risk notifications.]
http://openplanets.github.io/scout/

Preservation lifecycle
[Diagram: a cycle linking Environment and users, Repository, Watch, Policies, Planning and Operations. Arrow labels: access, ingest, harvest; monitored environment and users; monitored content and events; monitored actions; create/re-evaluate plans; deploy plan; execute action plan.]

Preservation lifecycle (in practice)
[Same diagram, with tools attached to the roles: Scout for Watch, Plato for Planning, a workflow engine for Operations.]

Architecture
[Diagram: Scout, C3PO, Plato, Repository and Workflow engine, connected through the C3PO API, Scout API, Notification API, Data Connector API, Report API, Plan Management API and Workflow execution API.]
Small scale: Taverna
Large scale: SCAPE platform
Ref. implementations: RODA by KEEPS, Fedora 4 by FIZ

Scout architecture
[Diagram: Pull Source Adaptors, Push Source Adaptor API, Web Interface, REST API, Notification Service, External Assessment, Data Enrichment Service, Monitor Service, Assessment Service, Knowledge Base.]

Example questions
Are there any tools that can render the format X?
Is my repository the only one that has format Y?
Are my preservation plans still valid?
Are my repository policies being enforced?

Information sources
Format registries & software catalogues
Digital repositories & web archives
Organizational objectives
Experiments
Simulation
Human knowledge

Normalized data model
Model (managed)              Data (source created)
Entity Type: File Format     Entity: JPEG2000
Property: PRONOM Id          Property Value: x-fmt/392
Property: Tool support       Property Value: limited
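The split between a managed model and source-created data can be illustrated with a small sketch. This is not Scout's actual implementation: the class names mirror the labels on the slide, while the `values` dictionary and the `set_value` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityType:
    """Managed model: a kind of thing being watched, e.g. 'File Format'."""
    name: str


@dataclass(frozen=True)
class Property:
    """Managed model: a property definition, e.g. 'PRONOM Id'."""
    name: str


@dataclass
class Entity:
    """Source-created data: one concrete entity and its property values."""
    entity_type: EntityType
    name: str
    # Hypothetical storage: property name -> property value.
    values: dict = field(default_factory=dict)

    def set_value(self, prop: Property, value: str) -> None:
        # Hypothetical helper: record a value for a managed property.
        self.values[prop.name] = value


# The example from the slide.
file_format = EntityType("File Format")
jpeg2000 = Entity(file_format, "JPEG2000")
jpeg2000.set_value(Property("PRONOM Id"), "x-fmt/392")
jpeg2000.set_value(Property("Tool support"), "limited")
```

Keeping property definitions in the managed model means that different sources reporting on the same format use the same property names, which is what makes their data comparable.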

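Information source adaptor
[Diagram: sources live in the World; adaptors live in the Watch System. A Scheduler (1) runs Pull Source Adaptors (*), an interface whose implementations pull events from a source and send data to the Data Enrichment Service. A Push Source Adaptor pushes events from its source to the Push Source Adaptor API, which sends data to the Data Enrichment Service.]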
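The two adaptor styles in the diagram could be sketched roughly as follows. All names, method signatures and the record format here are hypothetical and chosen only to mirror the diagram labels; the actual Scout interfaces may look quite different.

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterable, List


class PullSourceAdaptor(ABC):
    """Interface for adaptors that the watch system polls."""

    @abstractmethod
    def pull(self) -> Iterable[dict]:
        """Fetch the current records from the source."""


class StaticRegistryAdaptor(PullSourceAdaptor):
    """Toy pull adaptor that serves a fixed list of records."""

    def __init__(self, records: List[dict]) -> None:
        self._records = list(records)

    def pull(self) -> Iterable[dict]:
        return iter(self._records)


class Scheduler:
    """One scheduler runs many pull adaptors and forwards their data."""

    def __init__(self, sink: Callable[[dict], None]) -> None:
        self._sink = sink  # stands in for the Data Enrichment Service
        self._adaptors: List[PullSourceAdaptor] = []

    def register(self, adaptor: PullSourceAdaptor) -> None:
        self._adaptors.append(adaptor)

    def run_once(self) -> int:
        """Poll every adaptor once; return the number of records forwarded."""
        count = 0
        for adaptor in self._adaptors:
            for record in adaptor.pull():
                self._sink(record)
                count += 1
        return count


class PushSourceAdaptorAPI:
    """Entry point that external push adaptors call when they have data."""

    def __init__(self, sink: Callable[[dict], None]) -> None:
        self._sink = sink

    def push(self, record: dict) -> None:
        self._sink(record)


received: List[dict] = []  # collects what the enrichment step would see
scheduler = Scheduler(received.append)
scheduler.register(StaticRegistryAdaptor(
    [{"entity": "JPEG2000", "property": "PRONOM Id", "value": "x-fmt/392"}]
))
pulled = scheduler.run_once()
PushSourceAdaptorAPI(received.append).push(
    {"entity": "JPEG2000", "property": "Tool support", "value": "limited"}
)
```

Both paths end in the same sink, so the rest of the system does not need to know whether a record was pulled on a schedule or pushed by the source.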

C3PO: content profile tool
Characterization
Reports: aggregation, analysis, representative datasets
https://github.com/peshkira/c3po
Petar Petrov <petrov@ifs.tuwien.ac.at>

Repository content

Repository events API and adaptor
[Diagram: RODA, Rosetta and eSciDoc each expose a Report API (OAI-PMH with PREMIS) to the OAI-PMH repository adaptor in Scout.]
Normalize events
Fine-grain events
History
Events example: ingest started/ended, representation downloaded, plan executed

Repository Events API (Report API)
Provides access to repository events
Events:
  Ingest started and finished
  Viewed or downloaded descriptive metadata or representation
  Preservation plan executed
OAI-PMH data provider
PREMIS events metadata
  Agent: who triggered the event
  Date/time: when did the event occur
  Details: what happened
API specification: https://github.com/openplanets/scape-platform-api
Ref. implementation: https://github.com/openplanets/roda
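As a rough illustration of the Report API idea, the sketch below models an event with the three PREMIS-style fields listed above and builds an OAI-PMH harvesting request. `ListRecords`, `metadataPrefix` and `from` are standard OAI-PMH request parameters. The base URL, the `premis` prefix value, the agent name and the `RepositoryEvent` class are assumptions made for the example and are not taken from the Report API specification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode


@dataclass(frozen=True)
class RepositoryEvent:
    """Hypothetical normalized event: what happened, who did it, and when."""
    event_type: str
    agent: str
    occurred_at: datetime
    details: str


def list_records_url(base_url: str, metadata_prefix: str,
                     from_date: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for harvesting events."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and metadata prefix.
url = list_records_url("https://repository.example.org/oai", "premis",
                       from_date="2013-07-15")

event = RepositoryEvent(
    event_type="ingest finished",
    agent="archivist01",  # hypothetical agent identifier
    occurred_at=datetime(2013, 7, 15, 9, 30, tzinfo=timezone.utc),
    details="Ingest of one submission package completed",
)
```

Harvesting with a `from` date lets an adaptor ask only for events recorded since its previous run, rather than fetching the full history each time.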

Report API (RODA reference implementation)
https://github.com/openplanets/roda/tree/master/roda-core/roda-core-services/src/main/java/eu/scape_project/roda/report

Repository events

Web archive adaptor
Content via C3PO
  IM-C3PO prototype integration (2010-2012)
  SB-C3PO prototype integration (large-scale: 300 TB)
Renderability analysis experiments:
  Browser snapshots comparison large-scale platform prototype
  C3PO adaptor for experiment results
  Scout C3PO adaptor profile support
Other web characterization info:
  ARC header extractor (ongoing)

Web archive content (IM)

Web archive content (SB)
12 TB, 440M FITS files
Test case 1 - Import
  Linear ingest time of 0.65 ms per FITS file
Test case 2 - GUI
  2.5 million FITS files limit
  Generate profile in command-line
  15 hours for 12M files
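To put these figures in perspective, the arithmetic below extrapolates from them. It assumes that the 0.65 ms import time applies per FITS file and stays linear up to the full 440M files, and that the 15-hour profile run covered exactly 12M files. Neither assumption is confirmed by the slide, so the results are only indicative.

```python
# Figures from the slide.
MS_PER_FILE_IMPORT = 0.65          # linear import time per FITS file
FILES_IN_COLLECTION = 440_000_000  # 440M FITS files
PROFILE_HOURS = 15                 # command-line profile generation
PROFILE_FILES = 12_000_000         # 12M files

# Assumption: import time scales linearly to the whole collection.
import_hours = FILES_IN_COLLECTION * MS_PER_FILE_IMPORT / 1000 / 3600

# Average profile-generation cost per file, in milliseconds.
profile_ms_per_file = PROFILE_HOURS * 3600 * 1000 / PROFILE_FILES

print(f"Estimated import time for the collection: {import_hours:.1f} hours")
print(f"Profile generation cost: {profile_ms_per_file:.1f} ms per file")
```

Under these assumptions, importing the whole collection would take a little over three days, and profile generation costs several times more per file than import does.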

Web archive renderability analysis

ARC header extractor tool
https://github.com/statsbiblioteket/arc-header-extractor

Define triggers
Example: "Notify me when there are tools that can render the format X."
Simple query with templates

Receive notifications
Email: "There are tools that can render format X."
HTTP Push API
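The trigger-then-notify flow could look roughly like the sketch below. Scout itself stores data in a triple store and supports SPARQL-based triggers, as noted on the overview slide. Here a plain Python list of triples stands in for the knowledge base, and the `Trigger` class, the `renders` predicate name and the tool name `ExampleViewer` are all hypothetical.

```python
from typing import Callable, List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def tools_rendering(kb: List[Triple], format_name: str) -> List[str]:
    """Answer the question: which tools can render this format?"""
    return sorted(t.subject for t in kb
                  if t.predicate == "renders" and t.obj == format_name)


class Trigger:
    """A question about a format plus a notification callback."""

    def __init__(self, format_name: str, notify: Callable[[str], None]) -> None:
        self.format_name = format_name
        self.notify = notify  # stands in for email or HTTP push delivery

    def evaluate(self, kb: List[Triple]) -> bool:
        """Fire the notification if the condition currently holds."""
        tools = tools_rendering(kb, self.format_name)
        if not tools:
            return False
        self.notify(f"There are tools that can render format "
                    f"{self.format_name}: {', '.join(tools)}")
        return True


knowledge_base = [
    Triple("ExampleViewer", "renders", "JPEG2000"),  # hypothetical tool
    Triple("JPEG2000", "PRONOM Id", "x-fmt/392"),
]
messages: List[str] = []
fired = Trigger("JPEG2000", messages.append).evaluate(knowledge_base)
not_fired = Trigger("FormatX", messages.append).evaluate(knowledge_base)
```

In a running watch system the trigger would be re-evaluated whenever new data arrives, so the user is notified when the answer changes rather than having to ask repeatedly.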

Interfaces
Web page
REST API

http://scout.scape.keep.pt


How to be a part of Scout
Join the surveys
  Send me your email address <lfaria@keep.pt>
Integrate your content
  Send your content profile with C3PO
  Send repository events with Report API
Contribute with information (soon)
  Use Scout form for manual input of knowledge

Benefits
Synergy: together we can do more
Sharing: know about your peers
Centralize knowledge: holistic view of influencers
Traceability: record the inputs to decision-making
Find opportunities: reduce costs and optimize

Roadmap
User support
More trigger templates
More adaptors
  KrakeN
  Software catalogues
  Other format registries
  Other experiments information sources
Manual input (human knowledge)
Simulation
