The Curator's Approach to Data Management and Sustainability

Nic Weber & Megan Senseney
Center for Informatics Research in Science & Scholarship, Graduate School of Library & Information Science, University of Illinois at Urbana-Champaign
Digital Humanities at Oxford Summer School, 14-18 July 2014

Agenda
- Data management as a DH technique: valued ends, available resources
- DMP: agency mandates, DMP beyond two pages
- Sustainability: significant properties, 2 case studies in DH sustainability

I'm trying to deflate the idea of digital humanities from a domain to an underlying set of practices 6 July DH 2014

DM as a DH Technique Many different Techniques

Data Management as a DH Technique
"the ensemble of practices by which one uses available resources in order to achieve certain valued ends." (Harold Lasswell)

Valued Ends
- Preservation of knowledge (material artifacts that are produced, as well as ways of knowing)
- Maximize the value of public investment
- Increase the efficiency of doing digital humanities research, both immediate and long-term

The Royal Society Science Policy Centre. (2012). Science as an open enterprise. Page 60.

Data management
- Is highly personal
- Interpersonal when collaborating
- Intrapersonal in our relationship with institutions, organizations and funding agencies


Data management techniques include concerns of
- Planning (more in a bit) / Costing
- Documentation
- Formatting
- Storage
- Copyright / IP / Licensing

Documentation

Documentation: tricks and tips
- Include a header line that describes the variables as the first line in the table.
- Use plain ASCII text for your file names, variable names, and data values.
- Record naming schemes (which means first developing naming schemes).
- When you export from an analysis environment (e.g. SPSS, R, Gephi), record transformations in a separate readme_(filename).txt file.
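The tips above can be sketched in a few lines of Python. This is only an illustration, not something from the slides: the function names, the example file name, and the layout of the readme are all assumptions, and the readme is named after the data file without its extension.

```python
import csv
from pathlib import Path


def is_plain_ascii(text: str) -> bool:
    """Return True if the text contains only plain ASCII characters."""
    return text.isascii()


def write_table_with_readme(folder, filename, header, rows, transformations):
    """Write a CSV whose first line names the variables, plus a readme file.

    The readme layout is hypothetical; adapt it to your own naming scheme.
    """
    folder = Path(folder)

    # Refuse file names and variable names that are not plain ASCII.
    bad_names = [name for name in [filename, *header] if not is_plain_ascii(name)]
    if bad_names:
        raise ValueError(f"non-ASCII names: {bad_names}")

    # The header line goes first; encoding="ascii" also rejects non-ASCII values.
    data_path = folder / filename
    with data_path.open("w", newline="", encoding="ascii") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)

    # Record the transformations in a separate readme_(filename).txt file.
    readme_path = folder / f"readme_{data_path.stem}.txt"
    lines = [f"File: {filename}", "Variables: " + ", ".join(header), "Transformations:"]
    lines += [f"  {i}. {step}" for i, step in enumerate(transformations, start=1)]
    readme_path.write_text("\n".join(lines) + "\n", encoding="ascii")
    return data_path, readme_path
```

Calling write_table_with_readme(".", "letters_1850.csv", ["letter_id", "year"], [["L001", 1850]], ["Exported from R"]) would write letters_1850.csv and readme_letters_1850.txt side by side.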

Storage & Formatting!

Storage: DIY Cyberinfrastructure

Formatting & Storage: Tricks and Tips
- Store data in nonproprietary formats (e.g., a comma-delimited text file, .csv); proprietary software (e.g., Excel, Access) can become unavailable, whereas plain text files stay readable without it.
- During analysis, store an uncorrected (raw) data file. Do not make any corrections to this file; make corrections in a script instead.
Modified from: https://www.nceas.ucsb.edu/content/simple-guidelines-effective-data-management
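One way to keep the raw file untouched while still correcting errors is sketched below in Python. It is a minimal example under stated assumptions: the CORRECTIONS table, the column names, and the function names are hypothetical, and the read-only flag is only a guard against accidental edits, not a preservation measure.

```python
import csv
import os
import stat

# Hypothetical corrections, keyed by (row identifier, column name).
CORRECTIONS = {
    ("L002", "year"): "1851",
}


def protect_raw_file(raw_path):
    """Mark the raw file read-only so it is not edited by accident."""
    os.chmod(raw_path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)


def write_corrected_copy(raw_path, corrected_path, id_column, corrections):
    """Read the raw CSV, apply the corrections, and write a separate file.

    Returns the number of corrections applied. The raw file is only read.
    """
    with open(raw_path, newline="", encoding="utf-8") as source:
        reader = csv.DictReader(source)
        fieldnames = reader.fieldnames
        rows = list(reader)

    applied = 0
    for row in rows:
        for (row_id, column), value in corrections.items():
            if row[id_column] == row_id:
                row[column] = value
                applied += 1

    with open(corrected_path, "w", newline="", encoding="utf-8") as target:
        writer = csv.DictWriter(target, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return applied
```

Because every correction lives in the script, the corrected file can be regenerated from the raw file at any time, and the script itself documents what was changed.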

Copyright / IP

IP: Tricks of the Trade
Melissa Levine's checklist in the DH Curation Guide: http://guide.dhcuration.org/legal/policy/#p05

Data Management Planning
- Is highly social
- Dialectic (optimal vs. practical)
- Plans change

DMP Mandates (Funding Agencies)

AHRC
- Peer reviewed: Yes
- Components: Summary of Digital Outputs and Digital Technologies; Technical Methodology; Standards and Formats; Hardware and Software; Data Acquisition, Processing, Analysis and Use; Technical Support and Relevant Experience; Preservation, Sustainability and Use; Preserving Your Data; Ensuring Continued Access and Use of Your Digital Outputs
- Enforcement: Unclear

NEH
- Peer reviewed: Yes
- Components: Expected types of data; Period of data retention; Data forms and dissemination; Data storage and preservation
- Enforcement: Yes

EU
- Peer reviewed: No
- Components: Data set reference and name; Data set description; Standards and metadata; Data sharing; Archiving and preservation
- Enforcement: Sliding

AHRC example project: Kitchen Cosmology Project, University of Bristol. PI: Dr. Rita Langer. Link: http://bit.ly/1n0evun

NEH example project: "A unified approach to preserving cultural software objects and their development histories", UC Santa Cruz. PI: Noah Wardrip-Fruin. Link: http://1.usa.gov/1knxm8n

completed worksheets

Costing: Tricks and Tips
4C overview of 10 curation cost models: http://bit.ly/1ldmuft (a short description of each model and a presentation of its core features)

More tricks of the trade
- Advertise your data
- Say how you would like it to be cited (paper? data? both?)
- State known limitations (fit-for-purpose)
- Rely on journals, repositories and colleagues for guidance
- Don't rely on journals, repositories or colleagues for guidance

How do projects end? SUSTAINABILITY

Why this matters to DC
Fundamental questions of digital preservation:
1. What must you retain to ensure the integrity and authenticity of the digital object?
2. What can you lose without potential implications?

Significant Properties
"characteristics of an information object that must be maintained to ensure that object's continued access, use, and meaning over time as it is moved to new technologies" (Wilson, 2007).

Five categories of SPs
- Content
- Context
- Rendering
- Structure
- Behavior

Criteria for deciding significance Grace, S. & Knight, G. (2008)

Case study 1: Sustainability
GLOBALIZATION AND AUTONOMY ONLINE COMPENDIUM
Rockwell, Day, Yu, and Engel (2014). Burying Dead Projects: Depositing the Globalization Compendium. Digital Humanities Quarterly, 8(2). http://www.digitalhumanities.org/dhq/vol/8/2/000179/000179.html

Then we came to (planning for) the end
http://globalautonomy.ca/global1/index.jsp
End of what?
- XML files with content;
- A MySQL bibliographic database;
- A metadata database of the content for generating topical pages and for searching;
- A full text index for searching the text;
- The code that handles the dynamic generation of the site, the searching, linking, and the XSL transforms;
- Some HTML pages and CSS stylesheets;
- And various images that are embedded in pages.

"The experience of the Compendium is that the intellectual work is not only in the individual articles, or even in the bibliographic data; it is in the interaction between these, mediated by code and in the user experience." (Rockwell et al., 2014)

What was deposited?
- Content: the texts, including bibliography and glossary. We also considered the text on the HTML pages to be content.
- Code: HTML and CSS, including the XSLT code that generated much of the interface.
- Process: materials (but not all) that document the editorial processes, including the editorial backend that, strictly speaking, was not part of the Compendium as experienced.
- The User Experience: information about the experience of the Compendium as an interactive work, captured by writing a narrative along with screenshots of typical use of the Compendium, stored as PDFs.

Five categories of SPs
- Content
- Context
- Rendering
- Structure
- Behavior

Rockwell's categories
- Content
- Code
- Process
- User Experience

Case study 2: Sustainability
PERSEUS DIGITAL LIBRARY

How would Perseus end? (Hint: not by beheading Medusa)

RESOURCE LIST
- Rockwell, Day, Yu, and Engel (2014). Burying Dead Projects: Depositing the Globalization Compendium. Digital Humanities Quarterly, 8(2). http://www.digitalhumanities.org/dhq/vol/8/2/000179/000179.html
- Grace, S. & Knight, G. (2008). What are significant properties and why should I care? Presentation delivered at Digital Curation 101, 7 October 2008, Edinburgh, Scotland.