Implementation of Open-World, Integrative, Transparent, Collaborative Research Data Platforms: the University of Things (UoT)


Prof. Peter Fox (pfox@cs.rpi.edu, @taswegian, #twcrpi, ORCID: 0000-0002-1009-7163)
Tetherless World Constellation Chair, Earth and Environmental Science / Computer Science / Cognitive Science / IT and Web Science
Rensselaer Polytechnic Institute, Troy, NY, USA
And the Deep Carbon Observatory Data Science Team
CGA, Harvard, April 27, 2018

What to expect
* Inevitable context, history + perspective
* Deep Carbon Observatory (Integration and Collaboration)
* Data Science Platform for an international science community
* Lots of RED, <WHITE> and BLACK
* data.rpi.edu V2 (Integration and Transparency)
* Where we are headed
* Integration, Transparency and Collaboration
* University Infrastructure for Data Science

Working premise :== Mission Statement
Scientists (actually ANYONE) should be able to access a global, distributed knowledge base of scientific data and information that:
* appears to be integrated
* appears to be locally available
* is in a language (written, programming, or science) that is understandable and can be shared
Data intensive: volume, complexity, mode, scale, heterogeneity, in an OPEN WORLD

Deep Carbon Observatory (DCO)
We are dedicated to achieving transformational understanding of carbon's chemical and biological roles in Earth.
www.deepcarbon.net

Collaboration and Integration needs
* Enable DCO team leaders to create new groups and associate a number of content types (documents, discussions, blog posts, tasks, links, and bibliographic entries) with the group, as well as simple event management (a private event calendar for the group) and embedding of external services (e.g. and esp. Google Calendar)
* more (data, publications, projects)
* a Knowledge Network and a Virtual Organization (> 1000 people)

[Figure: Ecosystem metaphor. Labels: Producers, Consumers; Data, Information, Knowledge; Creation, Gathering, Presentation, Organization, Integration, Conversation; Context, Experience]

-> DCO Data Science Platform 2012

Science Network of Things (Objects) 2012
deepcarbon.net, info.deepcarbon.net, data.deepcarbon.net, dx.deepcarbon.net


Dataset Browser, People, Field Sites 2015

All information is linked and traceable! 2013


State to date 2014
* Knowledge network implements both the collaboration and the integration; reporting implements the transparency
* It's being USED
* Many means of population: user generation, machine generation
* Contributing these enhancements back to open-source communities (CKAN, VIVO)

There's more: Jupyter notebooks on top 2016
And this: https://news.rpi.edu/content/2018/04/23/applying-network-analysis-natural-history

data.rpi.edu (V2) 2013

Insert data.rpi screen shots 2013


Internal transparency and integration 2013

Thus
* Integrative semantics
* Transparent semantics
* Collaborative semantics
* Application integration
* Yep, semantics
So where are we headed?

Research-grade but not University-grade
* Adoption of RDA outputs/recommendations: Data Type Registry, Permanent ID Types, Dynamic Data Citation*, Scholix*
* Improvements to VIVO
* Science network of things
* CIO's approach: "We only run the applications we know how to run"
* Library (not a research library): helped to start; hurt in University adoption (hope^)
Legend: * underway; ^ New Library Director

Progress toward a University of Things 2018
pfox@cs.rpi.edu and the DCO Data Science Team
@taswegian #twcrpi
http://tw.rpi.edu
http://tw.rpi.edu/web/project/dco-ds
http://deepcarbon.net

Garden shed

Framework v. systems v. platforms
Rough definitions:
* Systems have very well-defined entry and exit points. A user tends to know when they are using one. Options for extensions are limited and usually require engineering.
* Frameworks have many entry and use points. A user often does not know when they are using one. Extension points are part of the design.
* Platforms ~ arise from frameworks.

VIVO Extension: Shibboleth Single-Sign-On 2013

VIVO Extension: Dataset deposit in attached data repository 2012
[Flowchart: node labels recovered from the slide; the branch structure is not fully recoverable]
* Begin
* Need DCO-ID? (YES / NO)
* Generate & register DCO-ID (unique suffix, blank URL)
* Collect CKAN metadata & generate URL
* Review DCO-ID & CKAN metadata
* Revise CKAN metadata
* Update DCO-ID (map the DCO-ID to CKAN URL)
* Deposit in CKAN & generate URL to data
* Add URL (to data in external repository)
* Revise metadata
* Update DCO-ID record
* End
[Other labels on the same slide: Deposited DCO data or URL to external data; DCO-ID & DCO-ID metadata; External data; Object without data URL; Data deposit (DCO-ID generation); Includes multi-level metadata collection (YES / NO); Includes persistent identifier (YES / NO); Includes interaction with dedicated repository OR accepts third-party deposit details; URL to the downloadable data; Data Science]
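The register-then-map part of this workflow (issue a DCO-ID with a blank URL, deposit, then point the DCO-ID at the deposit) can be sketched roughly as follows. This is a minimal illustration, not the DCO implementation: the in-memory `DcoIdRegistry`, the UUID-based suffix, the `deposit_dataset` helper, and the `ckan.example.org` URL layout are all assumptions made for the example.

```python
import uuid
from typing import Dict, Optional


class DcoIdRegistry:
    """Hypothetical in-memory stand-in for the identifier registration service."""

    def __init__(self, prefix: str = "11121") -> None:
        self.prefix = prefix
        self._records: Dict[str, Optional[str]] = {}

    def register(self) -> str:
        """Issue an identifier with a unique suffix and a blank (None) URL."""
        # Assumption: a UUID stands in for the real suffix scheme.
        dco_id = f"{self.prefix}/{uuid.uuid4().hex}"
        self._records[dco_id] = None
        return dco_id

    def update(self, dco_id: str, url: str) -> None:
        """Map an already registered identifier to the URL of the deposit."""
        if dco_id not in self._records:
            raise KeyError(f"unknown identifier: {dco_id}")
        self._records[dco_id] = url

    def resolve(self, dco_id: str) -> Optional[str]:
        """Return the mapped URL, or None while the URL is still blank."""
        return self._records[dco_id]


def deposit_dataset(registry: DcoIdRegistry, metadata: dict, ckan_base: str) -> str:
    """Register first, then deposit, then point the identifier at the deposit."""
    dco_id = registry.register()
    # Placeholder for the real CKAN deposit: the URL is simply derived from
    # a 'name' field in the metadata (hypothetical layout).
    url = f"{ckan_base.rstrip('/')}/dataset/{metadata['name']}"
    registry.update(dco_id, url)
    return dco_id
```

Registering before depositing means the identifier can be embedded in the metadata that is reviewed and revised, and only the final URL mapping has to be updated at the end.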

We identify everything = DCO-ID 2012
* Two part: all objects are issued Handles, and all published objects are also issued DOIs
* DCO issues Handles; the registration number is 11121
* We obtain DOIs from DataCite
* If it is a person, we support ORCID, ResearcherID, ScopusID, eRA Commons, etc.
* You may see (note EPIC-style identifier syntax): http://hdl.handle.net/11121/5676-3964-8313-5126-cc and http://dx.deepcarbon.net/11121/5676-3964-8313-5126-cc
* E.g. adding bibliography is easy: just enter the DOIs, or paste a BibTeX record, and we do the rest; same for people (ORCID, ResearcherID, etc.) -> open world linked to other sources
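Both URLs above carry the same prefix/suffix Handle, so the two forms can be converted into each other. A small sketch, assuming only what the slide shows (prefix 11121 and the two resolver hosts); the `DcoId` class and `parse_dco_id` function are hypothetical names, and no checksum validation of the suffix is attempted because the slide does not describe one.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

DCO_HANDLE_PREFIX = "11121"  # DCO's Handle registration number, from the slide

# Resolver base URLs as they appear on the slide.
RESOLVERS = {
    "handle": "http://hdl.handle.net",
    "dco": "http://dx.deepcarbon.net",
}


@dataclass(frozen=True)
class DcoId:
    prefix: str
    suffix: str

    @property
    def handle(self) -> str:
        """The bare prefix/suffix form of the identifier."""
        return f"{self.prefix}/{self.suffix}"

    def url(self, resolver: str = "handle") -> str:
        """Resolvable URL for the chosen resolver ('handle' or 'dco')."""
        return f"{RESOLVERS[resolver]}/{self.handle}"


def parse_dco_id(text: str) -> DcoId:
    """Accept a bare handle ('11121/...') or a resolver URL and return a DcoId."""
    text = text.strip()
    path = urlparse(text).path if "://" in text else text
    prefix, sep, suffix = path.strip("/").partition("/")
    if not sep or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {text!r}")
    if prefix != DCO_HANDLE_PREFIX:
        raise ValueError(f"unexpected Handle prefix: {prefix!r}")
    return DcoId(prefix, suffix)
```

For example, `parse_dco_id("http://dx.deepcarbon.net/11121/5676-3964-8313-5126-cc").url("handle")` gives the hdl.handle.net form of the same identifier.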


VIVO Extension: Retrieval of DOI metadata for publications from CrossRef 2013
* Expedites entry of e.g. journal articles by retrieving metadata based on DOI
* Preserves author rank
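As an illustration of the idea (not the VIVO extension's actual code), the sketch below reduces a Crossref-style work record to the fields a publication entry needs and numbers the authors in list order. It assumes the JSON shape returned by the public Crossref REST API (`https://api.crossref.org/works/{doi}`, with the record under the `message` key) and that the author list order reflects the byline. `SAMPLE_MESSAGE` is made-up data, and no network request is made.

```python
from urllib.parse import quote

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the Crossref REST API URL for a DOI (no request is made here)."""
    return CROSSREF_WORKS + quote(doi, safe="/")


def summarize_work(message: dict) -> dict:
    """Reduce a Crossref 'message' object to the fields a publication entry needs."""
    authors = []
    for rank, author in enumerate(message.get("author", []), start=1):
        parts = [author.get("given"), author.get("family")]
        name = " ".join(p for p in parts if p) or author.get("name", "")
        # rank is the 1-based position in the author list
        authors.append({"rank": rank, "name": name})
    return {
        "doi": message.get("DOI"),
        "title": (message.get("title") or [None])[0],
        "journal": (message.get("container-title") or [None])[0],
        "authors": authors,
    }


# Made-up record for illustration only; not a real publication.
SAMPLE_MESSAGE = {
    "DOI": "10.5555/12345678",
    "title": ["An Example Article"],
    "container-title": ["Journal of Examples"],
    "author": [
        {"given": "Ada", "family": "First", "sequence": "first"},
        {"given": "Ben", "family": "Second", "sequence": "additional"},
    ],
}

summary = summarize_work(SAMPLE_MESSAGE)
```

Storing the rank explicitly, rather than relying on list order downstream, keeps the author order intact once the authors are stored as separate linked records.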

Core and Framework Semantics - Multi-tiered interoperability 2012
Mediation! Mediation! Mediation!