SEAD Data Services
Jim Myers (myersjd@umich.edu)
Best Practices in Data Infrastructure Workshop
Cooperative agreement #OCI0940824

SEAD: Sustainable Environment - Actionable Data
- An NSF DataNet project started in October 2011
- An international resource for sustainability science
- A provider of light-weight Data Services based on novel technical and business approaches:
  - Supporting the long tail of research
  - Enabling active and social curation
  - Providing integrated lifecycle support for data
http://sead-data.net/
Margaret Hedstrom, PI; Praveen Kumar, co-PI; Jim Myers, co-PI; Beth Plale, co-PI

Why do we need Data Patriotism?
- "And so, my fellow Americans, ask not what your data can do for you, ask what you can do for your data."
- The future needs your data!
- Only you can prevent data loss!

Imagine Having to Ask: Why Acquire Data, Analyze It, and Draw Conclusions??
- Funding agencies urging researchers to acquire data!?!
- Publishers mandating that authors analyze their data?!!
- An outcry for scientists to put more effort into drawing conclusions to help drive progress and economic growth!?!

SEAD Overarching Concepts
- Self-service
  - Simple basic defaults, rich customization, branding, and automation options
- Leverage incremental, informal active use to capture data and metadata from first sources
- Provide data-related (metadata-driven) services to active producers, curators, and users of data
- Simplify and automate curation and preservation processes using captured information and context
- Leverage existing institutional repository technologies and organizations to provide long-term storage
Increase Value, Lower Costs, Increase Immediacy

SEAD Data Services: Secure Project Spaces
Your project can request a secure, branded Project Space in the cloud.

SEAD Data Services: Collaborative Data Management
Drag and drop your data, or upload large collections from disk.

SEAD Data Services: Annotation and Collaboration
- Add Tags, Formal Terms (configurable), comments
- Use them to search, coordinate with colleagues

SEAD Data Services: Context, Provenance, and Relationships
Create relationships
Or let your software write them!
- RESTful API
- Library
Relationship labels shown: Cited in, Has Correction, Uses Calibration, Generated Using, Uses Procedure
http://myproj.org/wiki/p7
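To make "let your software write them" concrete, here is a minimal sketch of how a client might assemble a relationship record before sending it to a REST API. The slide does not describe SEAD's actual API, so the `Relationship` class, the snake_case predicate keys, the JSON payload shape, and the `dataset-42` identifier are all hypothetical; only the relationship labels and the URL come from the slide, and pairing that URL with "Uses Procedure" is an illustrative guess.

```python
import json
from dataclasses import dataclass, asdict

# Relationship labels as they appear on the slide. The snake_case keys
# and the payload shape below are illustrative assumptions, not SEAD's API.
PREDICATES = {
    "cited_in": "Cited in",
    "has_correction": "Has Correction",
    "uses_calibration": "Uses Calibration",
    "generated_using": "Generated Using",
    "uses_procedure": "Uses Procedure",
}


@dataclass(frozen=True)
class Relationship:
    subject: str    # identifier of the dataset being described
    predicate: str  # key into PREDICATES
    target: str     # related dataset, document, or web page

    def __post_init__(self):
        # Reject labels that are not in the configured vocabulary.
        if self.predicate not in PREDICATES:
            raise ValueError(f"unknown relationship: {self.predicate!r}")


def to_payload(rel: Relationship) -> str:
    """Serialize a relationship as JSON, ready to send to a REST endpoint."""
    body = asdict(rel)
    body["label"] = PREDICATES[rel.predicate]
    return json.dumps(body, sort_keys=True)


rel = Relationship(
    subject="dataset-42",                # hypothetical dataset ID
    predicate="uses_procedure",
    target="http://myproj.org/wiki/p7",  # example URL from the slide
)
payload = to_payload(rel)
print(payload)
```

Validating the predicate on the client keeps free-form labels out of the provenance graph; a real deployment would presumably take the vocabulary from the project's configuration rather than a hard-coded table.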

SEAD Data Services: Finding a Suitable Repository
Find a Repository
And click to publish!
- Cloud/File storage for large collections (# files or total size)
- Institutional Repositories for moderate collections
IU SDA
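The matching rule on this slide (large collections, by number of files or total size, go to cloud/file storage; moderate ones go to institutional repositories) can be sketched as a small function. This is only an illustration: the slide gives no numeric cut-offs, so both thresholds, the `Collection` type, and the function name are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Collection:
    file_count: int
    total_bytes: int


# Hypothetical cut-offs; the slide gives no numeric thresholds.
MAX_FILES_FOR_IR = 10_000
MAX_BYTES_FOR_IR = 50 * 1024**3  # 50 GiB


def suggest_repository_type(c: Collection) -> str:
    """Return the kind of repository suited to a collection's size."""
    # A collection is "large" if it exceeds either limit.
    if c.file_count > MAX_FILES_FOR_IR or c.total_bytes > MAX_BYTES_FOR_IR:
        return "cloud/file storage"
    return "institutional repository"


small = Collection(file_count=12, total_bytes=3 * 1024**2)
large = Collection(file_count=135_000, total_bytes=2 * 1024**3)
print(suggest_repository_type(small))
print(suggest_repository_type(large))
```

In practice such a matchmaker would likely weigh more than size (for example subject area or access policy), but size alone is the criterion the slide names.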

SEAD Data Services: Publishing and Discovery
Publication with SEAD means:
- Persistent Data ID (i.e., DOI), e.g. http://dx.doi.org/10.5072/fk2ff3pk7w
- Multiple Repository options, including a lightweight, standards-based package for long-term storage (BagIt, OAI-ORE, JSON-LD)
- Registration with the DataONE Catalog
- Branded Published Data page to link in your website
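As a rough illustration of what a lightweight, standards-based package involves, the sketch below writes and verifies a minimal BagIt bag using only the Python standard library. It follows the basic layout from the BagIt specification (a `bagit.txt` declaration, a `data/` payload directory, and a checksum manifest); it is not SEAD's packaging code, the slide does not describe the exact package layout SEAD produces, and the OAI-ORE/JSON-LD metadata is left out entirely. The function names and file names are made up for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, files: dict[str, bytes]) -> None:
    """Write `files` (relative name -> content) as a minimal BagIt bag."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for name, content in sorted(files.items()):
        path = data_dir / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # Manifest entries are "<checksum> <path relative to the bag root>".
        manifest_lines.append(f"{digest}  data/{name}")
    # Bag declaration required by the BagIt specification (RFC 8493).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )


def verify_bag(bag_dir: Path) -> bool:
    """Recompute each payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "example-bag"
    make_bag(bag, {"readme.txt": b"hello"})
    ok_before = verify_bag(bag)
    # Simulate corruption of the payload after packaging.
    (bag / "data" / "readme.txt").write_bytes(b"tampered")
    ok_after = verify_bag(bag)
```

The checksum manifest is what lets a repository, or a later reader, confirm that the payload is unchanged without trusting the system that produced it.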

SEAD Interacts with:
- Projects & their websites
- Authentication services (Google, ORCID, local, ...)
- Researcher Profile Services (ORCID, (VIVO), ...)
- Data Sources (TerraPop, NEON, any, ...)
- Data Processors (BrownDog, GeoServer, image/video players, ...)
- Repositories (DSpace, Fedora, Cloud, openICPSR, ...)
- Discovery Services (DataONE, DataCite, ...)
- Applications/Services (R, ECube Geosemantics, VIC/DFC, ...)
-- without deep agreement on architectural/model details
-- with mechanisms to help interoperability/synthesis

SEAD as NDS Infrastructure
[Architecture diagram; labels recovered from the slide:]
- Apps; Browser
- SEAD Project Spaces: Sharing, Curation, Publication, Reuse; Web App/REST API
- Earth Cube Geosemantics Services; Computational Environments; Data/MD Tools; BrownDog Svcs (RabbitMQ); Domain Cyberinfrastructures
- SEAD Publishing: Matchmaking, Publishing, CI Ecosystem Integration (Profiles, papers, catalogs, events, provenance, ...); REST API
- Profiles/Pubs; Long-term Repositories; NDS/Reference Publisher; DOI Landing Pages Webapp; Publishing Agent App; File System/Cloud Store; Third-Party Repositories; Profile Sources, SSO; Catalogs (DataONE, ...)

SEAD as a Data Source
Initial Communities:
- Sustainability Research (Ecological, Social)
- Centers to grad students and county park managers
- 2-3 M files, 2 TB+, 20+ groups
- Rescues & new data
- 1 file to 135K files per publication
- Basic metadata to rich provenance and documentation
Longer Term:
- Open long tail of research projects needing hosted data services
- Core share/curate/publish capabilities for custom CI
- Observational, Experimental, Modeling, Analytics Data across disciplines
- Rich publications (e.g. for collaboration, integration, reproducibility)

SEAD Data Services: Best Practices
Technical
- Design
  - Semantic Content Management
  - Multi-service architecture
- Infrastructure
  - Open RESTful APIs
  - Interoperable Data Publications (ORE, BagIt+: Join the discussion/work at NDS!)
Socio-economic
- Active/Social Curation: positive ROI for all stakeholders
- ROI-justified capabilities
- Agile Development: enabling users, not demos
- Self-service/self-control