National Materials Data Initiatives

Similar documents
Town Hall Meeting on the National and International Data Infrastructure. Jim Warren, NIST Lisa Friedersdorf, NNCO

The Materials Genome Initiative and the Data Infrastructure

ICME: Status & Perspectives

A Perspective on MGI Networking

Callicott, Burton B, Scherer, David, Wesolek, Andrew. Published by Purdue University Press. For additional information about this book

Reproducibility and FAIR Data in the Earth and Space Sciences

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM

Welcome to the Pure International Conference. Jill Lindmeier HR, Brand and Event Manager Oct 31, 2018

Data Management at NIST

GEOSS Data Management Principles: Importance and Implementation

Indiana University Research Technology and the Research Data Alliance

Conducting a Self-Assessment of a Long-Term Archive for Interdisciplinary Scientific Data as a Trustworthy Digital Repository

Mercè Crosas, Ph.D. Chief Data Science and Technology Officer Institute for Quantitative Social Science (IQSS) Harvard

The Materials Commons A Novel Information Repository and Collaboration Platform for the Materials Community

FAIR-aligned Scientific Repositories: Essential Infrastructure for Open and FAIR Data

The Canadian Information Network for Research in the Social Sciences and Humanities.

Towards FAIRness: some reflections from an Earth Science perspective

Dataverse and DataTags

Data Curation Profile Human Genomics

The NIH Big Data to Knowledge Initiative: Raising the Prominence of Data

For Attribution: Developing Data Attribution and Citation Practices and Standards

EGI federated e-infrastructure, a building block for the Open Science Commons

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)

How to use Water Data to Produce Knowledge: Data Sharing with the CUAHSI Water Data Center

Open data and scientific reproducibility

Launching the. Data Curation Network NDS/MBDH 2018

Data publication and discovery with Globus

Writing a Data Management Plan A guide for the perplexed

The library s role in promoting the sharing of scientific research data

Assessing the FAIRness of Datasets in Trustworthy Digital Repositories: a 5 star scale

Paving the Rocky Road Toward Open and FAIR in the Field Sciences

Reproducibility and Reuse of Scientific Code Evolving the Role and Capabilities of Publishers

DATA MANAGEMENT PLANS Requirements and Recommendations for H2020 Projects. Matthias Razum April 20, 2018

Enabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services

Data Curation Profile Movement of Proteins

EUDAT. A European Collaborative Data Infrastructure. Daan Broeder The Language Archive MPI for Psycholinguistics CLARIN, DASISH, EUDAT

I data set della ricerca ed il progetto EUDAT

TEXT MINING: THE NEXT DATA FRONTIER

Cyber Secure Dashboard Cyber Insurance Portfolio Analysis of Risk (CIPAR) Cyber insurance Legal Analytics Database (CLAD)

Jisc Research Data Shared Service

re3data.org - Making research data repositories visible and discoverable

How to make your data open

The Materials Data Facility

Striving for efficiency

Digital repositories as research infrastructure: a UK perspective

RADAR A Repository for Long Tail Data

VI-SEEM Data Repository. Presented by: Panayiotis Charalambous

Match data set availability to data resource requirements, including gap analysis and remediation assistance.

Engaging and Connecting Faculty:

The Semantic Institution: An Agenda for Publishing Authoritative Scholarly Facts. Leslie Carr

Making research data repositories visible and discoverable. Robert Ulrich Karlsruhe Institute of Technology

Applying Archival Science to Digital Curation: Advocacy for the Archivist s Role in Implementing and Managing Trusted Digital Repositories

Developing a Research Data Policy

ACCI Recommendations on Long Term Cyberinfrastructure Issues: Building Future Development

NOMAD Metadata for all

SHARING YOUR RESEARCH DATA VIA

Database Infrastructure to Support Knowledge Management in Physicochemical Data - Application in NIST/TRC SOURCE Data System

Data Curation Profile Plant Genomics

Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment

Research Data Preserve, Share, Reuse, Publish, or Perish Mark Gahegan Director, Centre for eresearch 24 Symonds St

Web of Science. Platform Release Nina Chang Product Release Date: March 25, 2018 EXTERNAL RELEASE DOCUMENTATION

Data management Backgrounds and steps to implementation; A pragmatic approach.

Feed the Future Innovation Lab for Peanut (Peanut Innovation Lab) Data Management Plan Version:

The Data Curation Profiles Toolkit: Interview Worksheet

Supporting Data Stewardship Throughout the Data Life Cycle in the Solid Earth Sciences

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Deliverable 6.4. Initial Data Management Plan. RINGO (GA no ) PUBLIC; R. Readiness of ICOS for Necessities of integrated Global Observations

Data Management Checklist

UC Irvine LAUC-I and Library Staff Research

European Open Science Cloud Implementation roadmap: translating the vision into practice. September 2018

EUDAT. Towards a pan-european Collaborative Data Infrastructure

EUDAT and Cloud Services

Building on Existing Communities: the Virtual Astronomical Observatory (and NIST)

BHL-EUROPE: Biodiversity Heritage Library for Europe. Jana Hoffmann, Henning Scholz

The Ohio State University's Knowledge Bank: An Institutional Repository in Practice

28 September PI: John Chip Breier, Ph.D. Applied Ocean Physics & Engineering Woods Hole Oceanographic Institution

Introduction to Data Management for Ocean Science Research

Inge Van Nieuwerburgh OpenAIRE NOAD Belgium. Tools&Services. OpenAIRE EUDAT. can be reused under the CC BY license

The Curator s Approach to Data Management and Sustainability

National Science and Technology Council. Interagency Working Group on Digital Data

DuraSpace FAIRness and GDPR

Developing the ERS Collaboration Framework

The Model-Driven Semantic Web Emerging Standards & Technologies

e-infrastructures in FP7 INFO DAY - Paris

Making data publication a first class research output

Materials Registry Working Group Meeting

The RMap Project: Linking the Products of Research and Scholarly Communication Tim DiLauro

National Data Sharing and Accessibility Policy-2012 (NDSAP-2012)

OpenAIRE From Pilot to Service The Open Knowledge Infrastructure for Europe

Data Management Plans. Sarah Jones Digital Curation Centre, Glasgow

Data Curation Profile Cornell University, Biophysics

Enabling the Future of Connectivity. HITEC 2016 Tech Talk

EUDAT. Towards a pan-european Collaborative Data Infrastructure

Advancing code and data publication and peer review. Erika Pastrana, PhD Executive Editor, Nature Journals ALPSP_Sept 2018

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment

Evolving the digital library for digital scholarship enablement

Data Curation Profile Food Technology and Processing / Food Preservation

SEAD Data Services. Jim Best Practices in Data Infrastructure Workshop. Cooperative agreement #OCI

Transcription:

National Materials Data Initiatives Chuck Ward Integrity Service Excellence Materials & Manufacturing Directorate Approved for public release, distribution is unlimited. 88ABW-2015-2270

Overview Policy drivers Discipline-based approaches Repository efforts Collaborative environments Supporting technology/efforts We don t have a big data problem... we have a big problem with data. 2

OSTP Direction on Public Access Make publically available the products of research supported wholly or in part by Federal funding: Peer-reviewed scholarly publications Digitally formatted scientific data should be stored and publicly accessible to search, retrieve, and analyze 3

Agency Public Access Plans Digitally Formatted Scientific Data Should be stored and publicly accessible Data management plans (DMP) with proposal Deposit data per proposal DMP Should be stored and publicly accessible DMP with proposal Deposit data per publication policy or proposal DMP Should be stored and publicly accessible DMP with proposal Store data in public repositories, centralized data catalog/locator at DTIC 4

General DMP Help is Available 5

Typical Agency DMP Elements DMPs are key to defining the who, what, where, how, and why associated with the management of research data Expected data Preservation criteria File formats Data & metadata standards Data storage Dissemination Long-term access Ethics/privacy/proprietary/IP/security Agency DMP requirements are necessary but insufficient to alone ensure the reusability of published research data 6

Agency Based Support http://www.commondataelements.ninds.nih.gov/ 7

Community Based Support 8

Community Based Support http://isatab.sourceforge.net/tools.html 9

Materials Genome Initiative Leading a culture shift in materials research Integrating experiment, computation, and theory CMS AMM PMS&C DMREF ICMSE CAMD ICME Making digital data accessible Creating a world-class materials workforce Enabling the discovery, development, manufacturing, and deployment of advanced materials at least twice as fast as possible today, at a fraction of the cost.

Requirements for a Materials Data Infrastructure Repositories for materials data Accessible, federated, affordable Standards for data exchange Formats, data and metadata standards, vocabularies/ontologies Data quality metrics Pedigree, provenance, verification, validation, uncertainty, sensitivity Citation and attribution protocols Persistent identification Intellectual property and liability determinations Export control education, Licensing clarity Findable, Accessible, Interoperable, Reusable

MGI Data Workshop, July 2014 Priorities Community Develop and deploy standards for data and federated/collaborative environments Communicate value and need for digital materials data Define critical data to be compiled Engage Community Explore and leverage data solutions developed outside materials community Train students in these emerging areas Establish community norms for publishing materials data Government Lead data strategy/approach development Incentivize data deposition Facilitate data deposition (policy and technology) Support data sharing and deposition (provide resources) Support data creation Enhance data management competency Provide quality datasets for model development/validation http://dx.doi.org/10.6028/nist.ir.8038 12

Agency Data Gateways http://maptis.nasa.gov/ http://www.osti.gov/dataexplorer/ Very rudimentary Limited data available Restricted access Engineering-oriented Extensive collection 13

NIST Materials Data Repository https://materialsdata.nist.gov/dspace/xmlui/

Structural Data Demonstration Project: Al6061 Goal: Establish well-pedigreed and curated demonstration datasets for non-proprietary metallic structural materials data over all length scales. March 2014: Phase 1 release. June 2014: Phase 2 release. Dec 2014: Project Completion NIST s role Provide data schemas and meta-data formats for diffusion and phase equilibria data. Provide sample diffusion and phase equilibria data for the Al-Mg-Si system. Use expanded TRC Guided Data Capture program with available binary and ternary phase equilibria literature Expand use and implementation of DSpace Repository Link with developing ontology and semantic web tools

DOE Vehicle Technologies Supporting the MII Objectives: Generate thermodynamic, kinetic, and corrosion data for automotive Mg die casting alloys to fill significant gaps in the reported properties and to enable design of high performance alloys. Partner with NIST to structure data and deliver via NIST Dspace repository. PI: J. Allison Coupled modeling and experiment to determine liquid- and solid-state kinetics in die castings DOE Funding: $600k PI: K. Sieradzki Synthetic microstructures and atomistic modeling to explore microstructure effects on bulk corrosion DOE Funding: $500k PI: J.C. Zhao, A. Luo High-throughput measurement of binary, ternary, and quaternary Mg alloy kinetics in liquid and solid DOE Funding: $600k PI: M. Horstemeyer Model development and experimental validation for coupled H 2 evolution and corrosion damage model DOE Funding: $500k PI: A. Rohatgi Dynamic-TEM measurement of Mg liquid- and solidstate kinetics with ~500ns resolution DOE Funding: $500k PI: G. Song Systematic experimental test of passivation behavior for wide range of Mg-X solid solutions DOE Funding: $600k All PIs will work with NIST to determine best format, content, and meta-data All PIs will upload project data to a NIST repository where it will be publicly available, searchable, useable, etc. All data will be assigned a persistent identifier for citation and connection with publications 16 Vehicle Technologies Program eere.energy.gov

National Data Service National Center for Supercomputing Applications http://www.nationaldataservice.org 17

Project-based repositories http://www.mrl.ucsb.edu:8080/datamine/thermoelectrics.jsp http://hydrogenmaterialssearch.govtools.us/ Issues: data discoverability, machine processing, data curation/longevity 18

Computationally-derived Data Repositories MIT & LBNL 58,123 compounds 41952 band structures 1,243 elastic tensors Thermodynamic props Crystal structure Phase diagrams REST API Duke 51,367 compounds 308,975 calculations Thermodynamic props Magnetic props Crystal structure Phase diagrams REST API 19

Collaborative Environments Univ. Michigan Materials Commons Store materials data (& provenance) Collaborate and share data Search and use data, REST API Univ. Illinois T2C2 Curator, real-time acquisition and curation of data from instruments T2C2 Coordinator, filter, identify correlate data Timely and Trustworthy Curating and Coordinating Data Framework (T2C2) Purdue University Platform for simulations Project space Expanding to data management Georgia Tech e-collaboration space Models, data, sharing GitHub, Plot.ly, figshare, Authorea 20

ASM International: Structural Data Demonstration Project DSpace DOE/EERE Kinetics of Cast Mg Alloys Journals collaboration IMMI Others under discussion Workflow Tools Ontologies Schemas (XML based) NLP Semantic Media Wiki Meta data standards Coordinated by new Office of Data and Informatics Uncertainty Analysis Data Analytics Data Mining Tools Bench marking activities (DFT)

ChiMaD Data Mining 22

AFRL Data Efforts Wright State University Metals Affordability Initiative Web-based OEM-Supplier collaboration framework Data standards and protocols for OEM- Supplier data and model exchange. Uncertainty Quantification methods to capture variability in process, microstructure and property data. http://wiki.knoesis.org/index.php/materialways 23

Semantic-based Tools for Materials Data Management Develop Robust Mid-level Materials Ontology Ready for Crowd-sourcing and Initial Experimental Research Use Explore Sophisticated Low-level Ontologies Significantly Expand Size of, and Integration Across Data Stores Propose and Demonstrate Computational Approaches for Establishing Provenance Provide Access Control for Linked Materials data and Computational Functions Integrate with Manufacturing or Component Design Domains 24

Materials Data Formats 25

DoD Materials Project on SEM Data Problem: Data generated from scanning electron microscopy (SEM) have become much more complex and content rich, but there is no standard format used to facilitate sharing and machine interpretation Objectives: Better preservation and re-use of SEM data across the services Improve automatic metadata capture Define and improve data flow from research instruments to analysis

Making Data More Valuable & Accessible Supporting Reports of Research with Underlying Data http://www.immijournal.com/authors/instructions/datadescriptorarticle 27

Many Players in the Field 28

Summary A few materials data repositories have been established Development and application of commonly accepted standards and tools enabling data discovery, data quality, and reusability is needed Educational resources on good data practices are needed for the community Community-wide acceptance of the need for materials data stewardship is slow to build 29

30