Data Symposium 2012 SeWHIP & CTSI John W. Cobb, Ph.D. Milwaukee, WI March 1, 2012


1 : Some Lessons Learned Data Symposium 2012 SeWHIP & CTSI John W. Cobb, Ph.D. Milwaukee, WI March 1, 2012

2 Acknowledgement and collaborators DataONE Cal Dig. Lib. Cornell Lab of Ornithology TeraGrid (now XSEDE) National Science Foundation 2 Managed by UT-Battelle

3 Outline: Distinctions between cyberinfrastructure, information technology, computer science, and computational science (a personal view). Requirements analysis for an interoperable long-term data archive and curation service (a datanet), with a focus on ecological, biological, and environmental sciences. Description of DataONE datanet services and related services as an exemplar (bulk of talk).

4 Outline: Distinctions between cyberinfrastructure, information technology, computer science, and computational science (a personal view). Requirements analysis for an interoperable long-term data archive and curation service (a datanet), with a focus on ecological, biological, and environmental sciences. Description of DataONE datanet services and related services as an exemplar (bulk of talk).

5 Data-driven science: A supercomputer is just one more source of petabytes of data. Rorschach test: What is this? A computing center or a data center? (Images courtesy of the NCCS, ORNL.)

6 A (personal) taxonomy of computing research activities. Computer Science (CS): algorithm development; computing architecture development; language research. Cyberinfrastructure (CI): developing methods for accessing computing services; virtual organization research. Computational Science (Computational Sci): PDE solver algorithms; finite math representations of continuous PDEs; numerical linear algebra. Information Technology (IT): research about IT operations; provisioning; production networking (internal and external connections); cybersecurity. In addition to research efforts, IT, CI, and computational science often have coupled operational tasks.

7 Contrasts: CI and IT. CI: enables research initiatives; expands the top line; seeks new services; externally focused; project partner. IT: enables enterprise integration; optimizes the bottom line; optimizes current services; internally focused; project component. Pragmatically, however, university CIOs are often charged with both CI and IT responsibilities and also act as partners in computational science provisioning. Computer science is usually not under the CIO function. Library science is also separate, although academic library operations occasionally fall under the CIO function as well. Some CI and IT services can look quite similar (system administration, network security, ...).

8 My focus today: CI. "Like the physical infrastructure of roads, bridges, power grids, telephone lines, and water systems that support modern society, cyberinfrastructure refers to the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor." (Atkins Report, NSF Blue-Ribbon Advisory Panel on Cyberinfrastructure 2003)

9 Outline: Distinctions between cyberinfrastructure, information technology, computer science, and computational science (a personal view). Requirements analysis for an interoperable long-term data archive and curation service (a datanet), with a focus on ecological, biological, and environmental sciences. Description of DataONE datanet services and related services as an exemplar (bulk of talk).

10 What is needed to build a robust set of data services? Archiving; management; metadata regularization; curation; retention; practitioner education; socio-cultural barriers.

11 Poor data practice: data entropy. [Figure: information content versus time, starting at the time of publication; curve labels: specific details, general details, accident, retirement or career change, death (Michener et al. 1997).] In what sense is modern science reproducible? cf. Brian Athey and Clifford Lynch this morning.

12 Data management plans now required. NSF Data Management Plan Requirements: Beginning January 18, 2011, proposals submitted to NSF must include a supplementary document of no more than two pages labeled "Data Management Plan" (DMP). This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results. Proposals that do not include a DMP will not be able to be submitted. Other agencies have instituted or are instituting similar requirements.

13 Data deluge and interoperability: the flood of increasingly heterogeneous data. Data are heterogeneous in syntax (format), schema (model), and semantics (meaning). Handling this by hand is time-consuming and brittle. (Jones et al.)
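The three kinds of heterogeneity named on this slide can be illustrated with a minimal Python sketch. Everything in it is hypothetical and for illustration only: the two sources, their field names, and the target record layout are assumptions, not anything taken from DataONE or from the talk. Source A differs from source B in syntax (CSV versus JSON), schema (column names), and semantics (Fahrenheit versus Celsius).

```python
import csv
import io
import json

# Hypothetical source A: CSV text, temperature recorded in degrees Fahrenheit.
SOURCE_A = "site,date,temp_f\nSITE-1,2002-08-08,68.0\n"

# Hypothetical source B: JSON text, temperature recorded in degrees Celsius.
SOURCE_B = '[{"station": "SITE-2", "observed": "2002-08-08", "water_temp_c": 21.5}]'


def from_source_a(text):
    """Parse CSV (syntax), rename columns (schema), convert F to C (semantics)."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {
            "site": row["site"],
            "date": row["date"],
            "temperature_c": round((float(row["temp_f"]) - 32.0) * 5.0 / 9.0, 2),
        }
        for row in rows
    ]


def from_source_b(text):
    """Parse JSON (syntax) and rename fields (schema); units already match."""
    return [
        {
            "site": rec["station"],
            "date": rec["observed"],
            "temperature_c": float(rec["water_temp_c"]),
        }
        for rec in json.loads(text)
    ]


def integrate():
    """Return records from both sources in one common (assumed) layout."""
    return from_source_a(SOURCE_A) + from_source_b(SOURCE_B)


if __name__ == "__main__":
    for record in integrate():
        print(record)
```

Each new source needs its own hand-written adapter in this style, which is one way to see why the slide calls the by-hand approach brittle.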

14 Baseline assessment: scientists (2010). Demographics. Discipline: ecology 18%; environmental sciences 18%; social sciences 16%; biology 14%; physical sciences 12%; computer science/engineering 9%; other 7%; atmospheric science 4%; medicine 2%. Work sector: academic 80%; government 13%; non-profit 3%; commercial 2%; other 2%. Sample sizes shown on the charts: n=1317 and n=1315. Source: Tenopir, C, Allard S, Douglass K, Aydinoglu AU, Wu L, Read E, Manoff M, Frame M. Data Sharing by Scientists: Practices and Perceptions. PLoS ONE. 6(6).

15 What standard do you currently use? [Chart labels: DIF, DwC, DC, EML, FGDC, Open GIS, Metadata language, ISO, My Lab, none]

16 Many are interested in sharing data (percent agree). Willing to share data across a broad group of researchers: 81%. Willing to place at least some of my data into a central data repository with no restrictions: 78%. Appropriate to create new datasets from shared data: 76%. Willing to place all of my data into a central data repository with no restrictions: 41%.

17 Needs of scientists: the data lifecycle. Lifecycle stages: Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze. Questions scientists ask: Will I get credit for my work? What is a data management plan? What tools do I use? What is metadata? Who can help me? How much will it cost? Where do I preserve my data? How do I preserve my data?

18 eBird pilot project: exploration and visualization. Diverse bird observations and environmental data from 300,00 locations in the US integrated and analyzed using High Performance Computing Resources. Inputs: land cover, meteorology, MODIS remote sensing data. Model results: occurrence of Indigo Bunting (2008), shown for Jan, Apr, Jun, Sep, Dec. Spatio-Temporal Exploratory Model identifies factors affecting patterns of migration. Examine patterns of migration. Infer how climate change may affect bird migration.

19 Secretary Salazar on Birds (May 3, 2011): "The State of the Birds report is a measurable indicator of how well we are fulfilling our shared role as stewards of our nation's public lands and waters." [Figure: Acadian Flycatcher distribution, ebird.org]

20 Multiple Scales: Building the Knowledge Pyramid. [Figure: pyramid with levels: intensive science sites and experiments; extensive science sites; volunteer & education networks; remote sensing. Axis labels: decreasing spatial coverage, increasing process knowledge; 90:10 → 10:90. Adapted from CENR-OSTP.]

21 Tracing requirements: multiple scales; interoperable across repositories; cross-organizational (VOs); multiple identities; data heterogeneity; manage disparate rights policies; support all phases of the data life cycle; include education and outreach to change community practices.

22 Outline: Distinctions between cyberinfrastructure, information technology, computer science, and computational science (a personal view). Requirements analysis for an interoperable long-term data archive and curation service (a datanet), with a focus on ecological, biological, and environmental sciences. Description of DataONE datanet services and related services as an exemplar (bulk of talk).

23 DataONE Movie (with Sound)

24 DataONE is Cyberinfrastructure. Three major components form a flexible, scalable, sustainable network. Member Nodes: diverse institutions; serve local community; provide resources for managing their data; retain copies of data. Coordinating Nodes: retain complete metadata catalog; indexing for search; network-wide services; ensure content availability (preservation); replication services. Investigator Toolkit. Source: DataONE/Michener
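The division of labor between Member Nodes and Coordinating Nodes can be pictured with a small toy model in Python. This is only a sketch under stated assumptions: the class names, method names (`deposit`, `harvest`, `search`), and the title-keyword search are invented for illustration and are not the DataONE service API.

```python
from dataclasses import dataclass, field


@dataclass
class MemberNode:
    """Toy member node: holds data objects together with their metadata."""
    name: str
    # identifier -> (metadata dict, data bytes)
    holdings: dict = field(default_factory=dict)

    def deposit(self, identifier, metadata, data):
        self.holdings[identifier] = (metadata, data)


@dataclass
class CoordinatingNode:
    """Toy coordinating node: keeps a metadata catalog, not the data itself."""
    # identifier -> (member node name, metadata dict)
    catalog: dict = field(default_factory=dict)

    def harvest(self, node):
        """Copy metadata (only) from a member node into the catalog."""
        for identifier, (metadata, _data) in node.holdings.items():
            self.catalog[identifier] = (node.name, metadata)

    def search(self, keyword):
        """Return identifiers whose title contains the keyword, sorted."""
        keyword = keyword.lower()
        return sorted(
            identifier
            for identifier, (_node, metadata) in self.catalog.items()
            if keyword in metadata.get("title", "").lower()
        )


if __name__ == "__main__":
    mn_a = MemberNode("mn-a")
    mn_a.deposit("doi:example/1", {"title": "Bird counts 2008"}, b"...")
    mn_b = MemberNode("mn-b")
    mn_b.deposit("doi:example/2", {"title": "Soil carbon survey"}, b"...")

    cn = CoordinatingNode()
    cn.harvest(mn_a)
    cn.harvest(mn_b)
    print(cn.search("bird"))
```

The point of the design is that search runs against one catalog while the data stay with the institutions that curate them.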

25 Examples of data holdings: Metadata Interoperability Across Data Holdings. Types of data managed, by data archive: biodiversity, taxonomic, ecological; biogeochemical dynamics, terrestrial ecological; Earth observation imagery; ecological, biodiversity, biophysical, social, genomics, and taxonomic; avian populations and molecular biology; biological and taxonomic; biophysical, biodiversity, disturbance, and Earth observation imagery; biodiversity, biotic structure, function/process, biogeochemical, climate, and hydrologic. Metadata standard(s): BDP, DwC, DC, OGIS; DIF, BDP, ECHO; EML; DwC; DC subset; EML; EML. Key: BDP = Biological Data Profile; DC = Dublin Core; DC subset = Dublin Core subset; DwC = Darwin Core; DIF = Directory Interchange Format; ECHO = EOS ClearingHOuse; EML = Ecological Metadata Language; OGIS = OpenGIS.

26 Initial member nodes
ORNL-DAAC. Community: agency repository. Data: ecology and biogeochemical dynamics. Size: 900 data products, ~1 TB. Services: tools for data preservation, replication, discovery, access, subsetting, and visualization. Metadata stds.: FGDC subset. Degree of curation: high. Data submission: agency-approved, staff-assisted submission and curation of final data product. Sponsor: NASA.
Dryad. Community: journal consortium. Data: biosciences. Size: ~1,000 data products, ~3 GB. Services: tools for data preservation, replication, discovery and access. Metadata stds.: Dublin Core application profile. Degree of curation: medium. Data submission: web-based data submission at time of journal article submission. Sponsor: NSF/JISC, societies, publishers.
KNB. Community: research network. Data: biodiversity, ecology, environment. Size: 20,000 data products, 100s GBs. Services: tools for data preservation, replication, discovery, access, management, and visualization. Metadata stds.: EML, FGDC. Degree of curation: low. Data submission: self-submission via desktop tool at any time. Sponsor: NSF.

27 Preserve Data and Metadata. Metadata copied to Coordinating Nodes; mirrored between CNs. Data replicated between Member Nodes; CNs manage copies. Checksums recorded. Promote quality metadata.
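Recording a checksum at deposit time is what lets replicas be audited later. The Python sketch below shows the idea under stated assumptions: SHA-256 is chosen here only for illustration (the slide does not say which algorithm is used), and the replica names and the `verify_replicas` helper are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of a data object (algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def verify_replicas(recorded: str, replicas: dict) -> list:
    """Return the sorted names of replicas whose bytes no longer match the recorded checksum.

    `replicas` maps a (hypothetical) member node name to the bytes it holds.
    """
    return sorted(
        name for name, data in replicas.items() if checksum(data) != recorded
    )


if __name__ == "__main__":
    original = b"site,date,count\nA,2008-06-01,12\n"
    recorded = checksum(original)  # stored alongside the metadata at deposit time

    replicas = {
        "mn-a": original,
        "mn-b": original,
        "mn-c": original[:-1],  # simulated corruption: one byte lost
    }
    print(verify_replicas(recorded, replicas))
```

A replica flagged this way could then be refreshed from a copy that still verifies, which is the practical reason for keeping more than one.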


More information

Big Data infrastructure and tools in libraries

Big Data infrastructure and tools in libraries Line Pouchard, PhD Purdue University Libraries Research Data Group Big Data infrastructure and tools in libraries 08/10/2016 DATA IN LIBRARIES: THE BIG PICTURE IFLA/ UNIVERSITY OF CHICAGO BIG DATA: A VERY

More information

DEVELOPING, ENABLING, AND SUPPORTING DATA AND REPOSITORY CERTIFICATION

DEVELOPING, ENABLING, AND SUPPORTING DATA AND REPOSITORY CERTIFICATION DEVELOPING, ENABLING, AND SUPPORTING DATA AND REPOSITORY CERTIFICATION Plato Smith, Ph.D., Data Management Librarian DataONE Member Node Special Topics Discussion June 8, 2017, 2pm - 2:30 pm ASSESSING

More information

Putting the Archives to Work: Workflow and Metadata-driven Analysis in LTER Science

Putting the Archives to Work: Workflow and Metadata-driven Analysis in LTER Science Putting the Archives to Work: Workflow and Metadata-driven Analysis in LTER Science Wade Sheldon Georgia Coastal Ecosystems LTER University of Georgia Acknowledgements: John Porter (Virginia Coast Reserve

More information

BIG DATA CHALLENGES A NOAA PERSPECTIVE

BIG DATA CHALLENGES A NOAA PERSPECTIVE BIG DATA CHALLENGES A NOAA PERSPECTIVE Dr. Edward J. Kearns NASA Examiner, Science and Space Branch, OMB/EOP and Chief (acting), Remote Sensing and Applications Division National Climatic Data Center National

More information

EUDAT. Towards a pan-european Collaborative Data Infrastructure

EUDAT. Towards a pan-european Collaborative Data Infrastructure EUDAT Towards a pan-european Collaborative Data Infrastructure Damien Lecarpentier CSC-IT Center for Science, Finland CESSDA workshop Tampere, 5 October 2012 EUDAT Towards a pan-european Collaborative

More information

Advancing the fourth paradigm of research: Assimilating repositories into active research phases

Advancing the fourth paradigm of research: Assimilating repositories into active research phases Title Here Advancing the fourth paradigm of research: Assimilating repositories into active research phases Tyler Walters Dean, University Libraries, Virginia Tech SPARC Conference, Kansas City, March

More information

The GISandbox: A Science Gateway For Geospatial Computing. Davide Del Vento, Eric Shook, Andrea Zonca

The GISandbox: A Science Gateway For Geospatial Computing. Davide Del Vento, Eric Shook, Andrea Zonca The GISandbox: A Science Gateway For Geospatial Computing Davide Del Vento, Eric Shook, Andrea Zonca 1 Paleoscape Model and Human Origins Simulate Climate and Vegetation during the Last Glacial Maximum

More information

Developing a Research Data Policy

Developing a Research Data Policy Developing a Research Data Policy Core Elements of the Content of a Research Data Management Policy This document may be useful for defining research data, explaining what RDM is, illustrating workflows,

More information

The Logical Data Store

The Logical Data Store Tenth ECMWF Workshop on Meteorological Operational Systems 14-18 November 2005, Reading The Logical Data Store Bruce Wright, John Ward & Malcolm Field Crown copyright 2005 Page 1 Contents The presentation

More information

DataONE. Promoting Data Stewardship Through Best Practices

DataONE. Promoting Data Stewardship Through Best Practices DataONE Promoting Data Stewardship Through Best Practices Carly Strasser 1,2, Robert Cook 1,3, William Michener 1,4, Amber Budden 1,4, Rebecca Koskela 1,4 1 DataONE 2 University of California Santa Barbara

More information

Midwest Big Data Hub Accelerating the Big Data Innovation Ecosystem

Midwest Big Data Hub Accelerating the Big Data Innovation Ecosystem Ed Seidel PI (Illinois) Beth Plale Co-PI (Indiana) Sarah Nusser Co-PI (Iowa State) Brian Athey Co-PI (Michigan) Josh Riedy Co-PI, (UND) Melissa Cragin ED (Illinois) SEEDCorn: Sustainable Enabling Environment

More information

Medici for Digital Cultural Heritage Libraries. George Tsouloupas, PhD The LinkSCEEM Project

Medici for Digital Cultural Heritage Libraries. George Tsouloupas, PhD The LinkSCEEM Project Medici for Digital Cultural Heritage Libraries George Tsouloupas, PhD The LinkSCEEM Project Overview of Digital Libraries A Digital Library: "An informal definition of a digital library is a managed collection

More information

Computing Accreditation Commission Version 2.0 CRITERIA FOR ACCREDITING COMPUTING PROGRAMS

Computing Accreditation Commission Version 2.0 CRITERIA FOR ACCREDITING COMPUTING PROGRAMS Computing Accreditation Commission Version 2.0 CRITERIA FOR ACCREDITING COMPUTING PROGRAMS Optional for Reviews During the 2018-2019 Accreditation Cycle Mandatory for Reviews During the 2019-2020 Accreditation

More information

LASDA: an archiving system for managing and sharing large scientific data

LASDA: an archiving system for managing and sharing large scientific data LASDA: an archiving system for managing and sharing large scientific data JEONGHOON LEE Korea Institute of Science and Technology Information Scientific Data Strategy Lab. 245 Daehak-ro, Yuseong-gu, Daejeon

More information

Does Research ICT KALRO? Transforming education using ICT

Does Research ICT KALRO? Transforming education using ICT Does Research ICT Matter @ KALRO? What is Our Agenda The Status of Research Productivity and Collaboration of KE Research Institutions Is the research productivity of KARLO visible to the world? Discovery

More information

Response to Industry Canada Consultation Developing a Digital Research Infrastructure Strategy

Response to Industry Canada Consultation Developing a Digital Research Infrastructure Strategy Response to Industry Canada Consultation Developing a Digital Research Infrastructure Strategy September 14, 2015 1 Introduction Thank you for the opportunity to provide input into Industry Canada s consultation

More information

Implementing a Data Quality Strategy to simplify access to data

Implementing a Data Quality Strategy to simplify access to data IN43D-07 AGU Fall Meeting 2016 Implementing a Quality Strategy to simplify access to data Kelsey Druken, Claire Trenham, Ben Evans, Clare Richards, Jingbo Wang, & Lesley Wyborn National Computational Infrastructure,

More information

The Science and Technology Roadmap to Support the Implementation of the Sendai Framework for Disaster Risk Reduction

The Science and Technology Roadmap to Support the Implementation of the Sendai Framework for Disaster Risk Reduction 29 February 2016 The Science and Technology Roadmap to Support the Implementation of the Sendai Framework for Disaster Risk Reduction 2015-2030 The Sendai Framework for Disaster Risk Reduction 2015-2030

More information

The Changing Role of Data Stewardship in Creating Trustworthy, Transdisciplinary High Performance Data Platforms for the Future

The Changing Role of Data Stewardship in Creating Trustworthy, Transdisciplinary High Performance Data Platforms for the Future AGU Fall Meeting 2016 IN31-G The Changing Role of Data Stewardship in Creating Trustworthy, Transdisciplinary High Performance Data Platforms for the Future Clare Richards, Ben Evans, Lesley Wyborn, Jingbo

More information

28 September PI: John Chip Breier, Ph.D. Applied Ocean Physics & Engineering Woods Hole Oceanographic Institution

28 September PI: John Chip Breier, Ph.D. Applied Ocean Physics & Engineering Woods Hole Oceanographic Institution Developing a Particulate Sampling and In Situ Preservation System for High Spatial and Temporal Resolution Studies of Microbial and Biogeochemical Processes 28 September 2010 PI: John Chip Breier, Ph.D.

More information

Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation

Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation INSPIRE 2010, KRAKOW Dr. Arif Shaon, Dr. Andrew Woolf (e-science, Science and Technology Facilities Council, UK) 3

More information

Data Curation Profile Cornell University, Biophysics

Data Curation Profile Cornell University, Biophysics Data Curation Profile Cornell University, Biophysics Profile Author Dianne Dietrich Author s Institution Cornell University Contact dd388@cornell.edu Researcher(s) Interviewed Withheld Researcher s Institution

More information

The Data Curation Profiles Toolkit: Interview Worksheet

The Data Curation Profiles Toolkit: Interview Worksheet Purdue University Purdue e-pubs Data Curation Profiles Toolkit 11-29-2010 The Data Curation Profiles Toolkit: Interview Worksheet Jake Carlson Purdue University, jakecar@umich.edu Follow this and additional

More information

Data and information sharing WMO global systems

Data and information sharing WMO global systems Data and information sharing WMO global systems Tommaso Abrate Scientific Officer World Meteorological Organization E-mail: tabrate@wmo.int 13 March, 2012 World Hydrological Cycle Observing System (WHYCOS)

More information

Digital The Harold B. Lee Library

Digital The Harold B. Lee Library Digital Preservation @ The Harold B. Lee Library CIMA 23 May 2013 How we got here? 1. Understanding Digital Preservation 2. Search for Content 3. Maintain Optical Disc Storage 4. In House Preservation

More information

Federal STI Managers Group Presentation to Board on Research Data and Information Ellen Herbst Director, NTIS CENDI Chair

Federal STI Managers Group Presentation to Board on Research Data and Information Ellen Herbst Director, NTIS CENDI Chair CENDI Federal STI Managers Group Presentation to Board on Research Data and Information Ellen Herbst Director, NTIS 2008-10 CENDI Chair January 29, 2009 1 What is CENDI? Interagency group of senior federal

More information

The NIH Big Data to Knowledge Initiative: Raising the Prominence of Data

The NIH Big Data to Knowledge Initiative: Raising the Prominence of Data The NIH Big Data to Knowledge Initiative: Raising the Prominence of Data Michael F. Huerta, Ph.D. Associate Director, National Library of Medicine Director, Office of Health Information Programs Development

More information

The Research Data Alliance Creating the culture and technology for an international data infrastructure

The Research Data Alliance Creating the culture and technology for an international data infrastructure The Research Data Alliance Creating the culture and technology for an international data infrastructure Mark A. Parsons Managing Director, RDA/United States Rensselaer Polytechnic Institute!! AGU Town

More information

Sustainable Governance for Long-Term Stewardship of Earth Science Data

Sustainable Governance for Long-Term Stewardship of Earth Science Data Sustainable Governance for Long-Term Stewardship of Earth Science Data Robert R. Downs and Robert S. Chen NASA Socioeconomic Data and Applications Center (SEDAC) Center for International Earth Science

More information

FAIR-aligned Scientific Repositories: Essential Infrastructure for Open and FAIR Data

FAIR-aligned Scientific Repositories: Essential Infrastructure for Open and FAIR Data FAIR-aligned Scientific Repositories: Essential Infrastructure for Open and FAIR Data GeoDaRRs: What is the existing landscape and what gaps exist in that landscape for data producers and users? 7 August

More information

IMPLEMENTING THE WASCAL DATA INFRASTRUCTURE (WADI)

IMPLEMENTING THE WASCAL DATA INFRASTRUCTURE (WADI) IMPLEMENTING THE WASCAL DATA INFRASTRUCTURE (WADI) Ralf Kunkel, Antonio Rogmann, Jürgen Sorg, Huaping Wang Helmholtz Open Science Webinare zu Forschungsdaten, 2015-03- 11 What is WASCAL? West African Science

More information

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan Metadata for Data Discovery: The NERC Data Catalogue Service Steve Donegan Introduction NERC, Science and Data Centres NERC Discovery Metadata The Data Catalogue Service NERC Data Services Case study:

More information

The Common Framework for Earth Observation Data. US Group on Earth Observations Data Management Working Group

The Common Framework for Earth Observation Data. US Group on Earth Observations Data Management Working Group The Common Framework for Earth Observation Data US Group on Earth Observations Data Management Working Group Agenda USGEO and BEDI background Concise summary of recommended CFEOD standards today Full document

More information

EOSC Services & Architecture: the EOSC-hub approach Tiziana Ferrari, Project Coordinator, EGI Founda?on

EOSC Services & Architecture: the EOSC-hub approach Tiziana Ferrari, Project Coordinator, EGI Founda?on EOSC Services & Architecture: the EOSC-hub approach Tiziana Ferrari, Project Coordinator, EGI Founda?on eosc-hub.eu @EOSC_eu EOSC-hub receives funding from the European Union s Horizon 2020 research and

More information

Big Data, Big Compute, Big Interac3on Machines for Future Biology. Rick Stevens. Argonne Na3onal Laboratory The University of Chicago

Big Data, Big Compute, Big Interac3on Machines for Future Biology. Rick Stevens. Argonne Na3onal Laboratory The University of Chicago Assembly Annota3on Modeling Design Big Data, Big Compute, Big Interac3on Machines for Future Biology Rick Stevens stevens@anl.gov Argonne Na3onal Laboratory The University of Chicago There are no solved

More information

Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment

Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment Paul Watry Univ. of Liverpool, NaCTeM pwatry@liverpool.ac.uk Ray Larson Univ. of California, Berkeley

More information

7 th International Digital Curation Conference December 2011

7 th International Digital Curation Conference December 2011 Golden Trail 1 Golden-Trail: Retrieving the Data History that Matters from a Comprehensive Provenance Repository Practice Paper Paolo Missier, Newcastle University, UK Bertram Ludäscher, Saumen Dey, Michael

More information

e-infrastructures in FP7 INFO DAY - Paris

e-infrastructures in FP7 INFO DAY - Paris e-infrastructures in FP7 INFO DAY - Paris Carlos Morais Pires European Commission DG INFSO GÉANT & e-infrastructure Unit 1 Global challenges with high societal impact Big Science and the role of empowered

More information