Ag Data Commons: Harnessing the Power of Digital Agriculture Cynthia Parr USDA ARS National Agricultural Library Live poll at: https://pollev.com/cyndyparr196
Problems with Public Ag Data Government Websites Institutions Food Safety Genomic Geospatial Repositories Scattered Industry Agronomy Diverse Economic Databases Poorly documented Modeling Social Nutrition Environmental Not interoperable Unusable Non-standard
Ag Data Commons https://data.nal.usda.gov/ Self-submission catalog and repository ecosystem USDA research agencies: ONE STOP SHOP Harvesting metadata about USDA-funded data from other repositories A sustainable home for USDA-funded data without a home The Solution
Why Ag Data Commons? Federal directives: Public access to open, machine-readable data Agricultural Data (Gather) Agricultural Knowledge (Transform) Agricultural Decision-making and Action (Translate) A platform to harness the power of Digital Agriculture.
The Concept FAIR Data Principles Data should live in community repositories Data owners create the best metadata for a central catalog Metadata bar should be high but not too high Federal Repository (I) Data Producers University Repository (K) Discovery Interface Catalog APIs Computational Tools Data Analytic Tools Industry Repository (N) Data Consumers NAL curators ensure consistent, high quality Added value Experiment Devices Farm Equipment Ag Data Commons Knowledge Base Publications Patents Grant Info.
Ag Data Commons https://data.nal.usda.gov/ As of today Over 1200 datasets 10% with local resources Nearly 300 registered users Approved for SpringerNature Scientific Data Several harvests including NCBI Bioprojects Links to and from PubAg PubAg https://pubag.nal.usda.gov
Development Landscape scanning Policy development Software customization & upgrades Operations and maintenance Curation workflow Reporting Promotion, training Pending: Storage Major costs By Takkk, CC BY-SA 3.0
Curation What we do Ensure standards-compliant, high quality metadata Add ORCID Researcher Identifiers Add Digital Object Identifiers Annotate with Controlled Vocabularies Assist with organization & linking to literature & code Live poll at: https://pollev.com/cyndyparr196 What we don't do Quality control data Host executables
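One curation step that lends itself to automation is checking that an ORCID identifier is well formed before it is attached to a record. The sketch below is an illustration only, not the Ag Data Commons workflow: it applies the ISO 7064 MOD 11-2 check character that ORCID iDs carry. The function names are hypothetical, and the sample value is the iD commonly used in examples for the fictitious researcher Josiah Carberry.

```python
import re

# Four groups of four characters; the last character may be a digit or X.
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(first_15_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in first_15_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string has ORCID layout and a matching check character."""
    if not ORCID_PATTERN.match(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


if __name__ == "__main__":
    # Sample iD used in examples for a fictitious researcher.
    print(is_valid_orcid("0000-0002-1825-0097"))
    # Same iD with the final character altered.
    print(is_valid_orcid("0000-0002-1825-0098"))
```

A check like this only catches typos and malformed strings; it does not confirm that the iD belongs to the named researcher, which still needs a lookup or a curator's judgment.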
Making it all machine readable https://data.nal.usda.gov/ Data dictionary JSON, RDF CSV, API, DB, code
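As a rough sketch of what machine-readable access can look like, the Python below parses a catalog laid out in the Project Open Data style: a top-level `dataset` list whose entries carry `title`, `identifier`, and an optional `distribution` list. The sample records, identifiers, and example.org URLs are invented for illustration, and the actual Ag Data Commons catalog may expose different fields or endpoints.

```python
import json

# Hypothetical catalog snippet in the Project Open Data style.
# Every value here is made up for illustration.
SAMPLE_CATALOG = """
{
  "dataset": [
    {
      "title": "Example soil moisture survey",
      "identifier": "example-0001",
      "distribution": [
        {"format": "CSV", "downloadURL": "https://example.org/soil.csv"},
        {"format": "JSON", "downloadURL": "https://example.org/soil-dictionary.json"}
      ]
    },
    {
      "title": "Example metadata-only record",
      "identifier": "example-0002"
    }
  ]
}
"""


def list_formats(catalog_text):
    """Map each dataset identifier to the sorted formats of its resources."""
    catalog = json.loads(catalog_text)
    result = {}
    for entry in catalog.get("dataset", []):
        formats = {d.get("format", "unknown") for d in entry.get("distribution", [])}
        result[entry["identifier"]] = sorted(formats)
    return result


if __name__ == "__main__":
    for identifier, formats in list_formats(SAMPLE_CATALOG).items():
        print(identifier, formats or "no local resources")
```

A record with an empty format list corresponds to a catalog entry that describes data hosted elsewhere, with no files stored locally.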
Harvesting metadata Photo: CC BY Tony Walmsley https://flic.kr/p/ws9nec
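Harvesting can be pictured as a repeatable merge: records pulled from another repository are mapped onto catalog fields, keyed by source and identifier so that a re-run updates entries instead of duplicating them. The sketch below assumes hypothetical record fields (`id`, `title`, `url`) and an in-memory dictionary as the catalog; it is not the harvester used by Ag Data Commons.

```python
def harvest(source_name, records, catalog):
    """Add or update catalog entries from one source.

    `records` is a list of dicts with hypothetical keys: id, title, url.
    `catalog` is a dict keyed by (source_name, record id).
    Returns a tuple (added, updated).
    """
    added = updated = 0
    for rec in records:
        key = (source_name, rec["id"])
        entry = {
            "title": rec["title"].strip(),
            "landing_page": rec["url"],
            "source": source_name,
        }
        if key not in catalog:
            added += 1
        elif catalog[key] != entry:
            updated += 1
        catalog[key] = entry
    return added, updated


if __name__ == "__main__":
    catalog = {}
    # Invented records standing in for one repository's export.
    records = [
        {"id": "A1", "title": " Wheat trial ", "url": "https://example.org/A1"},
        {"id": "A2", "title": "Maize trial", "url": "https://example.org/A2"},
    ]
    print(harvest("example-repo", records, catalog))
    # A second run with unchanged records adds and updates nothing.
    print(harvest("example-repo", records, catalog))
```

Keeping the source name in both the key and the entry means the catalog can always point back to the repository where the data actually lives.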
Current funding USDA Agricultural Research Service Big Data Initiative National Agricultural Library USDA NIFA Dairy CAP Reducing costs Distributed curation Automated reporting statistics Automating typical curation tasks Live poll at: https://pollev.com/cyndyparr196
Reducing Costs: DKAN Science Open Source Drupal-based NAL-CivicActions Collaboration General Framework User Management Roles and Permissions Static Content Content Management Interface Etc. Project Open Data metadata Dataset/resource framework Mechanism for API endpoints Data harvesting Datastore & Visualization Theme Etc. Analytics Node clone Search Scheduler Specialized permissions Virus checker (Files) Housekeeping and dev modules Etc. Research metadata (Citations, ORCIDs, Funding sources, Methods) Additional API endpoints DOI submission Data dictionary resource uploads Review workflows Etc. Drupal Core DKAN Contributed Modules Research Modules + Minimal Site-specific Customizations
What could we do with more funding? Automate metadata & cataloging Automate data quality checks Integrate with HPC platforms Provide general computational tools
Imagery, Scrubbed or Summarized Sensor Data
Future funding sources to explore Submission fees for file deposit Institutional memberships Fees for long term preservation Public-private partnerships
Ag Data Commons https://data.ars.usda.gov Resources Monthly webinars Tutorial videos Ag Data Commons Data submission guide (Look under "About") Contact NAL-ADC-curator@ars.usda.gov Cynthia.Parr@ars.usda.gov Ag Data Commons team: Susan McCarthy, Ursula Pieper, Jason Murray, Erin Antognoli, Jon Sears, Yelena Goryunova, CivicActions