How to use Water Data to Produce Knowledge: Data Sharing with the CUAHSI Water Data Center

Similar documents
A USER S GUIDE TO REGISTERING AND MAINTAINING DATA SERVICES IN HIS CENTRAL 2.0

Jeffery S. Horsburgh. Utah Water Research Laboratory Utah State University

Data Interoperability in the Hydrologic Sciences

Texas Water Data Services

Hd Hydrologic Information System

Data Archival and Dissemination Tools to Support Your Research, Management, and Education

CUAHSI HYDROLOGIC INFORMATION SYSTEM 2010 STATUS REPORT

Hydrologic Data Discovery, Download, Visualization, and Analysis: A Brief Introduction to HydroDesktop

Open Geospatial Consortium

A relational model for environmental and water resources data

The Common Framework for Earth Observation Data. US Group on Earth Observations Data Management Working Group

EarthCube and Cyberinfrastructure for the Earth Sciences: Lessons and Perspective from OpenTopography

Development of a Ground Water Data Portal for Interoperable Data Exchange and Mediation within the NGWMN

Interoperability in Science Data: Stories from the Trenches

Interoperability Between GRDC's Data Holding And The GEOSS Infrastructure

A Web Services Based Water Data Sharing Approach. Using Open Geospatial Standards. Rohit Kumar Khattar

Open-Source Data Catalog for Managing Hydrologic Data. Sarva Teja Pulla

Design and Development of an Integrated Web-based System for USGS and AHPS Data Analysis Using Open Source Cyberinfrastructure. Bryce Wayne Anderson

The MEG Metadata Schemas Registry Schemas and Ontologies: building a Semantic Infrastructure for GRIDs and digital libraries Edinburgh, 16 May 2003

DataONE: Open Persistent Access to Earth Observational Data

CUAHSI HYDROLOGIC INFORMATION SYSTEM: OVERVIEW OF VERSION 1.1

THE VOEIS HIS GATEWAY. A REST Interface for HydroServer using ODM 1.1

Web Based Access to Water Related Data Using OGC WaterML 2.0

CUAHSI HYDROLOGIC INFORMATION SYSTEM: 2009 STATUS REPORT

HYDRODESKTOP VERSION 1.4 QUICK START GUIDE

Motions from the 91st OGC Technical and Planning Committee Meetings Geneva, Switzerland Contents

DEVELOPING, ENABLING, AND SUPPORTING DATA AND REPOSITORY CERTIFICATION

Unifying Hydrological Time Series Data for a Global Water Portal

Integration of a Web Processing Service (WPS), GIS and hydraulic modelling (TELEMAC) for geophysical analysis

Wade Sheldon. Georgia Coastal Ecosystems LTER University of Georgia

CUAHSI ODM Tools Version 1.0 Design Specifications October 27, 2006

Registry Interchange Format: Collections and Services (RIF-CS) explained

The European Commission s science and knowledge service. Joint Research Centre

Paving the Rocky Road Toward Open and FAIR in the Field Sciences

2017 Resource Allocations Competition Results

HYDRODESKTOP VERSION 1.1 BETA QUICK START GUIDE

Title Vega: A Flexible Data Model for Environmental Time Series Data

User Interface Design Considerations for a Time-Space GIS

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

Water and Sediment Quality Status and Trends in the Coastal Bend Area Phase 1: Data Archiving and Publishing

Reducing Consumer Uncertainty

Levels Of Data Interoperability In The Emerging North American Groundwater Data Network

ODM STREAMING DATA LOADER

HIS document 4 Configuring web services for an observations database (version 1.0)

Emergency Response through Collaborative Crisis Management

Data Management at NIST

A Cyber-Infrastructure for a Virtual Observatory and Ecological Informatics System -VOEIS

Geospatial Enterprise Search. June

GeoDCAT-AP Representing geographic metadata by using the "DCAT application profile for data portals in Europe"

ODM 1.1. An application Hydrologic. June Prepared by: Jeffery S. Horsburgh Justin Berger Utah Water

Striving for efficiency

EUDAT. Towards a pan-european Collaborative Data Infrastructure

Next Generation Physical Access Control Systems A Smart Card Alliance Educational Institute Workshop. Scalability: Dimensions for PACS System Growth

Implementing the Army Net Centric Data Strategy in a Service Oriented Environment

Indiana University Research Technology and the Research Data Alliance

Conducting a Self-Assessment of a Long-Term Archive for Interdisciplinary Scientific Data as a Trustworthy Digital Repository

Interoperability ~ An Introduction

UNIT-II : VIRTUALIZATION & COMMON STANDARDS IN CLOUD COMPUTING

International Oceanographic Data and Information Exchange - Ocean Data Portal (IODE ODP)

Data Curation Practices at the Oak Ridge National Laboratory Distributed Active Archive Center

Brown University Libraries Technology Plan

The Modeling and Simulation Catalog for Discovery, Knowledge, and Reuse

CHAMELEON: A LARGE-SCALE, RECONFIGURABLE EXPERIMENTAL ENVIRONMENT FOR CLOUD RESEARCH

Science-as-a-Service

EARTH OBSERVATION DATA INTEROPERABILITY ARRANGEMENT WITH ONTOLOGY REGISTRY

Toward the Development of a Comprehensive Data & Information Management System for THORPEX

EUDAT & SeaDataCloud

DATA UPLOADING TUTORIAL: CLOUD HYDROSERVER

MAPR DATA GOVERNANCE WITHOUT COMPROMISE

National Materials Data Initiatives

Extend NonStop Applications with Cloud-based Services. Phil Ly, TIC Software John Russell, Canam Software

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)

Growing Variety and Volume of Remote Sensing and In Situ Data

A Framework of Information Technology for Water Resources Management

Introduction to Data Management for Ocean Science Research

Managing Hydrologic Data, Observations and Terrain Analysis

Software Architectures for Distributed Environmental Modeling

DRS Update. HL Digital Preservation Services & Library Technology Services Created 2/2017, Updated 4/2017

EUDAT- Towards a Global Collaborative Data Infrastructure

Metadata Management System (MMS)

Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment

Distributing LIXI Data as a Newscast"

INSPIRE & Environment Data in the EU

PREPARED BY CROSS-DOMAIN INTEROPERABILITY TEST BED GROUP

Effective: 12/31/17 Last Revised: 8/28/17. Responsible University Administrator: Vice Chancellor for Information Services & CIO

GEO-SPATIAL METADATA SERVICES ISRO S INITIATIVE

National Science and Technology Council. Interagency Working Group on Digital Data

MY DEWETRA IPAFLOODS REPORT

The NIH Big Data to Knowledge Initiative: Raising the Prominence of Data

Hydrologic Data Sharing Using Open Source Software and Low-Cost Electronics

COALITION ON PUBLISHING DATA IN THE EARTH AND SPACE SCIENCES: A MODEL TO ADVANCE LEADING DATA PRACTICES IN SCHOLARLY PUBLISHING. Source: NSF.

Introduction to FREE National Resources for Scientific Computing. Dana Brunson. Jeff Pummill

An Architectural Overview Of HydroShare, A Next-Generation Hydrologic Information System

Unidata and data-proximate analysis and visualization in the cloud

escience in the Cloud Dan Fay Director Earth, Energy and Environment

Data Management Tools. Lizzy Rolando, Georgia Tech Aaron Trehub, Auburn University August 6, 2013

Kaltura Platform: Ultimate Deployment Flexibility

Federal STI Managers Group Presentation to Board on Research Data and Information Ellen Herbst Director, NTIS CENDI Chair

EMC ISILON HARDWARE PLATFORM

Transcription:

How to use Water Data to Produce Knowledge: Data Sharing with the CUAHSI Water Data Center Jon Pollak The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) August 20, 2014 Lower Mekong Initiative International Workshop

Agenda and Goals Background and introduction on CUAHSI The CUAHSI Water Data Center o The Hydrologic Information System Challenges I hope you will come away with a feeling that there are existing tools that can help this region develop cyberinfrastructure for sharing data

CUAHSI Consortium of Universities for the Advancement of Hydrologic Science, Inc. Established in 2001 Over 100 members Funding primarily by the U.S. National Science Foundation Largest program is a Water Data Center (WDC) that focuses on sharing and archiving of academic research data

Mission of CUAHSI WDC Provide data services for the academic community o Data Management/Archive o Data Access To define how to achieve this mission, CUAHSI uses a community governance structure that directs staff: o Board of Directors o Standing Committee on Informatics o WDC Users Committee

Governance of CUAHSI A Board of Directors oversee all activity of the consortium Two committees provide guidance for the Water Data Center

Why Share Data? Data re-use Data validation and verification In the U.S., Data Management Plans are a requirement for those receiving grant money from the National Science Foundation o Most data must eventually become public

Who is using our services and software? U.S. National Science Foundation-funded research o Shale Network à Research Coordination Network evaluating impacts of hydraulic fracturing on water resources o Critical Zone Observatories o Global Lakes Ecological Observatory Network (GLEON) U.S. Local government and watershed associations Italian Environmental Protection Agency (ISPRA)

The Hydrologic Information System (HIS) HydroCatalog: search engine Google: search engine Metadata Harvesting Search Results Metadata Harvesting Search Results HydroServer: data server Water Data HydroDesktop: data explorer Amazon.com: Web server Water Data Internet Explorer: web browser Services- Oriented Architecture A web service is software that transmits data over the internet

Time Series Data A unique time series is defined by: Variable: What is being measured Site: Location where measurement is made Method: How the measurement is made Source: Who made the measurement Quality Control: Has the data been reviewed for errors and/or modified?

Standards Information Model Semantics WaterML

Our Information Model: The Observations Data Model (ODM) Above: Generalization of ODM1

ODM Schema Horsburgh, J. S., D. G. Tarboton, D. R. Maidment, and I. Zaslavsky (2008), A relational model for environmental and water resources data, Water Resour. Res., 44, W05406, doi:10.1029/2007wr006392.

Controlled Vocabularies

Ontology of Variables Terms that are used in keyword search. What is being measured?

WaterML XML for transmitting time series water data Becoming a global standard o Adopted by Open Geospatial Consortium (OGC) o Under consideration by the World Meteorological Organization (WMO) This is the standard that makes the HIS interoperable with other systems! Above: Screenshot of WaterML1

Cloud Data Storage The CUAHSI Water Data Center is a virtual data center We own very little infrastructure relative to the amount of data in our catalog Employing cloud technology o Scalability and reliability o Cost is calculated based on usage o We have chosen to use the Microsoft Azure Cloud

Data Access www.hydrodesktop.org

Data Publication and Archive HydroServer: Our software for publishing water data o Includes: Database, web service, data management tools We have deployed this software to the Azure Cloud

The WDC Provides Adherence to Standards Cloud Storage We are funded to deal with issues of long term data persistence Data Creation Format Data to Comply with ODM Load Data into Database Deploy Web Service and Register with Metadata Catalog Above: Workflow for publishing and archiving data with the WDC.

HIS Central Metadata database and software that facilitates search and discovery Public webpage for each data source Deployed in the Azure Cloud

A Distributed Ecosystem WDC Azure Holdings Data Access Client (e.g. HydroDesktop) API HIS Central Catalog Metadata Harvesting ODM Databases Request: WaterOneFlow Web Services Return: WaterML Data Sources Federal Agency such as NASA or USGS

Open Data Access & Open Source Development All data published with the WDC is openly available All software developed is posted online o www.github.com/cuahsi o www.hydrodesktop.org

Challenges

Data and Metadata Standardization Training researchers to follow best practices o Providing tools to ease this burden Developing and maintaining approaches to crossdomain data discovery such as controlled vocabularies

Continuous Data versus Discrete Data Discrete Continuous Larger set of metadata describing Small amount of metadata describing a single or small set of results many measurements Discrete Monitoring A sample is taken and sent to a lab for further analysis Typically a one- time event that can be repeated as needed Continuous Monitoring A sensor is used to record a continuous stream of data about 1 particular analyte or a small set of analytes (i.e. flow, dissolved oxygen, ph, etc). Values are reported at set intervals (i.e. every 15 minutes, 1 hour, etc.) Slide Credit: Dwayne Young, U.S. EPA

Data Storage and Retrieval What does one need to consider? o High reliability: the data source can t stop working. o High persistence: when you put something there, it should stay there. o High availability: the data source must handle multiple simultaneous uses. o Quick recovery: in the unlikely event of a failure, data must becom available again quickly.

Water Data Presents Unique Challenges 4000+ identifiable kinds of information (and growing). Interdisciplinarity and data reusability: data is used by several distinct disciplines. Dynamic data: much data is real-time. Data use for different purposes: o Scientific inquiry o Decision support o Education and outreach

Some Conclusions There are many existing challenges for sharing water data o Cultural o Technological There are existing (free!) tools and standards that could provide a starting point for sharing data within the Lower Mekong Region

Thank you! CUAHSI Website: www.cuahsi.org My email: jpollak@cuahsi.org CUAHSI WDC Github: www.github.com/cuahsi