Geospatial Access and Data Display Adds Value to Data Management at the Biological and Chemical Oceanographic Data Management Office

Similar documents
DATA ACCESS TUTORIAL 2013 OCB PI Summer Workshop

Putting Your Data on the Map

U.S. GEOTRACES PACIFIC TRANSECT DATA MANAGEMENT

Current JGOFS DMTT activities, and data management requirements for future marine biogeochemical projects - insights for modelers

Interoperability ~ An Introduction

What is ODV? Latest version: ver (updated Jan 9, 2017) Currently >50,000 users, ~20 new users every day!!

J1.6 MONITORING AND ANALYZING THE GLOBAL OCEAN OBSERVING SYSTEM WITH THE OBSERVING SYSTEM MONITORING CENTER

What is ODV? ver h.p://odv.awi.de/ Currently 46,300 registered users, ~20 new users every day!!

Ontology and Application for Reusable Search Interface Design

PART I: Collecting data from National Earth Observatory

Judith Connor Monterey Bay Aquarium Research Institute

Version 3 Updated: 10 March Distributed Oceanographic Match-up Service (DOMS) User Interface Design

NSF Proposals and the Two Page Data Management Plan. 14 January 2011 Cyndy Chandler

Introduction to Data Management for Ocean Science Research

Interactive comment on Data compilation on the biological response to ocean acidification: an update by Y. Yang et al.

DEVELOPMENT OF A DATA MANAGEMENT SYSTEM FOR LONG-TERM ECOLOGICAL RESEARCH

SEXTANT 1. Purpose of the Application

Field Verification of Oil Spill Fate & Transport Modeling and Linking CODAR Observation System Data with SIMAP Predictions

DATA-SHARING PLAN FOR MOORE FOUNDATION Coral resilience investigated in the field and via a sea anemone model system

Online intercomparison of models and observations using OGC and community standards

Data Objectives. The same process can be applied to determine how much simplification is appropriate when describing a geochemical system.

MATHEMATICS CONCEPTS TAUGHT IN THE SCIENCE EXPLORER, FOCUS ON EARTH SCIENCE TEXTBOOK

Global couplled ocean-atmosphere reanalysis and forecast with pelagic ecosystem

JCOMM Observing Programme Support Centre

PART I: Collecting data from National Earth Observations

Project assignment: Particle-based simulation of transport by ocean currents

Ocean Data View (ODV) Manual V1.0

DIVER and ERMA: Data and Visualization

Lecture 1a Overview of Radiometry

Exploring Ocean Data

Introduction of SeaDataNet and EMODNET Working towards a harmonised data infrastructure for marine data. Peter Thijsse MARIS

Report to Plenary on item 3.1

Worldwide Ocean Optics Database (WOOD)

User Guide. Basic Search Tips

29 February Morris Brill Fritz VanWijngaarden

SeaDataNet, Pan-European infrastructure for marine and ocean data management + EMODNET

INSPIRE Conference 2018 (18 21 September, Antwerp)

Quick Start Guide for the ABE Program

Interoperability in Science Data: Stories from the Trenches

Sharing geographic data across the GEF IW Portfolio: IW:LEARN Web-based GIS

EcoGEnIE: A practical course in global ocean ecosystem modelling

ARCTIC BASEMAPS IN GOOGLE MAPS

PROUDMAN OCEANOGRAPHIC LABORATORY CRUISE REPORT NO. 38. Inverted Echo Sounders in the Denmark Strait. As part of FS METEOR CRUISE 50/3

West Coast Observation Project. West Coast Observing System Project Brief

Collecting data types from the thermodynamic data and registering those at the DCO data portal.

INSPIRE and SPIRES Log File Analysis

Automated Data Quality Assurance for Marine Observations

Rolling Deck to Repository: Opportunities for US-EU Collaboration

Developing a Free and Open Source Software based Spatial Data Infrastructure. Jeroen Ticheler

Collecting data types from the thermodynamic data and registering those at the DCO data portal.

MANAGEMENT OF IPY 2007/08 NATIONAL DATA

Data Management Plan: NANOOS Washington Coast Moorings, Northwest Environmental Moorings Lab, University of Washington

Programming with the Semantic Web. Adam Shepherd C. Chandler, R. Arko, D. Fils, D. Kinkade, M. Jones

Towards a Canadian Integrated Ocean Observing System

European Marine Data Exchange

28 September PI: John Chip Breier, Ph.D. Applied Ocean Physics & Engineering Woods Hole Oceanographic Institution

This user guide covers select features of the desktop site. These include:

Successfully Filing Broadband Deployment Data with USAC

Site# Date H20 Temperature Conductance Turbidity KRS Sep KRS Aug KRS Aug

SERVO - ACES Abstract

OILMAP RELEASE NOTES. Version: 7.0 (July 2016) rpsgroup.com

Earth Observation Services in Collaborative Platforms

DataONE: Open Persistent Access to Earth Observational Data

A generic approach to manage metadata standards

Data discovery and access via the SeaDataNet CDI system

Appendix E. Development of methodologies for Brightness Temperature evaluation for the MetOp-SG MWI radiometer. Alberto Franzoso (CGS, Italy)

Adoption of Data Citation Outcomes by BCO-DMO


CREATING VIRTUAL SEMANTIC GRAPHS ON TOP OF BIG DATA FROM SPACE. Konstantina Bereta and Manolis Koubarakis

BODY / SYSTEM Specifically, the website consists of the following pages:

Novel System Architectures for Semantic Based Sensor Networks Integraion

DIAS_Satellite_MODIS_SurfaceReflectance dataset

ASG WHITE PAPER DATA INTELLIGENCE. ASG s Enterprise Data Intelligence Solutions: Data Lineage Diving Deeper

Benefits of CORDA platform features

Using ESML in a Semantic Web Approach for Improved Earth Science Data Usability

James W. Markham Davidson Library University of California, Santa Barbara Santa Barbara, California

OPeNDAP: Accessing HYCOM (and other data) remotely

Exxon Vuldez Oil Spill Restoration Project Annual Report

Kentucky River Watershed Watch Data Portal Workshop

Matlab stoqstoolbox for accessing in situ measurements from STOQS

Windsor Essex Environmental Metadata System (WEEMS)

Steps towards a Web Data Laboratory: data analysis for the 21 st Century

DATABASE ON SALINITY PATTERNS IN FLORIDA BAY *

Very Large Dataset Access and Manipulation: Active Data Repository (ADR) and DataCutter

Introduction to the Google Earth Engine Workshop

Global_Price_Assessments

The Study and Implementation of Extraction HY-1B Level 1B Product Image Data Based on HDF Format Shibin Liu a, Wei Liu ab, Hailong Peng c

Catalog-driven, Reproducible Workflows for Ocean Science

Global Earth Observation System of Systems (GEOSS)/ Asian Water Cycle Initiative. (AWCI) Vietnam Huong Basin data

Desarrollo de una herramienta de visualización de datos oceanográficos: Modelos y Observaciones

20 April Re: South Florida Coastal Water Quality Monitoring Network Oct. Dec Quarterly Report for SFWMD Contract #

REDUCING INFORMATION OVERLOAD IN LARGE SEISMIC DATA SETS. Jeff Hampton, Chris Young, John Merchant, Dorthe Carr and Julio Aguilar-Chang 1

Efficient, Scalable, and Provenance-Aware Management of Linked Data

Web Services USGS/EPA Collaboration

Title Vega: A Flexible Data Model for Environmental Time Series Data

Performance Tools and Holistic HPC Workflows

From data source to data view: A practical guide to uploading spatial data sets into MapX

Well Lifecycle: Workflow Automation

The Butterfly Effect. A proposal for distribution and management for butterfly data programs. Dave Waetjen SESYNC Butterfly Workshop May 10, 2012

Report from the SMHI monitoring cruise with M/V Meri

Transcription:

Geospatial Access and Data Display Adds Value to Data Management at the Biological and Chemical Oceanographic Data Management Office M. Dickson Allison 1 and Charlton Galvarino 2 1 Woods Hole Oceanographic Institution, Woods Hole, MA 2 Second Creek Consulting, LLC, Columbia, SC In 2006, the Biological and Chemical Oceanography Data Management Office (BCO- DMO) was funded to manage all data supported by the Biological and Chemical Oceanography Sections and (later) the Division of Polar Programs Organisms and Ecosystems Section of the U.S. National Science Foundation. BCO- DMO serves a wide variety of oceanographic related data, including but not limited to biological, chemical and physical oceanography measurements. We routinely deal with CTD data, biological abundance, meteorological data, nutrients, ph, carbonate chemistry, PAR, sea surface temperature, heat and momentum flux, sediment composition, trace metals, primary production, and pigment concentration measurements, and with images and movies. In addition to a text- based interface, we offer an OGC- compliant geospatial interface (MapServer 1 ) to enable the user to find the data, evaluate its fitness- for- purpose, and download and re- use the data for his or her own analyses. This task speaks to new data but since the data management office of BCO- DMO was formed in 2006 from the merging of the U.S. JGOFS (1984-2003) and the U.S. GLOBEC (1994-2006) Data Management Offices, there were thousands of legacy datasets to be incorporated into what became the BCO- DMO data management system. Because BCO- DMO is equally devoted to bringing dark data into the light, legacy datasets are still being welcomed. BCO- DMO works directly with Principal Investigators (PIs) to make their data available on the web. There really are no data format restrictions. (This has serious consequences for geospatial access and will be discussed further below.) There are some best practices 2 PIs are encouraged to follow, and which provide their own reward in terms of speed of data availability. BCO- DMO manages data that are heterogeneous in the extreme, and therefore one method of geospatial display does not work for all. Because the data are not always the same kind, format, scale, or magnitude, each quick look at the data using the map must be individually tailored. Figures 1 5 illustrate this point. 1

Figure 1: CARIACO project chemistry time- series data at the site indicated on the map. Individual plots illustrate changes in Nitrate concentration by season. Figure 3. Drifter tracks illustrating particle movement following the Fukushima Earthquake at three different data granularities. Figure 2. Dissolved Organic Matter changes with latitude in the Gulf of Mexico Figure 4. Fluorometry and its relationship to Sea Surface Temperature from two different data sources. 2

Figure 5. Polar projection to illustrate algal community structure in the Ross Sea The challenges of providing different displays for different types of data are many. Each dataset must be described for the mapper. This is done via specific mapping parameters assigned by the data managers and added to the metadata associated with each dataset and stored in a MySQL database. See Figure 6 for a representation of the Metadata Database Schema. (Since we converted from Cold Fusion to Drupal in November 2013, the MySQL database now contains a collection of Drupal tables as the source of the RDF triples. The RDF triples are used in our Beta version of an Advanced Semantically- enabled Faceted Search, a part of which is illustrated in Figure 7.) Essential mapping information like deployment latitude and longitude are in both the metadata and the data themselves (accessed directly) and indeed could be considered both data and metadata (perhaps along with water depth and time). Once we have collected information about the data and entered the metadata into the system we add additional information about the mapping parameters needed for the mapping and display software. Figure 6. MySQL Database Schema emphasizing mapping instruction tables. 3

Figure 7. An example of a portion of the Advanced Search (Beta) display showing a semantically- driven search of programs and projects. The requirements for mapping can then be broken up into two steps: (1) showing the location of the dataset and then (2) displaying the dataset itself using an appropriate graph or plotting format. 1. When a user visits the Mapper, the initial visualization is of many deployments. A deployment is one of: a ship s position during a scientific cruise or cruise track ; a fixed sampling location as from a moored buoy; a position of the laboratory where the results were created; or a satellite coverage bounding box. In order to provide this initial visualization, each deployment must report geographic and temporal information. In the case of a cruise track, a series of [longitude, latitude, and time] triples are reported along with a flag indicating this to be a LINESTRING dataset. This metadata database- reported position information is grouped together into one row in a Shapefile and is accessed by the MapServer software to provide the deployment visualization. 2. Now that the deployments have been visualized, a user will likely interrogate them to explore the datasets collected en route (in the case of a cruise track). From the Mapper s perspective, it is only valuable to show one point at any given location. Having overlapping points for a single dataset is meaningless to a user. 4

a. The Mapper has a finite set of ways it may visualize the data and this is driven by the metadata. Is this dataset a simple point in time and space? Is this dataset a collection of similar data collected at the same point over a period of time? Is this dataset a collection of datasets that vary over time and depth and space? Is granularity important? The answers to these questions determine such things as whether or not a table of data is displayed, a graph is displayed, or a KML file is generated. In the case of KML output, the user is invited to explore the KML data in Google Earth to plunge under the sea to examine specific data points across a wide geographic footprint. b. The point at which data are displayed as a table or presented as a graph marks a departure from the Drupal metadata database. It is at this point that underlying data sources are accessed directly. Since there are no data submission format restrictions (as mentioned earlier in this document), reading the data directly is challenging. For example, determining the time at which a particular row of data was collected varies from dataset to dataset: in one dataset the time and date may be broken into 6 columns while in another, the time is a UTC string in a single column. The next challenge is how do we communicate with the user how to make the quick look displays that we hope will add value to the data? We address the challenge of navigating the mapping options by providing aids in the form of descriptions, tips, and help boxes in as many places as possible, including: Map Option pull- downs, directions in the panel banners, help modules, and consistent use of right click options, among others. However, it has become obvious that the more creativity and software flexibility that is provided to deal with a growing list of data types, the more complex the user experience becomes. We are always asking: Are we giving the user what they really want? Are we giving them all the data about radioactivity levels reaching the West Coast of the US or Deep Water Horizon oil spill hydrocarbons affecting the benthos in the Gulf of Mexico or only what BCO- DMO can make available via area selection ( rubber banding ), searching and downloading? What about Monterey Bay Aquarium Research Institute (MBARI) data or University of Washington data? Can we link to those data and use our own display methodology? And even if we could, would it be at the same scale or precision or quality? There are many reasons for linking geospatial data from multiple sources but there are cautions as well. References: 1 http://www.mapserver.org. Originally developed at the University of Minnesota. 2 BCO- DMO Data Management Best Practices Guide: http://www.bco- dmo.org/data- management- best- practices- guide- 0 5