Unidata and data-proximate analysis and visualization in the cloud

Similar documents
Director s Report. Unidata Strategic Advisory Committee Meeting. 20 May 2014 San Francisco, CO

NetCDF and HDF5. NASA Earth Science Data Systems Working Group October 20, 2010 New Orleans. Ed Hartnett, Unidata/UCAR, 2010

RAMADDA and THREDDS. Projects. Tom Yoksas, John Caron, Ethan Davis 1 Jeff McWhirter 2, Don Murray 3 Matthew Lazzara 4. Unidata Program Center/UCAR 2

The Integrated Data Viewer A web-enabled enabled tool for geoscientific analysis and visualization

Toward the Development of a Comprehensive Data & Information Management System for THORPEX

The Many Facets of THREDDS Thematic Real-time Environmental Distributed Data Services

Steve Ansari *, Stephen Del Greco, Neal Lott NOAA National Climatic Data Center, Asheville, North Carolina 2. DATA

McIDAS-V - A powerful data analysis and visualization tool for multi and hyperspectral environmental satellite data *

Python: Its Past, Present, and Future in Meteorology

The Integrated Data Viewer A Tool for Scientific Analysis and Visualization

Uniform Resource Locator Wide Area Network World Climate Research Programme Coupled Model Intercomparison

Tom Achtor, Tom Rink, Tom Whittaker. Space Science & Engineering Center (SSEC) at the University of Wisconsin - Madison

Shaping of Public Environmental Policy: User Community Impact. Samuel P. Williamson Federal Coordinator for Meteorology

BIG DATA CHALLENGES A NOAA PERSPECTIVE

Pangeo. A community-driven effort for Big Data geoscience

The GISandbox: A Science Gateway For Geospatial Computing. Davide Del Vento, Eric Shook, Andrea Zonca

Adding Cloud Based Interactive Compute Capabilities to Globus Endpoints

CONDUIT and NAWIPS Migration to AWIPS II Status Updates Unidata Users Committee. NCEP Central Operations April 18, 2013

Introduction to Cloudbreak

Tools for Handling Big Data and Compu5ng Demands. Humani5es and Social Science Scholars

Jetstream: Adding Cloud-based Computing to the National Cyberinfrastructure

IBM dashdb Local. Using a software-defined environment in a private cloud to enable hybrid data warehousing. Evolving the data warehouse

NetCDF-4: : Software Implementing an Enhanced Data Model for the Geosciences

ArcGIS Enterprise: An Introduction. Philip Heede

Open Software Standards for Next- Generation Community Satellite Software Packages June 2017

Amara's law : Overestimating the effects of a technology in the short run and underestimating the effects in the long run

McIDAS-V Tutorial Installation and Introduction updated September 2015 (software version 1.5)

Project Summary: Project Description:

Cisco Cloud Application Centric Infrastructure

CF-netCDF and CDM. Ethan Davis, John Caron, Ben Domenico, Stefano Nativi* UCAR Unidata Univ of Florence*

Introduction to ArcGIS Server Architecture and Services. Amr Wahba

Distributed Online Data Access and Analysis

Making data access easier with OPeNDAP. James Gallapher (OPeNDAP TM ) Duan Beckett (BoM) Kate Snow (NCI) Robert Davy (CSIRO) Adrian Burton (ARDC)

IMAGERY FOR ARCGIS. Manage and Understand Your Imagery. Credit: Image courtesy of DigitalGlobe

HARNESSING THE HYBRID CLOUD TO DRIVE GREATER BUSINESS AGILITY

OPeNDAP: Accessing HYCOM (and other data) remotely

McIDAS-V: An open source data analysis and visualization tool for multiand hyperspectral satellite data ITSC-XVI, Angra do Reis, Brazil, 7 May 2008

Update on NAWIPS/GEMPAK Migration to AWIPS II

ARCHITECTURE OF MADIS DATA PROCESSING AND DISTRIBUTION AT FSL

A ONE-STOP SERVICE HUB INTEGRATING ESSENTIAL WEATHER AND GEOPHYSICAL INFORMATION ON A GIS PLATFORM. Hong Kong Observatory

New Datasets, Functionality and Future Development. Ashwanth Srinivasan, (FSU) Steve Hankin (NOAA/PMEL) Major contributors: Jon Callahan (Mazama(

Moving to the Cloud: Making It Happen With MarkLogic

GSICS Data and Products Server User Guide

Understanding the latent value in all content

Government IT Modernization and the Adoption of Hybrid Cloud

NOAA NextGen IT/Web Services (NGITWS)

Scientific and Multidimensional Raster Support in ArcGIS

CONNECTING THE CLOUD WITH ON DEMAND INFRASTRUCTURE

Yajing (Phillis)Tang. Walt Wells

Catalog-driven, Reproducible Workflows for Ocean Science

Cumulus Services Working Group. Dan Pilone SE TIM / August 2017

Transform Your Business To An Open Hybrid Cloud Architecture. Presenter Name Title Date

Hyper-Converged Infrastructure: Providing New Opportunities for Improved Availability

The Portal Aspect of the LSST Science Platform. Gregory Dubois-Felsmann Caltech/IPAC. LSST2017 August 16, 2017

Accelerate Your Enterprise Private Cloud Initiative

IBM Bluemix compute capabilities IBM Corporation

Data near processing support for climate data analysis. Stephan Kindermann, Carsten Ehbrecht Deutsches Klimarechenzentrum (DKRZ)

Going cloud-native with Kubernetes and Pivotal

Xignite CloudStreaming overview

PUBLIC AND HYBRID CLOUD: BREAKING DOWN BARRIERS

Distributing weather data via multipoint layer-2 paths using DYNES

AWIPS Technology Infusion Darien Davis NOAA/OAR Forecast Systems Laboratory Systems Development Division April 12, 2005

McIDAS Transition Summary Tom Achtor McIDAS Users Group Meeting Madison, WI - 17 October 2007

How to use Water Data to Produce Knowledge: Data Sharing with the CUAHSI Water Data Center

How to Keep UP Through Digital Transformation with Next-Generation App Development

DEPLOY MODERN APPS WITH KUBERNETES AS A SERVICE

RealEarth. Real-time Visualization of Global Satellite Data and Derived Products

At Course Completion Prepares you as per certification requirements for AWS Developer Associate.

Addressing Geospatial Big Data Management and Distribution Challenges ERDAS APOLLO & ECW

Ocean, Atmosphere & Climate Model Assessment for Everyone

DEPLOY MODERN APPS WITH KUBERNETES AS A SERVICE

February 21, pm ET

CloudExpo November 2017 Tomer Levi

escience in the Cloud Dan Fay Director Earth, Energy and Environment

ACF Interoperability Human Services 2.0 Overview. August 2011 David Jenkins Administration for Children and Families

Microsoft certified solutions associate

Tom Brenneman. Good morning and welcome, introductions and thank you for being here.

Federated XDMoD Requirements

Matrix IT work Copyright Do not remove source or Attribution from any graphic or portion of graphic

VMWARE CLOUD FOUNDATION: INTEGRATED HYBRID CLOUD PLATFORM WHITE PAPER NOVEMBER 2017

Important DevOps Technologies (3+2+3days) for Deployment

Table of Contents. IDV User Guide. Unidata's Integrated Data Viewer...1/483 How can I get the IDV?...2/ Overview...3/483

ORIGAMI AND GIPS: RUNNING A HYPERSPECTRAL SOUNDER PROCESSING SYSTEM ON A LIGHTWEIGHT ON-DEMAND DISTRIBUTED COMPUTING FRAMEWORK

Extend GIS. The Reach. Of Your GIS. Chris Cappelli Nathan Bennett

Introduction to Cloud Computing

What s New in ArcGIS Server 10

Running MarkLogic in Containers (Both Docker and Kubernetes)

einfrastructures Concertation Event

Cloud Computing: Standardization Challenges EC/ETSI Workshop on Standards in the Cloud Emmanuel Darmois September 28, 2011

Hitachi Enterprise Cloud Container Platform

Part III: Evaluating the Business Value of the Hybrid Cloud

THREE PATHS TO CLOUD CHOICE WITHOUT COMPLEXITY. Copyright 2013 EMC Corporation. All rights reserved.

First Session of the Asia Pacific Information Superhighway Steering Committee, 1 2 November 2017, Dhaka, Bangladesh.

Backtesting in the Cloud

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success.

AMS Board on Enterprise Communications Improving Weather Forecasts November 29, Data Sources: Pulling it all together

CHARTING THE FUTURE OF SOFTWARE DEFINED NETWORKING

NetCDF and Scientific Data Durability. Russ Rew, UCAR Unidata ESIP Federation Summer Meeting

Intelligent Enterprise Digital Asset Management

Outline. The Collaborative Research Platform for Data Curation and Repositories: CKAN For ANGIS Data Portal. Open Access & Open Data.

Transcription:

Unidata and data-proximate analysis and visualization in the cloud Mohan Ramamurthy and Many Unidata Staff 1 June 2017 Modeling in the Cloud Workshop

Unidata: A program of the community, by the community, and for the community Established in 1984; Primarily funded by NSF Acquire and distribute real-time meteorological data; Develop software for accessing, managing, analyzing, visualizing geoscience data; Provide training and support to users; Negotiate data & software agreements on behalf of universities; Facilitate advancement of standards and conventions;

Niché and Mission Niché: Providing data services to advance Earth System science research and education. Reduce data friction, lower the barriers for accessing and using data, and shrink the time to science.

A Snapshot of Products & Services Data: Over 30 data streams provided in real-time Data collection, cataloging, and distribution Both push and pull technologies are used User Support & Training: Direct email support Community mailing lists (~60) Annual Training Workshops, Triennial Users Workshops, and Regional Workshops as needed. Software: Data Distribution: LDM Remote Data Access: THREDDS Data Server, ADDE, and RAMADDA Data Management: netcdf, UDUNITS, and Rosetta Analysis and Visualization: GEMPAK, McIDAS, IDV, and AWIPS II GIS support via TDS (WCS, WMS) and KML and Shapefiles Community: Community Engagement; Equipment Awards to universities; Seminars; Advocacy;

Real-time Data Distribution Model Radar Source LDM LDM Source LDM LDM LDM Satellite Source LDM LDM Internet LDM LDM About 30 different streams of real-time weather data from diverse sources are provided to ~1250 computers worldwide. Unidata s outbound traffic out of UCAR network is about 31 Terabytes/day. In fact, we move more data via Internet 2 than any other advanced application.

Remote Data Access Complements the IDD/LDM push data delivery system Available via THREDDS Data Server, RAMADDA, and ADDE data servers that support several protocols and APIs: OPeNDAP ADDE HTTP FTP WCS and WMS The Unidata Program Center operates a data server, that provides the above services. Nearly one terabyte of data are downloaded each day from our servers.

What is our motivation? Data volumes are getting to be too large to bring all of the data to your local environment. Need to keep data close to the point of origin or dissemination and provide the requisite tools and services and create a playground and workbench in the cloud. Bottom line: We need to move from bringing the data to the scientist to bringing the science to the data. We would like to exploit the elasticity and easy virtualization aspects of the cloud. For these and other reasons, Unidata made a decision to transition data services to the cloud about 4 years ago.

Goals for our Cloud work Along with providing data access, develop and provide data-proximate processing, analysis and visualization services that are portable. Provide portable, cloud-compatible software (i.e., Docker containers) that users can run on their own cloud, private or public.

Unidata Cloud Projects

Unidata Cloud Partners

AWIPS Data Servers in the Cloud Unidata is running AWIPS-EDEX data server in the Microsoft Azure cloud and exploring use in the Jetstream Cloud.

AWIPS Data Servers in the Cloud Unidata is running AWIPS-EDEX data server in the Microsoft Azure cloud and exploring use in the Jetstream Cloud. 44 universities are using Unidata s Azurehosted EDEX.

Easing the Community Burden when Deploying Software in the Cloud Deploying services to the cloud/maintaining services in the cloud can be complicated and time consuming.

Easing the Community Burden by Deploying Portable Software in the Cloud Solution: Containerization, e.g. Docker

Virtual Machines vs. Containers Docker Benefits Small Footprint Rapid and Lightweight Deployment Portability Reuse

Containerizing Applications We have created Docker container images for several Unidata applications, including the Integrated Data Viewer (IDV), THREDDS Data Server, Local Data Manager (LDM), and many Python tools. We have been deploying these applications in our own cloud instances and also making them available as downloadable software to our users. We have released a technology stack (dubbed CloudStream) to make it easy to deploy desktop software (as opposed to server software) in the cloud.

CloudIDV The CloudIDV Docker image contains the standard IDV as well as all of the technology required to run it in the cloud, accessed via browser.

CloudIDV

Remote Data Analysis & Visualization In addition to enabling cloud-hosted data access, Unidata is leveraging cloud technologies to enable data proximate analysis and visualization capabilities. Specifically, Unidata is integrating the capabilities of THREDDS Data Server and AWIPS II EDEX Server, Jupyter Notebook platform, Siphon Python data access tool, and MetPy/CartoPy/Matplotlib, IDV and GEMPAK analysis and visualization applications.

TDS+Siphon+Python Plotting Using Siphon to query the NetCDF Subset Service and plotting it to a map

NOAA Big Data Project and Unidata Cloud Activities Unidata is collaborating with Amazon Web services and Open Commons Consortium CRADA Partners.

Collaborative Activities with AWS Streaming real-time NEXRAD radar data to AWS/S3 operationally using the Unidata LDM software. We are continuing our partnership and now moving GOES-16 data. We will next start moving NCEP model output (including the National Water Model Output) to AWS. Running Docker-containerized THREDDS Data Server to serve radar data from AWS/S3. Providing JupyterHub multi-user Python environment, including plotting tools. Providing individual Docker containers. We are continuing our partnership with AWS on the NOAA Big Data Project and on other Unidata efforts, including the provision of GOES-16 data For the first time, users have seamless access to both historical and real-time WSR-88D Radar Data from the same location and interface.

Cloud Partnerships Unidata received a sizeable allocation on the NSF XSEDE Jetstream cloud. We are currently deploying an array of Unidata services in that environment.

Thank You Unidata is one of the University Corporation for Atmospheric Research (UCAR)'s Community Programs (UCP), and is funded primarily by the National Science Foundation (Grant NSF-1344155).