NARCCAP: North American Regional Climate Change Assessment Program. Seth McGinnis, NCAR

NARCCAP: North American Regional Climate Change Assessment Program
Seth McGinnis, NCAR
mcginnis@ucar.edu

NARCCAP: North American Regional Climate Change Assessment Program
Nest high-resolution regional climate models (RCMs) inside coarser global models (GCMs) over North America

Experimental Design

            NCEP      GFDL  CGCM3  HADCM3  CCSM
            25 years  Two 30-year runs, current & future
CRCM        X         --    X      --      X
ECP2        X         X     --     X       --
HRM3        X         X     --     X       --
MM5I        X         --    --     X       X
RCM3        X         X     X      --      --
WRFG        X         --    X      --      X
Timeslices            X     --     --      X

6 RCMs x 4 GCMs + NCEP and Timeslices = 34 runs total

Data Publication Pipeline
Transfer
Backup
QC
Format
Ancillary data
Metadata
Data validity
Correct errors
Archive
Publish to portal
(Update / Recall)
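The pipeline above can be read as an ordered sequence of stages. The following is a minimal Python sketch under stated assumptions: the stage names are taken from the slide, but treating them as one flat, strictly sequential list is a guess, and `Dataset`, `run_pipeline`, and the `handlers` mapping are hypothetical names, not part of any NARCCAP tooling. The `start` argument illustrates how an "Update / Recall" might re-enter the pipeline partway through.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Stage names in the order they appear on the slide.
# Assumption: they run one after another as a flat list.
STAGES: List[str] = [
    "Transfer", "Backup", "QC", "Format", "Ancillary data", "Metadata",
    "Data validity", "Correct errors", "Archive", "Publish to portal",
]


@dataclass
class Dataset:
    """Hypothetical record for one model run moving through the pipeline."""
    name: str
    log: List[str] = field(default_factory=list)


def run_pipeline(
    ds: Dataset,
    handlers: Optional[Dict[str, Callable[[Dataset], None]]] = None,
    start: str = "Transfer",
) -> Dataset:
    """Apply every stage from `start` onward, recording each one in ds.log."""
    handlers = handlers or {}
    first = STAGES.index(start)  # raises ValueError for an unknown stage
    for stage in STAGES[first:]:
        handler = handlers.get(stage)
        if handler is not None:
            handler(ds)  # stage-specific work would go here
        ds.log.append(stage)
    return ds
```

Calling `run_pipeline(Dataset("example"), start="Correct errors")` would rerun only the last three stages, which is one way to model republishing a corrected dataset.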

NARCCAP Program Goals
Evaluate model performance and uncertainty
Support further dynamical downscaling experiments
Generate high-res climate change scenario data for impacts analysis

Supporting Further Downscaling
3-D boundary condition data
High spatial & temporal resolution
Large data volumes
[Figure: CO Target region; WRF model domain; TR]

Supporting Impacts Users
Ecology, biology, adaptation, water mgmt
2-D surface data for a few variables
Regional / statistical / distilled
Small data volumes
Example: # days w/ T max 90 F for Austin, TX?
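The Austin example is a typical distilled product: a long daily series reduced to a single count. A minimal sketch, assuming the daily maximum temperatures for the grid cell have already been extracted as a list of values in kelvin. The comparison operator is missing from the transcript, so `>=` is an assumption here, and the function names are hypothetical.

```python
from typing import Iterable


def kelvin_to_fahrenheit(t_k: float) -> float:
    """Convert a temperature from kelvin to degrees Fahrenheit."""
    return (t_k - 273.15) * 9.0 / 5.0 + 32.0


def count_hot_days(daily_tmax_k: Iterable[float], threshold_f: float = 90.0) -> int:
    """Count days whose maximum temperature reaches the threshold.

    Assumption: input is in kelvin and "reaches" means >= threshold_f.
    """
    return sum(1 for t in daily_tmax_k if kelvin_to_fahrenheit(t) >= threshold_f)
```

The result is one integer per location and period, which is why this kind of request involves small data volumes compared with the 3-D boundary data used for further downscaling.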

Data Services
Analysis and transformation of data before transfer to end user
Reduce the need for large data downloads
Improve usability for applications, non-specialists
Capture expertise as automated processing

Providing climate model simulation data to the user community
The CESM perspective
Gary Strand, NCAR
strandwg@ucar.edu

Earth System Grid (ESG)
Initially started as a research project to move data from DOE computing centers in ca. 2000
Evolved into a means to provide NCAR climate model data to the user community, starting in early 2002
Used by PCMDI for the CMIP3 archive, 2004 onwards
Upgraded and updated to the Earth System Grid Federation (ESG-F) for CMIP5
In use currently at NCAR for non-MIP-related data

NCAR ESG-CET portal downloads

NCAR flops and bytes, 2000-2030

Workflow 2000-2012
model: time 1 [header, field 1, field 2, ... field n]; time 2 [header, field 1, field 2, ... field n]; ... time m [header, field 1, field 2, ... field n]
TB scale disk, tape archive, publish
post-processing/analysis: field 1 [header, time 1, time 2, ... time m]; field 2 [header, time 1, time 2, ... time m]; field n [header, time 1, time 2, ... time m]
TB scale disk, tape archive, publish
data portal

Current workflow
model: time 1 [header, field 1, field 2, ... field n]; time 2 [header, field 1, field 2, ... field n]; ... time m [header, field 1, field 2, ... field n]
post-processing/analysis: field 1 [header, time 1, time 2, ... time m]; field 2 [header, time 1, time 2, ... time m]; field n [header, time 1, time 2, ... time m]
netcdf-3, netcdf-4
PB scale disk, tape archive, publish, data portal
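In both workflows above, post-processing turns output organized by time step (every field at time 1, then every field at time 2, and so on) into output organized by field (one series of all times per field). A minimal in-memory sketch of that transposition follows. Plain dictionaries stand in for files, and `slices_to_series` is a hypothetical name, not a CESM tool.

```python
from typing import Any, Dict, List


def slices_to_series(time_slices: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Transpose per-time-step records into per-field time series.

    time_slices: one dict per time step, mapping field name -> value.
    Returns one list per field, ordered time 1 ... time m.
    """
    if not time_slices:
        return {}
    expected = set(time_slices[0])
    series: Dict[str, List[Any]] = {name: [] for name in time_slices[0]}
    for step, fields in enumerate(time_slices, start=1):
        if set(fields) != expected:
            # Every time step must carry the same set of fields.
            raise ValueError(f"time step {step} has a different set of fields")
        for name, value in fields.items():
            series[name].append(value)
    return series
```

With real model output each value would be a 2-D or 3-D array rather than a scalar, and the transposition would touch every time-step file once per field, which is part of why this step is costly at TB to PB scale.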

Workflow 2014/2015
model: field 1 [header, time 1, time 2, ... time m]; field 2 [header, time 1, time 2, ... time m]; field n [header, time 1, time 2, ... time m]
analysis: field 1 [header, time 1, time 2, ... time m]; field 2 [header, time 1, time 2, ... time m]; field n [header, time 1, time 2, ... time m]
10s PB disk, tape archive, publish, data portal

The perspective from CESM
Near-term big data projects
  CESM1-CAM5-BGC ensemble: ~70 runs, total ~7,500 model years, ~200[*] TB
  Last millennium ensemble: 26 runs, total ~26,000 model years, ~300[*] TB
  (Both using newest workflow)
Longer-term big data
  CMIP6 (2016-2017?)
  Potential additional -MIPs
  Higher resolution (1/8° SE atm/lnd, 1/10° ocn/ice)
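The ensemble figures above imply a rough storage cost per simulated year. A small sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB) and taking the approximate slide values at face value; `gb_per_model_year` is a hypothetical helper.

```python
def gb_per_model_year(total_tb: float, model_years: float) -> float:
    """Average output volume per simulated year, in GB (1 TB = 1000 GB)."""
    return total_tb * 1000.0 / model_years


# Approximate figures from the slide.
cam5_bgc = gb_per_model_year(200, 7_500)          # roughly 27 GB per model year
last_millennium = gb_per_model_year(300, 26_000)  # roughly 12 GB per model year
```

Both inputs are themselves approximate ("~"), so the results are order-of-magnitude estimates only.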

CESM and the nearish-future
Issues
  Meeting user community needs/wants drives all!
  Modeling and analysis ~concurrently to avoid memory -> disk latency and all the other issues
  Ongoing updates of workflow
  Updating CESM data management policy to reflect workflow and other changes
  Longer-term viability of ESG/ESGF model: downloading PB isn't sustainable, or is it?
  Must have serious server-side analysis
  Possibility of rerunning model for additional data
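To put the "downloading PB" question in perspective, a back-of-the-envelope transfer-time estimate helps. This sketch assumes an ideal, fully sustained link with no protocol overhead. The link rates in the comments are hypothetical examples, not figures from the slides, and `transfer_days` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86_400.0


def transfer_days(volume_bytes: float, rate_bits_per_s: float) -> float:
    """Days needed to move volume_bytes over a link sustaining rate_bits_per_s."""
    return volume_bytes * 8.0 / rate_bits_per_s / SECONDS_PER_DAY


ONE_PB = 1e15  # bytes, decimal petabyte

# Hypothetical link speeds, chosen only for illustration.
days_at_1_gbps = transfer_days(ONE_PB, 1e9)     # about 93 days
days_at_10_gbps = transfer_days(ONE_PB, 10e9)   # about 9 days
```

Real transfers rarely sustain the nominal rate, so these are lower bounds. Estimates like this are one reason the slide calls for serious server-side analysis rather than bulk download.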

NCAR's Data Archives: The Bigger Picture
Eric Nienhouse, NCAR
ejn@ucar.edu

Sample of NCAR Data Archives
We build data archives and curate data for diverse communities.
20K annual users, 300 data providers, 10K collections, 3.5PB, 2PB yearly downloads.

ACADIS: Advanced Collaborative Arctic Data Information Service
  NSF Arctic projects
  Self publishing tools
  Many disciplines
  Highly varied data
  Long term preservation

ESG-NCAR: Earth System Grid at NCAR
  Climate models (CESM)
  RCMs (NARCCAP)
  Large data volume
  Heavily accessed

RDA: Research Data Archive
  Reanalysis + obs products
  Subset and re-format svcs
  ECMWF, ICOADS, JRA-55
  Actively curated

Community Use and Access
Data products are growing in popularity among non-traditional disciplines
Over 2000 users monthly
Diverse and growing user base
10X download volume by 2016
Data reduction increasingly utilized
Seeking more ways to access data
[Chart: ESG-NCAR and RDA Data Volume Delivered (Average TB / Month); series: RDA, ESG-NCAR, Total; years 2010 to 2014 (est)]

Removing Barriers to Scientific Data Use
Common Problems:
Finding and preparing data for analysis is expensive.
Search, download, evaluate, repeat is slow.
Scientifically related data is hard to find.
Tools for data evaluation are lacking in workflows.
Human experts cannot scale to meet growing needs.

Removing Barriers to Scientific Data Use
Imparting knowledge to inform data consumers is a growing need.
[Diagram: Published Data, Analysis, Knowledge]

Challenges of obtaining data for analysis
Big data challenges include increasing efficiency of obtaining data and information
[Diagram: Published Data, Discover, Evaluate, Access, Analysis]
Discovery is improving
  Metadata federation
  Search engines
  Schema.org
Evaluation & Access is Challenging
  Download often required
  Little guidance in workflow
  Human experts fill in gaps

How do we improve the path to analysis?
Open services, user experience driven with tools for enabling innovation
Open data helps (services, ease of access)
Connect information to data workflows (wikis, experts)
Focus on usability with user centered, iterative design
Increase access to information throughout access workflow
Enable services for data reduction and server side analysis
Enable third party innovation with open service access
Build in metrics to measure and guide improvements

What the future holds
Collaboration, focus on data use and new communities of users
Recognition that significant barriers to use still exist.
Expand collaboration and technology sharing.
User centered design which includes emerging user classes.
Workflows supporting efficient path to analysis.
More access to expert information and guidance.