NARCCAP: North American Regional Climate Change Assessment Program
Seth McGinnis, NCAR
mcginnis@ucar.edu
NARCCAP: North American Regional Climate Change Assessment Program
- Nest high-resolution regional climate models (RCMs) inside coarser global models (GCMs) over North America
Experimental Design

              NCEP         GFDL    CGCM3   HADCM3  CCSM
              (25 years)   (two 30-year runs, current & future)
  CRCM        X            --      X       --      X
  ECP2        X            X       --      X       --
  HRM3        X            X       --      X       --
  MM5I        X            --      --      X       X
  RCM3        X            X       X       --      --
  WRFG        X            --      X       --      X
  Timeslices               X       --      --      X

6 RCMs x 4 GCMs + NCEP and Timeslices = 34 runs total
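The run total can be checked by counting the X marks in the design matrix. The sketch below is only one reading of the slide: it assumes each GCM-driven pairing contributes two runs (current and future), each RCM contributes one NCEP-driven run, and the Timeslices row marks GFDL and CCSM. The dictionary and function names are hypothetical.

```python
# GCM drivers marked with an X for each RCM, transcribed from the slide.
GCM_DRIVERS = {
    "CRCM": ["CGCM3", "CCSM"],
    "ECP2": ["GFDL", "HADCM3"],
    "HRM3": ["GFDL", "HADCM3"],
    "MM5I": ["HADCM3", "CCSM"],
    "RCM3": ["GFDL", "CGCM3"],
    "WRFG": ["CGCM3", "CCSM"],
}
# Assumed reading of the Timeslices row (its NCEP cell appears to be empty).
TIMESLICE_DRIVERS = ["GFDL", "CCSM"]
PERIODS = ["current", "future"]


def count_runs():
    """Return (GCM-driven, NCEP-driven, timeslice) run counts."""
    pairings = sum(len(drivers) for drivers in GCM_DRIVERS.values())
    gcm_runs = pairings * len(PERIODS)       # two 30-year runs per pairing
    ncep_runs = len(GCM_DRIVERS)             # one NCEP-driven run per RCM
    timeslice_runs = len(TIMESLICE_DRIVERS) * len(PERIODS)
    return gcm_runs, ncep_runs, timeslice_runs


gcm_runs, ncep_runs, timeslice_runs = count_runs()
total_runs = gcm_runs + ncep_runs + timeslice_runs
print(gcm_runs, ncep_runs, timeslice_runs, total_runs)
```

Under these assumptions the counts are 24 GCM-driven runs, 6 NCEP-driven runs, and 4 timeslice runs, which matches the 34 runs stated on the slide.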
Data Publication Pipeline
- Transfer
- Backup
- QC
- Format
- Ancillary data
- Metadata
- Data validity
- Correct errors
- Archive
- Publish to portal
- (Update / Recall)
NARCCAP Program Goals
- Evaluate model performance and uncertainty
- Support further dynamical downscaling experiments
- Generate high-res climate change scenario data for impacts analysis
Supporting Further Downscaling
- 3-D boundary condition data
- High spatial & temporal resolution
- Large data volumes
[Figure labels: CO, Target region, WRF model domain, TR]
Supporting Impacts Users
- Ecology, biology, adaptation, water mgmt
- 2-D surface data for a few variables
- Regional / statistical / distilled
- Small data volumes
- Example: # days w/ T max 90 F for Austin, TX?
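A distilled product like the Austin example reduces a daily temperature series to one count per year. The following is a minimal sketch, not NARCCAP code: the function name, the record layout, and the sample values are hypothetical, and the ">=" comparison is an assumption because the slide text does not preserve the comparison operator.

```python
from collections import Counter
from datetime import date


def hot_days_per_year(records, threshold_f=90.0):
    """Count days per year whose daily maximum reaches the threshold.

    records: iterable of (datetime.date, tmax in degrees F) pairs.
    Years with no qualifying days are omitted from the result.
    """
    counts = Counter()
    for day, tmax_f in records:
        if tmax_f >= threshold_f:  # assumed comparison; see lead-in
            counts[day.year] += 1
    return dict(counts)


# Hypothetical sample values, for illustration only.
sample = [
    (date(2001, 7, 1), 95.0),
    (date(2001, 7, 2), 89.5),
    (date(2001, 7, 3), 90.0),
    (date(2002, 7, 1), 101.2),
]
hot_days = hot_days_per_year(sample)
```

The user receives a handful of integers rather than the daily fields they were computed from, which is the "small data volumes" case described on the slide.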
Data Services
- Analysis and transformation of data before transfer to end user
- Reduce the need for large data downloads
- Improve usability for applications, non-specialists
- Capture expertise as automated processing
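One way to picture "analysis and transformation before transfer" is a server-side routine that subsets a gridded field to a bounding box and returns a single regional value. This is a hedged sketch with hypothetical names and toy data. It uses an unweighted mean for brevity, whereas a real service would likely weight grid cells by area.

```python
def regional_mean(grid, lats, lons, lat_range, lon_range):
    """Average grid[i][j] over cells whose centers fall inside the box.

    grid: list of rows, one row per latitude in lats.
    lat_range, lon_range: inclusive (min, max) bounds.
    Unweighted mean (an illustrative simplification).
    """
    values = [
        grid[i][j]
        for i, lat in enumerate(lats)
        if lat_range[0] <= lat <= lat_range[1]
        for j, lon in enumerate(lons)
        if lon_range[0] <= lon <= lon_range[1]
    ]
    if not values:
        raise ValueError("bounding box contains no grid cells")
    return sum(values) / len(values)


# Toy 3 x 4 field; the values are illustrative only.
lats = [29.0, 30.0, 31.0]
lons = [-99.0, -98.0, -97.0, -96.0]
grid = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]
box_mean = regional_mean(grid, lats, lons, (30.0, 31.0), (-98.0, -97.0))
```

Here twelve grid values stay on the server and one number is transferred. The same pattern at full model resolution is what reduces the need for large downloads.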
Providing climate model simulation data to the user community: The CESM perspective
Gary Strand, NCAR
strandwg@ucar.edu
Earth System Grid (ESG)
- Initially started as a research project to move data from DOE computing centers, ca. 2000
- Evolved into a means to provide NCAR climate model data to the user community, starting in early 2002
- Used by PCMDI for the CMIP3 archive, 2004 onwards
- Upgraded and updated to the Earth System Grid Federation (ESGF) for CMIP5
- Currently in use at NCAR for non-MIP-related data
NCAR ESG-CET portal downloads
NCAR flops and bytes, 2000-2030
Workflow 2000-2012
[Diagram, in order of appearance:]
- model, writing time-slice output: time 1 (header, field 1, field 2, ..., field n), time 2 (header, field 1, field 2, ..., field n), ..., time m (header, field 1, field 2, ..., field n)
- TB scale disk
- post-processing/analysis, producing one time series per field: field 1, field 2, field n, each as (header, time 1, time 2, ..., time m)
- TB scale disk
- tape archive and publish (each label appears twice in the diagram)
- data portal
Current workflow
[Diagram, in order of appearance:]
- model, writing time-slice output: time 1 (header, field 1, field 2, ..., field n), time 2 (header, field 1, field 2, ..., field n), ..., time m (header, field 1, field 2, ..., field n)
- post-processing/analysis, producing one time series per field: field 1, field 2, field n, each as (header, time 1, time 2, ..., time m)
- netcdf-3, netcdf-4
- PB scale disk
- tape archive
- publish
- data portal
Workflow 2014/2015
[Diagram, in order of appearance:]
- model, writing one time series per field: field 1, field 2, field n, each as (header, time 1, time 2, ..., time m)
- analysis, on the same per-field time series: field 1, field 2, field n, each as (header, time 1, time 2, ..., time m)
- 10s PB disk
- tape archive
- publish
- data portal
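The rearrangement at the center of these workflow diagrams, from output that stores every field at each time step to one time series per field, amounts to a transpose. The sketch below illustrates the idea with plain Python dictionaries. The names are hypothetical, it is not the CESM post-processing code, and it assumes every time step carries the same set of fields.

```python
def to_time_series(time_slices):
    """Transpose time-slice records into per-field time series.

    time_slices: list ordered by time; each element maps a field name
    to that field's value at that time step.
    Returns a dict mapping each field name to its values in time order.
    """
    series = {}
    for fields in time_slices:
        for name, value in fields.items():
            series.setdefault(name, []).append(value)
    return series


# Three time steps, two fields; the field names and values are illustrative.
slices = [
    {"field 1": 10.0, "field 2": 0.1},
    {"field 1": 11.0, "field 2": 0.2},
    {"field 1": 12.0, "field 2": 0.3},
]
per_field = to_time_series(slices)
```

A user who needs a single variable then reads one series instead of opening every time-slice record.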
The perspective from CESM
- Near-term big data projects
  - CESM1-CAM5-BGC ensemble: ~70 runs, total ~7,500 model years, ~200[*] TB
  - Last millennium ensemble: 26 runs, total ~26,000 model years, ~300[*] TB
  - (Both using newest workflow)
- Longer-term big data
  - CMIP6 (2016-2017?)
  - Potential additional -MIPs
  - Higher resolution (1/8 SE atm/lnd, 1/10 ocn/ice)
CESM and the nearish-future
Issues
- Meeting user community needs/wants drives all!
- Modeling and analysis ~concurrently to avoid memory -> disk latency and all the other issues
- Ongoing updates of workflow
- Updating CESM data management policy to reflect workflow and other changes
- Longer-term viability of ESG/ESGF model - downloading PB isn't sustainable - or is it?
- Must have serious server-side analysis
- Possibility of rerunning model for additional data
NCAR's Data Archives: The Bigger Picture
Eric Nienhouse, NCAR
ejn@ucar.edu
Sample of NCAR Data Archives
We build data archives and curate data for diverse communities.
20K annual users, 300 data providers, 10K collections, 3.5PB, 2PB yearly downloads.
- ACADIS: Advanced Collaborative Arctic Data Information Service
  - NSF Arctic projects
  - Self publishing tools
  - Many disciplines
  - Highly varied data
  - Long term preservation
- ESG-NCAR: Earth System Grid at NCAR
  - Climate models (CESM)
  - RCMs (NARCCAP)
  - Large data volume
  - Heavily accessed
- RDA: Research Data Archive
  - Reanalysis + obs products
  - Subset and re-format svcs
  - ECMWF, ICOADS, JRA-55
  - Actively curated
Community Use and Access
Data products are growing in popularity among non-traditional disciplines
- Over 2000 users monthly
- Diverse and growing user base
- 10X download volume by 2016
- Data reduction increasingly utilized
- Seeking more ways to access data
[Chart: ESG-NCAR and RDA Data Volume Delivered (Average TB / Month), 2010 to 2014 (est); series: RDA, ESG-NCAR, Total]
Removing Barriers to Scientific Data Use
Common Problems:
- Finding and preparing data for analysis is expensive.
- Search, download, evaluate, repeat is slow.
- Scientifically related data is hard to find.
- Tools for data evaluation are lacking in workflows.
- Human experts cannot scale to meet growing needs.
Removing Barriers to Scientific Data Use
Imparting knowledge to inform data consumers is a growing need.
[Diagram labels: Published Data, Analysis, Knowledge]
Challenges of obtaining data for analysis
Big data challenges include increasing efficiency of obtaining data and information
[Diagram labels: Evaluate, Published Data, Discover, Access, Analysis]
- Discovery is improving
  - Metadata federation
  - Search engines
  - Schema.org
- Evaluation & Access is Challenging
  - Download often required
  - Little guidance in workflow
  - Human experts fill in gaps
How do we improve the path to analysis?
Open services, user experience driven with tools for enabling innovation
- Open data helps (services, ease of access)
- Connect information to data workflows (wikis, experts)
- Focus on usability with user centered, iterative design
- Increase access to information throughout access workflow
- Enable services for data reduction and server side analysis
- Enable third party innovation with open service access
- Build in metrics to measure and guide improvements
What the future holds
Collaboration, focus on data use and new communities of users
- Recognition that significant barriers to use still exist.
- Expand collaboration and technology sharing.
- User centered design which includes emerging user classes.
- Workflows supporting efficient path to analysis.
- More access to expert information and guidance.