Development and Testing of a Next Generation Spectral Element Model for the US Navy

Alex Reinecke 1, Kevin Viner 1, James Doyle 1, Sasa Gabersek 1, Matus Martini 2, John Michalakes 3, Dave Ryglicki 4, Dave Flagg 1, and Francis Giraldo 5
1 U.S. Naval Research Laboratory, Monterey, USA; 2 DEVINE Consulting, Monterey, USA; 3 UCAR, Boulder, CO, USA; 4 National Research Council, Monterey, USA; 5 Naval Postgraduate School, Monterey, USA

17th Workshop on High Performance Computing, 24-28 October 2016
European Centre for Medium-Range Weather Forecasts, Reading, UK

Overview
- Brief overview of the DoD HPC resources
- Overview of the NEPTUNE system as the US Navy's next generation modeling system
- Computational and I/O scaling
- Summary of work on physics package implementation and optimizations

DoD HPC Resources
- 15 machines spread over 6 HPC centers
- Total of 23 PetaFLOPS of peak computing
- Over 50 Petabytes of storage

NEPTUNE
NRL Marine Meteorology Division and operational partner FNMOC: weather and seasonal climate
- Goal: 5 km deterministic global horizontal resolution; transition ~2023
- Global forecasting: NAVGEM, SL-SI spectral transform, run 4 times daily, currently 31 km resolution
- Limited area (mesoscale) forecasting: COAMPS, split-explicit FD; over 70 unclassified COAMPS domains run daily, down to 500 m resolution; domains are always in flux
Why invest in a new system?
- Neither system is positioned to exploit next generation architectures
- Both systems are low order and nonconservative (though skillful nonetheless)
- We are not a large laboratory: roughly 100 scientists, less than half working on model development

Next Generation NWP
Our goal is to develop a next generation model with the following capabilities:
- Multi-scale non-hydrostatic dynamics (global scale to mesoscale)
- Highly scalable, flexible, accurate, and conservative numerical methods
- Structured, unstructured, or adaptive grids; scale-aware parameterizations
A spectral element (SE) technique is the selected numerical method:
- The numerical solution is represented by a local polynomial expansion
- A very small communication footprint implies excellent computational scaling
NEPTUNE is the candidate with these attributes.
NEPTUNE: Navy Environmental Prediction system Utilizing the NUMA core; NUMA: Nonhydrostatic Unified Model of the Atmosphere (F. Giraldo, NPS)

Why Spectral Elements?
- Small communication footprint: only 1 row of data needs to be transferred, regardless of numerical order (a back-of-the-envelope count follows below)
- Excellent scaling potential
- Good potential for dense floating point computation; projects well onto new architectures
Downsides:
- Physics implementation on an irregular grid (doesn't seem to be an issue)
- Tracer transport not ideal; new positive definite schemes
[Figure: one row of element-boundary data exchanged between neighboring processors P0 and P1]
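As a back-of-the-envelope illustration of the communication claim (a sketch only, not NEPTUNE code; the element orders are assumed for illustration): a hexahedral element of polynomial order N carries (N+1)^3 nodal points, but a neighboring process only ever needs the single face layer of (N+1)^2 points, i.e. one "row" of data whatever the order.

```python
# Sketch: halo size vs. element size for a 3D spectral element of order N.
# Only the one shared face layer is exchanged, regardless of the order used.
for N in (3, 5, 7):
    total_pts = (N + 1) ** 3   # nodal points in the element
    face_pts = (N + 1) ** 2    # points on one shared face (the exchanged "row")
    print(f"order {N}: exchange {face_pts} of {total_pts} points "
          f"({100 * face_pts / total_pts:.0f}%)")
```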

NEPTUNE Overview and Layout
- NEPTUNE is the US Navy's next generation NWP system
- NUMA is the dynamical core of NEPTUNE
- NUMA uses a three-dimensional spectral element technique with a sphere-centered Cartesian coordinate system on the cubed sphere
[Diagram: NEPTUNE system layout -- drivers; coupling (ocean, land, ice); NUMA dynamical core; data assimilation; forecast model; physics; shared utilities; post processing; configuration]
NEPTUNE: Navy Environmental Prediction system Utilizing the NUMA core; NUMA: Nonhydrostatic Unified Model of the Atmosphere (Giraldo et al. 2013)

Dynamical Core Overview
- Deep atmosphere equations; non-hydrostatic
- Unified 3D spectral element (CG/DG) dynamical core
- Q-array stored in (k,i) order; element boundary points share memory
- Cartesian coordinate system for limited area; sphere-centered Cartesian coordinate system for global prediction
- Inexact integration (mass lumping)
- 1D/3D IMEX time integrators and split-explicit RK3 integration for efficiency (a sketch of the RK call pattern follows below)
References: Kelly and Giraldo 2012 JCP; Giraldo et al. 2013 J. Sci. Comput.; Kopera and Giraldo 2014 JCP
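Since the cost per step is driven by right-hand-side evaluations (see the data layout slide below), here is a generic three-stage SSP-RK3 step with a hypothetical create_rhs stand-in; it only illustrates the call pattern, not NUMA's actual split-explicit or IMEX integrators.

```python
import numpy as np

def create_rhs(q):
    """Hypothetical stand-in for the dynamical-core RHS evaluation."""
    return -q  # placeholder dynamics: dq/dt = -q

def ssp_rk3_step(q, dt):
    # Three explicit stages => (at least) three create_rhs calls per time step,
    # matching the "1x per stage of an RK integrator" count on the data layout slide.
    q1 = q + dt * create_rhs(q)
    q2 = 0.75 * q + 0.25 * (q1 + dt * create_rhs(q1))
    return q / 3.0 + 2.0 / 3.0 * (q2 + dt * create_rhs(q2))

q = np.ones(10)
for _ in range(5):
    q = ssp_rk3_step(q, dt=0.1)
```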

Global Model Applications
- <Δx> = 5 km global average horizontal resolution
- Cold start with GFS initial conditions for a Hurricane Sandy simulation
- Realistic rain-band structure in the early stages of Sandy; good position forecast at later stages
- Correct location of precipitation in the ITCZ and mid-latitude storm belt

Limited Area Model Application
- Developed a limited area version of NEPTUNE with a set of map projections: Lambert conformal, Mercator, stereographic
- Solution prescribed at the lateral boundaries using Davies BCs

NEPTUNE Grid Generation
- Cubed sphere grid with a sphere-centered Cartesian coordinate system
- p4est for grid generation and MPI decomposition: parallel and scalable, can build large grids on the fly
- Limited area domain structured as a cube (the "cube domain"); map projections are built into the Jacobian
[Figure: p4est used for parallel grid generation]

Cubed Sphere Grid
[Figure: unfolded cubed-sphere faces 0-5 with matched edge labels]
- Traditional cubed-sphere models (e.g., FV3, HOMME) stitch the 6 cube faces together: matching conditions along the edges require conditional logic for edge points
- NEPTUNE uses a Cartesian, sphere-centered coordinate system: no matching condition needed, no conditionals for edge conditions

Sphere Centered Coordinate System
- NEPTUNE uses a Cartesian, sphere-centered coordinate system: no matching condition needed, no conditionals for edge conditions
- Velocity is defined relative to the coordinate system
- Semi-analytic Jacobian terms guarantee an accurate representation of the cubed sphere and smooth transitions across face edges (a numeric sketch of the transform follows below)

$$\mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} r\cos\varphi\cos\lambda \\ r\cos\varphi\sin\lambda \\ r\sin\varphi \end{pmatrix}$$
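A minimal numeric check of the transform above (illustrative only; the function and variable names are mine, not NEPTUNE's):

```python
import numpy as np

def sphere_to_cartesian(lon, lat, r):
    """Map longitude/latitude (radians) and radius to sphere-centered (x, y, z)."""
    x = r * np.cos(lat) * np.cos(lon)
    y = r * np.cos(lat) * np.sin(lon)
    z = r * np.sin(lat)
    return x, y, z

# e.g. a point on the equator at 90 degrees east, Earth radius ~6371 km
print(sphere_to_cartesian(np.pi / 2, 0.0, 6371.0))  # -> (~0, 6371, 0)
```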

Data Layout in NEPTUNE
- Q-array layout: beneficial for 1D IMEX and physics because the vertical index varies fastest; HOWEVER, there may be significant room for improvement
[Figure: 2D slice of the q array showing the (k,i) point numbering, vertical varying fastest, with shared element boundary points]
- For each SE operation, data is reordered to contiguous (i,j,k) order on the element (see the toy sketch below)
- Occurs 1x for calls to create_rhs
- Occurs 2x for calls to create_lap; create_lap is called 1x per create_rhs
- Number of create_rhs calls per time step depends on the time integrator: (at least) 1x per stage of an RK integrator
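A toy sketch of the reordering described above, under assumed shapes (the names, dimensions, and column indexing are illustrative, not the actual NEPTUNE data structures): the state is stored with the vertical index varying fastest, and each spectral element operation first gathers its columns into a contiguous element-local (i,j,k) block.

```python
import numpy as np

# Toy sketch, not NEPTUNE code. Dimensions are assumptions for illustration.
nk = 8           # vertical points per column
npts = 16        # horizontal points owned by this rank
ni = nj = 2      # horizontal points per element direction

# State stored so the vertical index k varies fastest in memory
# (the NumPy row-major analogue of a Fortran q(k,i) array).
q = np.arange(npts * nk, dtype=float).reshape(npts, nk)

def gather_element(q, column_ids):
    """Reorder one element's columns into a contiguous (i, j, k) block,
    as happens before each SE operation such as a create_rhs call."""
    elem = q[column_ids].reshape(nj, ni, nk)               # (j, i, k) view
    return np.ascontiguousarray(elem.transpose(1, 0, 2))   # contiguous (i, j, k)

block = gather_element(q, column_ids=np.array([0, 1, 4, 5]))
print(block.shape)   # -> (2, 2, 8)
```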

NEPTUNE Scaling Tests
- DoD HPC CAP (Capabilities and Acceptance Program): 30 days to port and run scaling tests on new systems prior to general use
- Access to 3 systems in June 2015: a Cray XC40 and 2 SGI ICE X machines
- 3D dry dynamical core test: baroclinic instability (Jablonowski 2006)
- Strong scaling: 12-km fixed horizontal resolution -> increase nodes
- Weak scaling: fixed number of elements per node -> increase nodes
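For reference when reading the next two slides (standard definitions, not taken from the presentation): with $T(p)$ the wall-clock time on $p$ cores and $p_0$ the smallest core count tested,

$$S(p)=\frac{T(p_0)}{T(p)},\qquad E_{\mathrm{strong}}(p)=\frac{p_0\,T(p_0)}{p\,T(p)},\qquad E_{\mathrm{weak}}(p)=\frac{T(p_0)}{T(p)}\ \text{(fixed work per core)}.$$

"Near linear scaling" on the strong scaling slide corresponds to $E_{\mathrm{strong}}$ staying close to 1.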

Weak Scaling Test: 12-km Baroclinic Wave (Jablonowski 2006)
- Machines: Cray XC40 (Conrad, NAVO) and SGI ICE X (Thunder, AFRL)
- XC40: code compiled and ran out of the box
- ICE X: required a minor modification to the environment: set MPI_IB_CONGESTION=1

Strong Scaling Test: 12-km Baroclinic Wave (Jablonowski 2006)
- Machines: Cray XC40 (Conrad, NAVO) and SGI ICE X (Thunder, AFRL)
- Near linear scaling on both machines, up to 1 column of elements per compute core
- Number of cores needed to reach the operational requirement is high
- Version of the code tested was immature, with very little effort to optimize; significant room for improvement

I/O Optimization
- Test problem is ~30 km global resolution with 32 vertical levels (E80P4L32)
- 14 three-dimensional variables are output 3 times
- Plan to incorporate asynchronous I/O through the ESMF interface (leveraging the NAVGEM effort)

Physics Implementation
- Physics packages implemented through the NUOPC unified physics driver
- Data order is switched from the (k,i) layout to an (i,k) layout
- Special attention is given to the irregular vertical structure associated with vertical spectral elements; testing different mappings to a regular vertical grid (see the sketch below)
- Initial full package is GFS hydrostatic physics
- Working on optimizing this package for KNL through a collaboration with the University of Utah
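A minimal sketch of one possible mapping from the irregular GLL vertical structure to a regular grid, using linear interpolation on a single column; this is an assumption for illustration, not the mapping actually adopted in NEPTUNE.

```python
import numpy as np

def lgl_points_4th_order():
    """Legendre-Gauss-Lobatto nodes on [-1, 1] for polynomial order 4."""
    return np.array([-1.0, -np.sqrt(3.0 / 7.0), 0.0, np.sqrt(3.0 / 7.0), 1.0])

# Irregular column from two stacked vertical elements spanning 0-10 km:
# points cluster near the element edges (illustrative values).
elem_edges_km = [(0.0, 5.0), (5.0, 10.0)]
z_irregular = np.unique(np.concatenate([
    lo + (lgl_points_4th_order() + 1.0) * 0.5 * (hi - lo) for lo, hi in elem_edges_km
]))

# Hypothetical temperature profile on the irregular levels.
t_irregular = 288.0 - 6.5 * z_irregular

# Map to a regular grid for the physics column (simple linear interpolation).
z_regular = np.linspace(0.0, 10.0, 11)
t_regular = np.interp(z_regular, z_irregular, t_irregular)
print(np.round(t_regular, 1))
```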

Coastal Mesoscale Modeling
- Next Generation Modeling -- NEPTUNE: development and implementation of physics parameterizations
- Goal is to run a year's worth of cold start simulations at 15 km resolution

Summary
- NRL is developing NEPTUNE, a three-dimensional spectral element NWP system
- Unifies limited area and global modeling; operational transition in the next decade
- Testing shows excellent strong scaling; however, the overall runtime needs to be improved by optimizing the data layout and incorporating OpenMP

Thank you