Carlos Osuna. Meteoswiss.

Size: px

Start display at page:

Download "Carlos Osuna. Meteoswiss."

Margaret Cooper
5 years ago
Views:

1 Federal Department of Home Affairs FDHA Federal Office of Meteorology and Climatology MeteoSwiss DSL Toolchains for Performance Portable Geophysical Fluid Dynamic Models Carlos Osuna Meteoswiss V. Clement, O. Fuhrer, S. Moosbrugger, X. Lapillonne, P. Spoerri, T. Schulthess, F. Thuering, H. Vogt, T. Wicky

2 DSL Toolchains for Performance Portable Geophysical Fluid Dynamic Models COSMO 2km ensembles(21 members) COSMO 1km MPI Seminar, 21st Novemebr

3 DSL Toolchains for Performance Portable Geophysical Fluid Dynamic Models MPI Seminar, 21st Novemebr

4 Top 500 overview Piz daint Titan Sequoia GPU Sunway TaihuLight Disruptvie emergence of accelerators in HPC. Gyoukou Large memory bandwidth have speeded up our models. On NVIDIA GPUs: Tianhe-2 dynamicalcores 2-3x physical Trinity parametrizations 3-7x spectral transforms > 10x Cori K Computer Multi-core oakforest MIC 4 Carlos Osuna, MPI Seminar, 21 November

5 How to achieve portability forourgfd models? Carlos Osuna, MPI Seminar, 21 November

6 DSL Toolchains for Performance Portable Geophysical Fluid Dynamic Models Carlos Osuna, MPI Seminar, 21 November

7 COSMO global 1km Discussion paper under review Carlos Osuna, MPI Seminar, 21 November

8 DSL Toolchains for Performance Portable Geophysical Fluid Dynamic Models Carlos Osuna, MPI Seminar, 21 November

9 Carlos Osuna, MPI Seminar, 21 November

A domain specific language(dsl) for GFDs: abstracts away all unnecessary details of a programming language(fortran) implementation concise syntax: only those keywords needed to express the numerical

10 A domain specific language(dsl) for GFDs: abstracts away all unnecessary details of a programming language(fortran) implementation concise syntax: only those keywords needed to express the numerical problem The GridTools Ecosystem Set of C++ APIs / Libraries DSL for PDEs Large class of problems Performance Portability Separation of concerns Interface to Fortran Open source license Carlos Osuna, MPI Seminar, 21 November

DSL for Weather Codes on Unstructured Grids DSLs are a successful approach to deal with the software productivity gap ESCAPE develops a DSL based on GridTools: library tools for solving PDEs on

11 DSL for Weather Codes on Unstructured Grids DSLs are a successful approach to deal with the software productivity gap ESCAPE develops a DSL based on GridTools: library tools for solving PDEs on multiple grids. The DSL syntax requires only the grid-point operation: ) = -4* u + u(i+1) + u(i-1) + u(j+1) + u(j-1) Tiled loop templates and parallelization are abstracted Use of fast on-chip memory is also abstracted. Grid is abstracted (in combination with Atlas) Carlos Osuna, MPI Seminar, 21 November

12 Writing Operators using DSL Carlos Osuna, MPI Seminar, 21 November

13 Carlos Osuna, MPI Seminar, 21 November

14 Global Grids New DSL constructs for stencils on global grids Expose a unstructured grid API Leverage grid structure for performance Icosahedral Cubed sphere Octahedral Carlos Osuna, MPI Seminar, 21 November

Grid Abstraction Icosahedral Octahedral constepxrauto offsets = connectivity<edges,

Dual-grid Cube sphere eval(flux) = eval(sum_on_vertices(0.

The same used code using DSL can be compiled for multiple Grids.

15 Grid Abstraction Icosahedral Octahedral constepxrauto offsets = connectivity<edges, vertices, Color>::offsets(); for( auto off: offsets) { eval(flux) += eval( pd(off) ); } Dual-grid Cube sphere eval(flux) = eval(sum_on_vertices(0.0, pd)); The DSL syntax elements are grid independent. The same used code using DSL can be compiled for multiple Grids. GridToolswill support efficient native layouts for octahedral/icosahedral. GridToolswill use Atlas in order to support any grid. Carlos Osuna, MPI Seminar, 21 November

16 Example of GridTools DSL : MPDATA Carlos Osuna, MPI Seminar, 21 November

17 Example of GridTools DSL : MPDATA Carlos Osuna, MPI Seminar, 21 November

18 Composition of multiple MPDATA operators m_upwind_fluxes = make_computation<gpu>( domain_uwf, grid_, make_multistage(execute<forward>(), make_stage<upwind_flux, octahedral_topology_t, edges>( p_flux(), p_pd(), p_vn()), make_stage<upwind_fluz, octahedral_topology_t, vertices>( p_fluz(), p_pd(), p_wn()), make_stage<fluxzdiv, octahedral_topology_t, vertices>( p_divvd(), p_flux(), p_fluz(), p_dual_volumes(), p_edges_sign()), Full MPDATA implemented using the GridTools DSL: upwind fluxes minmax limiter compute fluxes Limit fluxes Flux solution Rho correction make_stage<advance_solution, octahedral_topology_t, vertices>( p_pd(), p_divvd(), p_rho()))); Carlos Osuna, MPI Seminar, 21 November

Grid Abstraction: Performance Portability

19 Grid Abstraction: Performance Portability Thanks to abstraction of the mesh we can implement any partitioning: Equal partitioner implemented with Atlas Structured partitioner implemented with Atlas O32 mesh Carlos Osuna, MPI Seminar, 21 November

20 Atlas: a library for NWP and climate modelling Willem Deconinck Carlos Osuna, MPI Seminar, 21 November Deconinck et al

Bandwidth GB/s B SN DA 211 SN IA 270 UN IA

21 Grid Abstraction: Performance Portability Abstraction of the mesh can implement any indexing layout Indexing has a large impact in performance: NVIDIA P100, 128x128x80, Bandwidth GB/s B SN DA 211 SN IA 270 UN IA 130 HN IA 256 Carlos Osuna, MPI Seminar, 21 November

22 DSLs and abstractions: state of the art GridTools and Atlas E solely relies on a C++ compiler No external tools provides a high-level of abstraction Performance portability But writing in this newprogramming modele is hard (C++ expertise) Decreased productivity is unsafe No protection from violating the parallel model often requires expert knowledge to produce efficient code Carlos Osuna, MPI Seminar, 21 November

What is the Future of GFD Models in a complex HPC

Need to extend tools to a large set of models,

23 What is the Future of GFD Models in a complex HPC environment? 1. Need to increase model development productivity 2. Need to decrease maintainence cost of models 3. Need to extend tools to a large set of models, methods, domains. 4. Need to maximize efficiency Carlos Osuna, MPI Seminar, 21 November

24 DSLToolchains for Performance Portable Geophysical Fluid Dynamic Models Carlos Osuna, MPI Seminar, 21 November

25 Need to increase model development productivity High leveldsls, concise, safe and intuitive syntax that remove boiler-plate from embedded GridTools Carlos Osuna, MPI Seminar, 21 November

26 Need to decrease maintainence of models Check for parallel errors, Auto-generation of boundary conditions and halo exchanges Check forout ofbounds Safe generation of loop data dependencies E. Carlos Osuna, MPI Seminar, 21 November

27 Need toextendtoolstoa large setofmodels, methods, domains. Options: ICON-Ocean IFS COSMO ICON-atmos Unifyall toolsin a singledsl softwarestack-> GFD maintainence models issue. LFRic One DSL (programming model) per model, currently leading to explosion of DSL solutions. ESCAPE2 Liszt HybridFortran GridTools DSLs and programming models PsyClone Firedrake CLAW Carlos Osuna, MPI Seminar, 21 November

28 Need toextendtoolstoa large setofmodels, methods, domains. Options: ICON-Ocean IFS Unifyall COSMO ICONtoolsin a singledsl softwarestack-> GFD maintainence models issue. LFRic One DSL (programming model) per model, currently leading to explosion of DSL solutions. ESCAPE2 Liszt HybridFortran GridTools DSLs and programming models PsyClone Firedrake CLAW Carlos Osuna, MPI Seminar, 21 November

29 In the ESCAPE2 DNA3 Develop programming languages for GFD models that can increase productivity, and lower maintainance Standardization, avoid explosion of tools. Support multiple domains(fv weather, FV ocean, structured, unstructured, FE) in the standardization Multiple domain languages or customizations, single tool Language to be defined together with scientists Carlos Osuna, MPI Seminar, 21 November

30 ESCAPE2 base technology Modular compiler toolchain(based on modern compilers llvm-) GTCLANG DAWN Carlos Osuna, MPI Seminar, 21 November

31 ESCAPE2 base technology Multiple variants of a language FE plugin IFS collocated ICON Ocean plugin a[i+1] on_cells(+,a) Unstructured Structured DAWN Carlos Osuna, MPI Seminar, 21 November

32 Example Horizontal Diffusion Carlos Osuna, MPI Seminar, 21 November 2017 Fabian Thüring

33 Example Horizontal Diffusion stencil_function laplacian { storage phi; Do { return 4.0 * phi phi[i+1] phi[i-1] phi[j+1] phi[j-1]; } }; stencil hd_type2 { storage hd, in; temporary_storage lap; Do { vertical_region(k_start, k_end) { lap = laplacian(in); hd = laplacian(lap); } } }; Carlos Osuna, MPI Seminar, 21 November

34 Safety First stencil hd_type2 { storage u, utens; temporary_storage uavg; Do { vertical_region(k_start, k_end) { uavg = (u[i+1]+u[i-1])*0.5; utens = (uavg[k+1]); } } }; Carlos Osuna, MPI Seminar, 21 November

35 Safety First stencil hd_type2 { storage u, utens; temporary_storage uavg; Do { vertical_region(k_start, k_end) { uavg = (u[i+1]+u[i-1])*0.5; utens = (uavg[j+1] + uavg[j-1]); } } }; Carlos Osuna, MPI Seminar, 21 November

36 Safety First stencil hd_type2 { storage u, utens; temporary_storage uavg; Do { vertical_region(k_start, k_end) { uavg = (u[i+1]+u[i-1])*0.5; utens = (uavg[j+1] + uavg[j-1]); } } }; Carlos Osuna, MPI Seminar, 21 November

$Interoperatbility : Escaping the language stencil_function laplacian { storage phi; Do { return 4.$

37 Interoperatbility : Escaping the language stencil_function laplacian { storage phi; Do { return 4.0 * phi phi[i+1] phi[i-1] phi[j+1] phi[j-1]; } }; stencil hd_type2 { storage hd, in; temporary_storage lap; Do { vertical_region(k_start, k_end) { lap = laplacian(in); hd = laplacian(lap); } } }; #omp parallel for nowait #acc for(int i=0; i < isize; ++i) { for(int j=0; j < jsize; ++j) { for(int k=0; k < ksize; ++k) { udiff(i,j,k) = hd(i+1,j,k) + hd(i-1,j,k); } } } DSL code Standard C++ code (+OpenMP, +OpenACC, +CUDA) Carlos Osuna, MPI Seminar, 21 November

38 {«filename»:«/code/access_test.cpp», «stencils»:[ «name»:«hori_diff», «loc»:{ «line»: 24, «column»: 8 }, «ast»:{ «block_stmt»:{ «statements»:{ «expr_stmt»:{ «assignment_expr»:{ «left»:{ «field_access_expr»:{ «name»:«b», «offset»:[], Python } }, GridToolsT4Py «right»:{ «binary_op»:{ «left»:{ «field_access_expr»:{ «name»:«a», «offset»:[1,0,0], } }, «right»:{ «field_access_expr»:{ «name»:«a», «offset»:[-1,0,0], Carlos Osuna, MPI Seminar, 21 November }}}}}}}}}} } Standardization: SIR Multiple DSL (thin) frontends can generate standard SIR scheme C++ GTCLANG Fortran CLAW Fortran PsyClone ESCAPE2 / Ocean b = a(i+1) + a(i-1)

39 DAWN: Compiler optimization passes Carlos Osuna, MPI Seminar, 21 November

40 COSMO dycore evaluation of DSL toolchain Type 2 u Type 2 v Type 2 w Type 2 pp Smagorinsky Horizontal Diffusion (P100) Advection( metricand time-p100 -) Carlos Osuna, MPI Seminar, 21 November

41 Conclusions The future of GFD models, high resolution, multidecade simulation will bring serious computational challenges: efficiency and production cost, maintenance cost Weneedtofind rightprogrammingmodelsso thatadaptationto newarchitecturesdo not hinderscientificdevelopment-> high productivity Single effortsarenot a valid modelanymore, weneed collaborations ESCAPE2 is a great opportunity to define the new language with scientists, we have the technology in place. Carlos Osuna, MPI Seminar, 21 November

42 BACKUPS MPI Seminar, 21st Novemebr 2017

43 MPI Seminar, 21st Novemebr 2017

Federal Department of Home Affairs FDHA Federal Office of Meteorology and Climatology MeteoSwiss MeteoSwiss Operation Center 1 CH-8058 Zurich-Airport T +41 58 460 91 11 www.meteoswiss.

44 Federal Department of Home Affairs FDHA Federal Office of Meteorology and Climatology MeteoSwiss MeteoSwiss Operation Center 1 CH-8058 Zurich-Airport T MeteoSvizzera Via ai Monti 146 CH-6605 Locarno-Monti T MétéoSuisse 7bis, av. de la Paix CH-1211 Genève 2 T MétéoSuisse Chemin de l Aérologie CH-1530 Payerne T Internal presentation, Valentin Clément 44

Federal Department of Home Affairs FDHA Federal Office of Meteorology and Climatology MeteoSwiss. PP POMPA status.

Federal Department of Home Affairs FDHA Federal Office of Meteorology and Climatology MeteoSwiss PP POMPA status Xavier Lapillonne Performance On Massively Parallel Architectures Last year of the project