CCSM Performance with the New Coupler, cpl6
1 CCSM Performance with the New Coupler, cpl6. Tony Craig, Brian Kauffman, Tom Bettge (National Center for Atmospheric Research); Jay Larson, Rob Jacob, Everest Ong (Argonne National Laboratory); Chris Ding, Helen He (Lawrence Berkeley National Laboratory). RIST Workshop, March 3-5, 2003, INGV, Rome, Italy.
2 Topics: CCSM overview; cpl5 review; cpl6 goals; cpl6 design and datatypes; cpl6 performance (merging, mapping, communication); Summary.
3 CCSM Overview: CCSM = Community Climate System Model (NCAR). Designed to evaluate and understand Earth's global climate, both historical and future. Five separate executables: Atmosphere (CAM), MPI/OpenMP; Ocean (POP), MPI; Land (CLM2), MPI/OpenMP; Sea Ice (CSIM4), MPI; Coupler (cpl5), OpenMP.
4 CCSM2 Hub and Spoke System [diagram: cpl at the hub, connected to atm, ocn, ice, and lnd]. Each component is a separate executable, and each runs on a unique set of hardware processors. All communications go through the coupler. The coupler communicates with all components: it maps (interpolates) data, merges fields, computes some fluxes, and has diagnostic, history, and restart capability (see the control-flow sketch below).
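The per-coupling-interval work of the hub can be summarized in a short control-flow sketch. This is a minimal illustration of the hub-and-spoke idea; the subroutine names are hypothetical placeholders, not routines from the CCSM source.

```fortran
! A minimal control-flow sketch of the hub-and-spoke coupler; subroutine
! names are hypothetical placeholders, not routines from the CCSM source.
program hub_and_spoke_sketch
   implicit none
   integer :: interval

   do interval = 1, 4               ! loop over coupling intervals
      call recv_from_components()   ! states/fluxes arrive from atm/ocn/ice/lnd
      call map_fields()             ! interpolate between component grids
      call merge_fields()           ! combine inputs onto each destination grid
      call compute_fluxes()         ! fluxes not computed by the components
      call send_to_components()     ! forcing goes back out to each component
      call write_history(interval)  ! diagnostics, history, restart output
   end do

contains
   subroutine recv_from_components()
   end subroutine
   subroutine map_fields()
   end subroutine
   subroutine merge_fields()
   end subroutine
   subroutine compute_fluxes()
   end subroutine
   subroutine send_to_components()
   end subroutine
   subroutine write_history(n)
      integer, intent(in) :: n
   end subroutine
end program hub_and_spoke_sketch
```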
5 CCSM Platforms: Currently supported: IBM Power3, Power4; SGI Origin (O2k, O3k). Nearly supported: HP/Compaq. Future: Linux; vector platforms (Cray X1, NEC Earth Simulator).
6 CCSM Resolution and Timing: T42 resolution atm and land, 26 vertical levels in atm (128x64x26, about 200k cells); 1-degree resolution ocean and ice, 40 vertical levels in ocean (320x384x40, about 4M cells). On 100 processors of an IBM Power4 (8 processors/node, 1.3 GHz clock, Colony interconnect), the model runs about 10 simulated years/day, so a 300-year run requires about a month. Science requirements set the coupling frequency between models and the data flow.
7 CCSM Overview (part 2): Written primarily in Fortran 90. NetCDF history files; binary restart files. Related efforts: SciDAC (DOE), ESMF (NASA).
8 cpl5 Shortcomings: cpl5 is a shared-memory application and uses OpenMP threading for parallel efficiency. cpl5 communication with the models is not parallelized; it does root-to-root communication only, which requires gathers and scatters on distributed-memory components (see the sketch below). Increasing resolutions or coupling frequencies could make the coupling a performance bottleneck. cpl5 coupling is hard-wired to MPI and is not easily extensible as currently implemented.
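The root-to-root pattern the slide criticizes can be illustrated with plain MPI. This is a toy sketch with an assumed local field size; it only shows the component-side gather and notes where the single root-to-root send would occur.

```fortran
! Sketch of the cpl5-era root-to-root pattern described above: the distributed
! field is funneled onto the component's root rank before a single message
! goes to the coupler's root.  Field size is an assumed toy value; only plain
! MPI is used.
program root_to_root_sketch
   use mpi
   implicit none
   integer, parameter :: nlocal = 1000            ! assumed local points/rank
   integer :: rank, nprocs, ierr, i
   integer, allocatable :: counts(:), displs(:)
   real(8) :: local_field(nlocal)
   real(8), allocatable :: global_field(:)

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

   local_field = real(rank, 8)                    ! stand-in for model data

   allocate(counts(nprocs), displs(nprocs), global_field(nlocal*nprocs))
   counts = nlocal
   displs = (/ (i*nlocal, i = 0, nprocs-1) /)

   ! Step 1: gather the whole field onto the component root (rank 0).
   call MPI_Gatherv(local_field, nlocal, MPI_DOUBLE_PRECISION, &
                    global_field, counts, displs, MPI_DOUBLE_PRECISION, &
                    0, MPI_COMM_WORLD, ierr)

   ! Step 2: in cpl5, rank 0 would now MPI_Send the full global field to the
   ! coupler's root process (omitted: this toy program has no second
   ! executable to receive it).  Every other rank idles during steps 1-2,
   ! which is the serialization cpl6's M-to-N communication removes.

   call MPI_Finalize(ierr)
end program root_to_root_sketch
```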
9 cpl6 Goals: Create a fully parallel, distributed-memory coupler. Implement M-to-N communication between components. Improve communication performance to eliminate potential future bottlenecks from increased resolution. Improve coupling interfaces and abstract the communication method away from the components. Improve usability, flexibility, and extensibility of the coupled system. Improve overall performance.
10 The Solution: Build a new coupler framework with abstracted, parallel communication software in the foundation. Create a coupler application, cpl6, which reproduces the functionality of cpl5. [Diagram: cpl6 built on top of MCT (Model Coupling Toolkit) and MPH (Multi-Component Handshaking Library).]
11 MCT: Model Coupling Toolkit (www-unix.mcs.anl.gov/acpi/mct). Major attributes: maintains model decomposition descriptors (e.g., global-to-local indexing); inter- and intra-component communication and parallel data transfer (routing, rearranging); flexible, extensible, indexable field storage; time averaging and accumulation; regridding (via sparse matrix-vector multiply). MCT eases the construction of coupler computational cores and component-coupler interfaces. A schematic of typical component-side MCT usage is sketched below.
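The sketch below shows how a component might set up and use the MCT pieces named on this slide. It assumes MCT's documented Fortran API; the module names and argument orders are quoted from memory and should be checked against the MCT distribution, and the decomposition is a toy one-segment-per-process layout.

```fortran
! Schematic component-side use of MCT, assuming MCT's documented Fortran API;
! module and argument names are quoted from memory and should be checked
! against the MCT distribution.  The decomposition is a toy one-segment-per-
! process block layout.
subroutine mct_usage_sketch(mycomm, myid, ncomps, cplid, npts_local)
   use mpi
   use m_MCTWorld,     only: MCTWorld_init => init
   use m_GlobalSegMap, only: GlobalSegMap, GlobalSegMap_init => init
   use m_AttrVect,     only: AttrVect, AttrVect_init => init
   use m_Router,       only: Router, Router_init => init
   use m_Transfer,     only: MCT_Send => send
   implicit none
   integer, intent(in) :: mycomm, myid, ncomps, cplid, npts_local

   type(GlobalSegMap) :: gsmap   ! this component's decomposition descriptor
   type(AttrVect)     :: av      ! field storage: named real fields x points
   type(Router)       :: rout    ! M-to-N routing toward the coupler
   integer :: start(1), length(1), rank, ierr

   call MPI_Comm_rank(mycomm, rank, ierr)

   ! Register this component with MCT.
   call MCTWorld_init(ncomps, MPI_COMM_WORLD, mycomm, myid)

   ! Toy decomposition: one contiguous segment per process.
   start(1)  = rank*npts_local + 1
   length(1) = npts_local
   call GlobalSegMap_init(gsmap, start, length, 0, mycomm, myid)

   ! Attribute vector holding two real fields over the local points.
   call AttrVect_init(av, rList='sst:ifrac', lsize=npts_local)

   ! Router: precompute the M-to-N communication schedule to the coupler.
   call Router_init(cplid, gsmap, mycomm, rout)

   ! Parallel transfer of the fields to the coupler's processes.
   call MCT_Send(av, rout)
end subroutine mct_usage_sketch
```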
12 MPH: Multi-Component Handshaking Library. General features: built on MPI; establishes an MPI communicator for each component; performs component name registration; allows resource allocation for each component; supports different execution modes. MPH allows generalized communicator assignment, simplifying setup of both intra-component and inter-component communication (the underlying idea is sketched below).
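The following is not the MPH API; it is a plain-MPI illustration of the underlying idea of carving a per-component communicator out of MPI_COMM_WORLD, which MPH extends with name registration and lookup.

```fortran
! Illustration of what a component-registration library provides; this is NOT
! the MPH API, just the underlying idea expressed with plain MPI: each rank
! declares which component it belongs to, and a communicator is split off
! for that component.
program component_split_sketch
   use mpi
   implicit none
   integer :: ierr, world_rank, world_size, color, comp_comm, comp_rank
   character(len=8) :: comp_name

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, world_rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, world_size, ierr)

   ! Toy layout: first half of the ranks are "atm", second half are "ocn".
   if (world_rank < world_size/2) then
      color = 1;  comp_name = 'atm'
   else
      color = 2;  comp_name = 'ocn'
   end if

   ! One communicator per component; MPH additionally records the name so
   ! other components can look up "atm" or "ocn" and set up inter-component
   ! communication without hard-wired rank arithmetic.
   call MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, comp_comm, ierr)
   call MPI_Comm_rank(comp_comm, comp_rank, ierr)

   print '(a,i4,3a,i4)', 'world rank ', world_rank, ' is ', trim(comp_name), &
         ' rank ', comp_rank

   call MPI_Finalize(ierr)
end program component_split_sketch
```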
13 cpl6 Architecture [layered diagram]: main program; Layer 1a: MCT wrapper (control, main data, msg, map, flux, restart, history, diag); Layer 1b: coupling interface; Layer 1c: calendar, utilities, csmshare, datatypes; Layers 2-5: MCT derived objects, MCT base objects, MPEU utilities, vendor utilities.
14 MCT Data Types: Attribute Vector: fundamental data storage type; 2-D integer and real arrays indexed by (field, grid point); strings for field names (concept sketch below). Global Seg Map: decomposition information. Router: M-to-N communication information. Rearranger: local communication information. sMat: scattered mapping matrix data.
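The attribute-vector idea, named fields stored as a 2-D array with the field names carried alongside, can be sketched with a small derived type. This is an illustration of the concept only, not MCT's actual AttrVect implementation.

```fortran
! Concept sketch of an attribute vector: named fields stored contiguously as
! a 2-D array (field index, grid point index).  Illustrative only; MCT's
! real AttrVect has a richer interface (integer attributes, copy, gather,...).
module attr_vector_sketch
   implicit none
   type :: attr_vec
      character(len=32), allocatable :: rnames(:)   ! field names, e.g. 'sst'
      real(8),           allocatable :: rdata(:,:)  ! (num fields, num points)
   end type attr_vec
contains
   subroutine av_init(av, names, npoints)
      type(attr_vec), intent(out) :: av
      character(len=*), intent(in) :: names(:)
      integer, intent(in) :: npoints
      allocate(av%rnames(size(names)), av%rdata(size(names), npoints))
      av%rnames = names
      av%rdata  = 0.0d0
   end subroutine av_init

   integer function av_index(av, name)              ! look a field up by name
      type(attr_vec), intent(in) :: av
      character(len=*), intent(in) :: name
      integer :: i
      av_index = -1
      do i = 1, size(av%rnames)
         if (trim(av%rnames(i)) == trim(name)) av_index = i
      end do
   end function av_index
end module attr_vector_sketch
```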
15 cpl6 Data Types: Infobuffer: vector of integers and reals (scalar data). Domain: cpl6 grid data type (name, Attribute Vector of grid data, GSMap). Bundle: fundamental cpl6 storage data type for array data (name, Domain, Attribute Vector, counter). Contract: Bundle, Infobuffer, Router. Map: name, sMat, Domains, Rearranger. A sketch of how these types nest follows.
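To make the nesting concrete, the sketch below composes toy derived types in the same way the slide lists them. These are illustrative types, not the actual cpl6 source, and they reuse the attr_vec sketch from the previous example.

```fortran
! Illustrative nesting of the cpl6 data types listed above (toy derived types,
! not the actual cpl6 source).  Reuses the attr_vec sketch defined earlier.
module cpl6_types_sketch
   use attr_vector_sketch, only: attr_vec
   implicit none

   type :: infobuffer                     ! scalar control data
      integer, allocatable :: ibuf(:)
      real(8), allocatable :: rbuf(:)
   end type infobuffer

   type :: domain_t                       ! grid description + decomposition
      character(len=32) :: name
      type(attr_vec)    :: grid           ! lat, lon, area, mask, ...
      ! plus a global segment map describing the decomposition (omitted)
   end type domain_t

   type :: bundle                         ! fundamental cpl6 storage type
      character(len=32) :: name
      type(domain_t)    :: dom
      type(attr_vec)    :: fields         ! field data on that domain
      integer           :: accum_count    ! accumulation counter
   end type bundle

   type :: contract                       ! one side of a data exchange
      character(len=32) :: name
      type(bundle)      :: bun
      type(infobuffer)  :: infobuf
      ! plus a router (M-to-N communication schedule), omitted here
   end type contract
end module cpl6_types_sketch
```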
16 cpl6 Modules: cpl_fields_mod: shared module used by all components; sets field numbers and names; differentiates states and fluxes; the naming convention allows automatic routing of data between components for simple fields. cpl_interface_mod: simple interfaces with simple arguments; components only need to define contract data types; within the interface, domains, routers, bundles, and contracts are initialized on the component processors and used there; components don't know about MCT, cpl6 data types, or the underlying communication method. The cpl6 design is more flexible and extensible than cpl5. The component-side call pattern is sketched below.
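The call pattern a component sees can be sketched as below. The commented routine names approximate cpl6's cpl_interface_mod from memory; the real names and argument lists live in the CCSM source and may differ, so they are shown only as comments around a compilable skeleton.

```fortran
! Sketch of the component-side call pattern this slide describes.  The
! commented routine names approximate cpl6's cpl_interface_mod from memory;
! the real signatures are in the CCSM source and may differ.
subroutine atm_coupling_sketch()
   implicit none
   integer :: step

   ! call cpl_interface_init('atm', local_comm)         ! register, get communicator
   ! call cpl_interface_contractInit(contract_c2a, ...) ! coupler -> atm fields
   ! call cpl_interface_contractInit(contract_a2c, ...) ! atm -> coupler fields
   ! Behind contractInit, cpl6 builds the domain, bundles, global segment map,
   ! and router on the component's own processors; the component never touches
   ! MCT types or the communication layer directly.

   do step = 1, 4                                       ! coupling intervals
      ! call cpl_interface_contractRecv('cpl', contract_c2a)  ! get forcing
      ! ... run atmosphere physics/dynamics for one coupling interval ...
      ! call cpl_interface_contractSend('cpl', contract_a2c)  ! return state/fluxes
   end do

   ! call cpl_interface_finalize('atm')
end subroutine atm_coupling_sketch
```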
17 cpl6 Design: Another view of CCSM [diagram: atm, lnd, ice, ocn, and cpl on their hardware processors, joined by a coupling interface layer]. In cpl5, MPI was the coupling interface. In cpl6, the coupler is attached to each component: components are unaware of the coupling method, coupling work can be carried out on the component processors, and a separate coupler is no longer absolutely required.
18 CCSM Performance: cpl5 vs cpl6. Merging: trivially parallel operation, cache usage important; production-configuration timings. Mapping: benchmark tests for a2o and o2a mapping; effect of bundling multiple interpolations; comparison of mapping in the production configuration. Communication: focus on coupler-to-ice communication (high frequency, high resolution); unit tests and production configuration.
19 Merging: cpl5 vs cpl6. Ocean grid (122,880 points), 240 merging calls, 16 fields, production configuration. [Plot: merging time (secs) vs. number of pes for cpl5 and cpl6.]
20 Merging Discussion: Some serial performance optimization has been carried out in cpl6. cpl6 performs better than cpl5 for merging, and its scaling is not limited to a single shared-memory node. cpl6 scaling is acceptable for trivially parallel operations, and the OpenMP overhead of cpl5 is eliminated. For simple parallel operations, cpl6 performs better than cpl5 and can run on a larger number of processors. A sketch of the merge kernel follows.
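The merge itself is a per-point weighted combination of fields, which is why it is trivially parallel and why cache behavior dominates. The sketch below uses assumed component fractions and field counts purely for illustration.

```fortran
! Sketch of the merge operation: on each destination grid point, combine the
! same field from several source components using their area fractions.
! Illustrative weights and field counts; the point is that each grid point is
! independent (trivially parallel) and the loops stream through contiguous
! arrays, so cache behavior dominates the cost.
subroutine merge_sketch(npts, nflds, f_ocn, f_ice, f_lnd, &
                        fld_ocn, fld_ice, fld_lnd, fld_merged)
   implicit none
   integer, intent(in)  :: npts, nflds
   real(8), intent(in)  :: f_ocn(npts), f_ice(npts), f_lnd(npts)   ! fractions
   real(8), intent(in)  :: fld_ocn(nflds,npts), fld_ice(nflds,npts), &
                           fld_lnd(nflds,npts)
   real(8), intent(out) :: fld_merged(nflds,npts)
   integer :: i, n

   do i = 1, npts                 ! independent per point: easy to split
      do n = 1, nflds             !   across distributed-memory processes
         fld_merged(n,i) = f_ocn(i)*fld_ocn(n,i) &
                         + f_ice(i)*fld_ice(n,i) &
                         + f_lnd(i)*fld_lnd(n,i)
      end do
   end do
end subroutine merge_sketch
```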
21 Mapping: cpl5 vs. cpl6. cpl5 mapping is a shared-memory operation. cpl6 mapping is currently distributed-memory parallel only, and allows distributed, parallel mapping across several compute nodes. cpl6 mapping requires MPI communication of data: either rearrange the data to the source-grid decomposition of the weights and then map to the destination grid, or map to the destination grid first and then rearrange to the destination decomposition. A local-kernel sketch follows.
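Either way, the local work is a sparse matrix-vector multiply with precomputed interpolation weights; the two strategies only differ in where the MPI rearrangement sits relative to this kernel. The storage below is an illustrative coordinate format, not MCT's SparseMatrix type.

```fortran
! Local kernel of mapping (regridding) as a sparse matrix-vector multiply,
! dst(i) = sum over j of w(i,j)*src(j), with the weights in coordinate form.
! Storage is illustrative, not MCT's SparseMatrix type.  The parallel choice
! in the slide is only about where the MPI rearrangement happens: before this
! kernel (rearrange src to the weights' source decomposition) or after it
! (rearrange the partial sums to the destination decomposition).
subroutine map_sketch(nnz, nsrc, ndst, row, col, w, src, dst)
   implicit none
   integer, intent(in)  :: nnz, nsrc, ndst
   integer, intent(in)  :: row(nnz), col(nnz)   ! destination / source indices
   real(8), intent(in)  :: w(nnz)               ! interpolation weights
   real(8), intent(in)  :: src(nsrc)            ! source field (local slice)
   real(8), intent(out) :: dst(ndst)            ! destination field (local)
   integer :: k

   dst = 0.0d0
   do k = 1, nnz
      dst(row(k)) = dst(row(k)) + w(k)*src(col(k))
   end do
end subroutine map_sketch
```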
22 Mapping: ocn -> atm. Ocean (122,880 points) to atmosphere (8,192 points), 120 mapping calls, 9 fields. [Plot: mapping time (secs) vs. number of pes for cpl5 and cpl6.]
23 Mapping: atm -> ocn, effect of bundling mapping fields. [Plot: secs/field vs. number of pes for single-field, 8-field, and 16-field bundles.]
24 Mapping: cpl5 vs cpl6. 10 simulated days, production configuration, IBM Power4 with 8-way nodes. [Plot: mapping time (secs) vs. number of pes for cpl5 and cpl6.]
25 Mapping Discussion: Mapping with cpl6 outperforms cpl5 at the same processor count in unit tests for a2o and o2a. Bundling fields for input to the mapping function is a clear winner over mapping single fields (a sketch of the bundled kernel follows). Mapping performance is highly dependent upon (not shown): the grid sizes and shapes, the data decomposition used, and the load imbalance created by the mapping procedure. cpl6 mapping is slower than or the same speed as cpl5 in comparisons on a production configuration. The cpl6 mapping method was originally designed to be easy to use, flexible, and extensible; as a result, there are special intermediate bundles and domains, extra array copies, associated rearranging, and array operations that do not use cache efficiently. We expect to improve mapping performance in the near future. cpl6 allows efficient, parallel, scalable mapping which can outperform cpl5 and provide flexibility for future CCSM configurations.
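Why bundling wins can be seen in the loop structure: with several fields in one call, the sparse index arrays are traversed once and the per-nonzero indirection (and the associated rearrangement communication) is amortized over every field. The layout and names below are illustrative.

```fortran
! Why bundling helps: the sparse index arrays are traversed once and the
! per-nonzero indirection (and, in the parallel case, the rearrangement
! communication) is amortized over every field in the bundle instead of being
! repeated per field.  Layout and names are illustrative.
subroutine map_bundle_sketch(nnz, nflds, nsrc, ndst, row, col, w, src, dst)
   implicit none
   integer, intent(in)  :: nnz, nflds, nsrc, ndst
   integer, intent(in)  :: row(nnz), col(nnz)
   real(8), intent(in)  :: w(nnz)
   real(8), intent(in)  :: src(nflds,nsrc)      ! bundled source fields
   real(8), intent(out) :: dst(nflds,ndst)      ! bundled destination fields
   integer :: k, n

   dst = 0.0d0
   do k = 1, nnz                  ! one pass over the sparse structure ...
      do n = 1, nflds             ! ... applied to every field in the bundle
         dst(n,row(k)) = dst(n,row(k)) + w(k)*src(n,col(k))
      end do
   end do
end subroutine map_bundle_sketch
```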
26 CCSM Communication: cpl5 vs cpl6. Coupler on 8 pes, ice component on 16 pes, 240 transfers, 21 fields, production configuration. [Diagram: breakdown of copy, communication, gather, and scatter times for cpl5 and cpl6; the cpl5 gather (9.0 s) and scatter (36.2 s) dominate. Totals: cpl5 communication = 61.5 s, cpl6 communication = 18.5 s.]
27 Communication: cpl5 vs. cpl6. [Table: cpl -> ice and ice -> cpl transfer times (secs) for coupler pes = 1 (4) and ice pes = 1 (16); ice fields of size 122,880; the 4/16 column is the apples-to-apples comparison.]
28 Communication Discussion: The cpl5 numbers illustrate the current CCSM2 performance with the coupler on a single node using 4 pes and the ice model using 16 pes (4/16 configuration). Note that the cpl6<->ice single-processor configuration simulates the root-to-root communication of cpl5<->ice in CCSM2.0. cpl6 is slower in the 1/1 configuration due to the overhead of pushing data through MCT. When cpl6 utilizes the full parallel capability of the 4/16 configuration (apples and apples), it clearly outperforms cpl5. cpl6 scales across multiple pes. cpl6 will be able to run on more pes than cpl5, will allow larger configurations of CCSM, and will improve communication performance.
29 Summary: cpl6 is a distributed-memory application (no threading implemented currently) and does M-to-N communication. Performance: generally faster and better scaling than cpl5; communication significantly faster than cpl5, eliminating a potentially important bottleneck; mapping in cpl6 (not in MCT) requires further optimization. cpl6 is more flexible, usable, and extensible. Scientific validation is nearly complete; a release is expected soon.
More information