NCAR Computation and Information Systems Laboratory (CISL) Facilities and Support Overview


NCAR Computation and Information Systems Laboratory (CISL) Facilities and Support Overview
NCAR ASP 2008 Summer Colloquium on Numerical Techniques for Global Atmospheric Models
June 2, 2008
Mike Page - NCAR/CISL/HSS/CSG Consulting Services Group

CISL's Mission for User Support
"CISL will provide a balanced set of services to enable researchers to securely, easily, and effectively utilize community resources." - CISL Strategic Plan (2005-2009)
CISL also supports special colloquia, workshops, and computational campaigns, giving these groups of users special privileges and access to facilities and services above and beyond normal service levels.

CISL Computing Systems
At NCAR/CISL you'll find world-class facilities supporting leading-edge science through high-performance computing. Navigating and using these facilities requires a basic familiarity with several functional aspects of the facility:
- Computing systems
- Allocations
- Usage (batch and interactive)
- Security
- Data archival (MSS)
- User support

CISL Computing Systems: Bluevista
- IBM eServer p575, #98 on the Top 500 list (Nov. 2005)
- 624 POWER5 processors with 1.9-GHz clock, DCMs
- Four floating-point operations per cycle; 4.74 TFLOPS peak processing
- SMT technology
- 72 8-way batch nodes, 16 GB shared memory on each node
- AIX operating system
- IBM XL Compiler Suite; TotalView debugger
- LSF batch system
- 3 GB $HOME quota; 240 GB /ptmp quota

Allocations
Allocations are granted in General Accounting Units (GAUs). Each of the colloquium modeling groups has a project number for which 6000 GAUs are available.
Monitor GAU usage through the CISL portal: https://portal.scd.ucar.edu:8443/scd-portal (requires UCAS password). Charges are assessed overnight and are available for review for runs that complete by midnight.
GAUs charged = wallclock hours used * number of nodes used * number of processors per node * computer factor * queue charging factor
The computer factor for bluevista is 0.87. The queue charging factor for the dycore queue is 1.0.
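To make the charging formula concrete, here is a small worked example; the two-hour wallclock time and three-node job size are hypothetical, while the 0.87 and 1.0 factors are the bluevista and dycore values quoted above:

  GAUs charged = 2 hours * 3 nodes * 8 processors/node * 0.87 * 1.0
               = 41.76 GAUs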

Batch and Interactive Usage
Batch usage:
- LSF (Load Sharing Facility) with a fair-share scheduler
- dycore queue for batch runs
- share queue for analysis runs
- Interactive batch is seldom used
Interactive use is through Unix shell commands.

LSF Batch Submission
Job submission:
- bsub < script -- submits the file script to LSF
Monitor jobs:
- bjobs -- shows jobs you have running and pending in the system
- bjobs -u all
- bjobs -q dycore -u all
- bhist -n 3 -a -- shows jobs submitted and completed over the last few days
System batch load:
- batchview -- shows all jobs for all users
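As a quick illustration, a typical submit-and-monitor session might look like the following; the script name mpilsf.lsf is a placeholder for whatever file holds your #BSUB directives:

  bsub < mpilsf.lsf          # submit the batch script to LSF
  bjobs                      # check your own pending and running jobs
  bjobs -q dycore -u all     # see everyone's jobs in the dycore queue
  bhist -n 3 -a              # review recently submitted and completed jobs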

LSF Example
#!/bin/ksh
#
# LSF batch script to run an MPI application
#
#BSUB -n 24                     # number of MPI tasks
#BSUB -R "span[ptile=8]"        # run 8 tasks per node (non-SMT)
##BSUB -R "span[ptile=16]"      # run 16 tasks per node (SMT)
#BSUB -P xxxxxxxx               # Project xxxxxxxx
#BSUB -J mpilsf.test            # job name
#BSUB -o mpilsf.%J.out          # output filename
#BSUB -e mpilsf.%J.err          # error filename
#BSUB -W 0:10                   # 10 minutes wall clock time
#BSUB -q dycore                 # queue

# Fortran example
mpxlf_r -o mpi_samp_f mpi.f
mpirun.lsf ./mpi_samp_f

More examples in /usr/local/examples/lsf/batch

Useful Utilities
Change mpirun.lsf ./mpi_samp_f to timex mpirun.lsf ./mpi_samp_f for information on execution time, or to:
export MP_LABELIO=yes
mpirun.lsf /contrib/bin/job_memusage.exe ./mpi_samp_f
for information on memory usage. (This will help you decide whether you can use SMT or not.)
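For reference, here is how those two variants might appear in place of the last line of the batch script on the previous slide; this simply arranges the commands above as script lines, with the memory-usage variant commented out:

  # Time the run:
  timex mpirun.lsf ./mpi_samp_f

  # Or report per-task memory usage (helps decide whether SMT is viable):
  # export MP_LABELIO=yes
  # mpirun.lsf /contrib/bin/job_memusage.exe ./mpi_samp_f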

Security
CISL firewall: enter through roy.ucar.edu
- ssh only; telnet is not allowed
- ssh, ssh -X, ssh -Y
Cryptocard:
- You must have one to access bluevista
- Usage
- Resynchronization
Complete information is on the CISL web pages: http://www.cisl.ucar.edu
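A minimal login sketch under these rules: "username" is a placeholder for your UCAS login, and the interior hostname bluevista.ucar.edu is an assumption; expect a cryptocard challenge when connecting to bluevista.

  ssh -X username@roy.ucar.edu     # pass through the CISL firewall host
  ssh -X bluevista.ucar.edu        # then hop to bluevista (cryptocard required)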

NCAR Mass Store Subsystem (MSS)
- Currently stores 5 petabytes of data
- For comparison, the Library of Congress printed collection is about 10 terabytes = 0.01 petabytes, so the Mass Store holds roughly 500 Libraries of Congress
- Growing by 2-6 terabytes of new data per day
- Data holdings are increasing exponentially: 1986 - 2 TB, 1997 - 100 TB, 2002 - 1,000 TB, 2004 - 2,000 TB, 2008 - 5,000 TB

Charges for MSS Usage
The current MSS charging formula is:
GAUs charged = 0.0837*R + 0.0012*A + N*(0.1195*W + 0.205*S)
where:
R = gigabytes read
W = gigabytes created or written
A = number of disk drive or tape cartridge accesses
S = data stored, in gigabyte-years
N = number of copies of the file: 1 if economy reliability is selected, 2 if standard reliability is selected
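A worked example with hypothetical usage numbers: suppose you write a 10 GB file with standard reliability (N = 2), it incurs 5 tape accesses, you keep it for one year (S = 10 GB-years), and you read it back once (R = 10 GB). Then:

  GAUs charged = 0.0837*10 + 0.0012*5 + 2*(0.1195*10 + 0.205*10)
               = 0.837 + 0.006 + 2*3.245
               = 7.33 GAUs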

NCAR Mass Store Subsystem: Keys to Efficient Usage
- Be selective in the data you retain on the MSS
- Avoid repetitive reads/writes of the same file
- Choose class of service and retention periods according to the value of the data
- Recommended file sizes: transfer a few large files rather than a large number of small files; the maximum file size is 12 GB
- Use tar to collect small files for a single transfer, as sketched below
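A minimal sketch of bundling small files into one transfer; the file names and the path components after the required mss:/pel/asp2008/ prefix (see the File Naming Convention slide below) are hypothetical:

  # Bundle many small output files into one archive
  tar cf run01_history.tar hist*.nc
  # Write the single tar file to the Mass Store with msrcp
  msrcp run01_history.tar mss:/pel/asp2008/model/test_case/run01_history.tar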

Mass Store Usage
Will NCAR maintain my data indefinitely?
- If you don't retain your account: 1 year
- If you retain your account: ongoing
Using the Mass Store is expensive (GAUs), so consider offloading data:
- Create your own transportable media, e.g. DVD (except in extreme cases)
- Use scp/sftp for data accessible from an SCD supercomputer, including divisional filesystems
- Use the MSS ftp server: http://www.scd.ucar.edu/docs/mss/ftp.html
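For example, offloading a file from a CISL machine to your home institution with scp might look like the following; the remote host and both paths are placeholders:

  scp /ptmp/username/run01_history.tar user@remote.host.edu:/data/asp2008/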

File Purge Policies
If:
- you are no longer active on any project
- your project closes
- you are no longer employed by UCAR/NCAR
- you are doing periodic maintenance of your MSS files
you can:
- delete your MSS files yourself
- request that CISL delete your files
- change the project number to an active account
- transfer ownership of the files to another user
- transfer the data between the MSS and other media (1 terabyte > 300 DVDs)
- transfer the data to another network location

Mass Store Access: Command Line and Batch Scripts
http://www.cisl.ucar.edu/main/mss.html

msrcp [-a[sync] -cl[ass] cos -n[oreplace] -pe[riod] n \
      -pr[oject] proj_num -rpwd rpass -wpwd wpass -R -V[ersion]] \
      source_file [source_file ...] target

msls [-project proj] [-class cos] [-full] \
     [-CFPRSTVacdflpqrtuxz1] [path]

msmv [-project proj] [-f] [-period ret_period] (1)password options \
     file1 file2
msmv [-project proj] [-f] [-period ret_period] (1)password options \
     directory1 directory2
msmv [-project proj] [-f] [-period ret_period] (1)password options \
     path [path] ... directory

where (1) password options are:
     [-rpwd read_password] [-wpwd write_password] \
     [-newr read_password] [-neww write_password]

File Naming Convention
Any file read from or written to the Mass Store needs to have the prefix mss:/pel/asp2008/, e.g.:
msrcp bluevista_filename \
      mss:/pel/asp2008/model/test_case/horizontal_resolution/mss_filename

Good Practices: Mixing LSF and MSS Usage
Multistep applications (see /usr/local/examples/lsf/multistep) allow asynchronous reads and writes:
- Step 1 - read data from the MSS (share queue)
- Step 2 - run the model (dycore queue)
- Step 3 - write data to the MSS (share queue)
This saves GAUs by reducing the processor count charged for the transfer steps. Pre- and/or post-processing can follow the same outline; a sketch appears below.
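The authoritative examples live in /usr/local/examples/lsf/multistep; the sketch below only illustrates the three-step idea using standard LSF job dependencies (bsub -w), with hypothetical script, job, and file names, and with the usual #!/bin/ksh, -P, -W, and -o/-e lines omitted for brevity:

  # step1_read.lsf  -- share queue, 1 task: stage input from the MSS
  #BSUB -q share
  #BSUB -n 1
  #BSUB -J asp_read
  msrcp mss:/pel/asp2008/model/test_case/input_data input_data

  # step2_run.lsf   -- dycore queue, 24 tasks: run the model
  #BSUB -q dycore
  #BSUB -n 24
  #BSUB -J asp_run
  #BSUB -w "done(asp_read)"       # wait for the read step to finish
  mpirun.lsf ./mpi_samp_f

  # step3_write.lsf -- share queue, 1 task: archive output to the MSS
  #BSUB -q share
  #BSUB -n 1
  #BSUB -J asp_write
  #BSUB -w "done(asp_run)"        # wait for the model run to finish
  msrcp output_data mss:/pel/asp2008/model/test_case/output_data

  # Submit all three; LSF releases each step as its dependency completes:
  bsub < step1_read.lsf; bsub < step2_run.lsf; bsub < step3_write.lsf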

User Support
- ASP wiki: https://www.wiki.ucar.edu/display/dycores/home
- CISL homepage: http://www.cisl.ucar.edu/ (High End Computing, Mass Storage System, Data Support Section, VisLab, Community Data Portal, ESMF)
- ExtraView home page: https://cislcustomersupport.ucar.edu/evj/extraview
- ASP liaison: Mike Page, 303-944-8291, 303-497-2464, mpage@ucar.edu

Questions?