User Services. Chuck Niggley, Computer Sciences Corp., NASA Ames Research Center. CUG Summit 2002, May 21, 2002, Manchester, England

User Services
Chuck Niggley
Computer Sciences Corp.
NASA Ames Research Center
CUG Summit 2002, May 21, 2002

NAS User Services
History of NAS
NASA Advanced Supercomputing
Numerical Aerospace Simulations
Numerical Aerodynamics Simulations

NAS User Services
Mission
Develop, demonstrate, and deliver innovative, distributed, heterogeneous computing capabilities that facilitate the success of NASA projects and missions.
History
The NAS division was founded in 1984 to provide high-performance computing capabilities to all NASA centers and their collaborators, so that U.S. aeronautics development could improve. The division's staff includes about 80 civil service employees and more than 140 contractor staff.

NAS User Services
Supports users at other NASA centers: LRC, GRC, GSFC, MSFC, JSC, JPL, DAO
Some of our collaborators include: Argonne National Laboratory, DOD MSRC, US Army Corps of Engineers, NSF, NPACI, NCSA, LLNL, Univ. of Glasgow, Mass. Institute of Tech., Stanford Univ., Univ. of Tennessee.
Some of the resources available to our user community include:
Chapman - A 1024-processor Origin 3000, shared memory single system image
Lomax - A 512-processor Origin 3000, shared memory single system image
Steger - A 128-processor Origin 2800, shared memory single system image
Hopper - A 64-processor Origin 2000, shared memory single system image
Bright - A Cray SV1ex with 32 500-megahertz vector processors
Lou - A 16-processor Origin 2000 with 3 gigabytes of main memory and 3.3 terabytes of on-line disk storage
Bj, kalnay, dixon - 328-CPU Origin 2000 for DAO work

NAS Computer Facility

High-End Computing

Network Engineering

User Services
Primary mission: Support Users
Help Desk
24x7 User Support
Service Requests
Manage user accounts
Reset/propagate passwords
Change file permissions
Archive/restore user data
How do I? Where is? Why can/can't I? Why won't my job run?

User Services
Primary mission: Monitor Systems
Help Desk Tools
Remedy: 12,802 tickets in 2001
Dlog (Operations Log): 4,211 systems events
Scheduling Schema
Paging Schema
Control Schema

User Services
Primary mission: Monitor Systems
LAMS
MOTD
Logit
NAS System Status
System Tools
Supportfolio

User Services
Monitoring
Check uptime and load
Check file systems
Check critical processes
Job status
Problem Support
Create REMEDY ticket
Contact on-call engineer
DLOG incident (systems metrics)
Logit (notify affected users)
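The first two monitoring checks (uptime/load and file systems) can be illustrated with a minimal Python sketch. This is not the NAS monitoring tooling, which the slides do not describe; the function names, the per-CPU load limit, and the 90% fullness limit are assumptions chosen for illustration.

```python
import os
import shutil


def load_is_high(load_1min, cpu_count, per_cpu_limit=1.5):
    """Return True when the 1-minute load per CPU exceeds the assumed limit."""
    return load_1min / cpu_count > per_cpu_limit


def filesystem_percent_used(path):
    """Return the percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:  # guard against pseudo file systems reporting no size
        return 0.0
    return 100.0 * usage.used / usage.total


def check_host(paths, full_limit=90.0):
    """Collect human-readable warnings for load and file-system fullness."""
    warnings = []
    load_1min = os.getloadavg()[0]  # available on Unix-like systems only
    cpus = os.cpu_count() or 1
    if load_is_high(load_1min, cpus):
        warnings.append(f"high load: {load_1min:.2f} on {cpus} CPUs")
    for path in paths:
        pct = filesystem_percent_used(path)
        if pct > full_limit:
            warnings.append(f"{path} is {pct:.1f}% full")
    return warnings
```

Normalizing the load by CPU count matters on large single-system-image machines: a load of 512 on a 1024-processor system is modest, while the same figure on a 64-processor system would warrant a ticket.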

Systems Queue Status
http://www.nas.nasa.gov/cgi-bin/nas/status

System's Node Map

User Services
Crash Procedures
Restart machine if possible
Create REMEDY ticket
Contact on-call System Administrator
Contact hardware engineer
Dlog incident (systems metrics)
Logit notifies affected users
Backup
Performed daily
Bacpac tool used
Xfsdump to mass storage
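The backup step names xfsdump but does not describe how the Bacpac tool invokes it. As a hedged sketch, the Python below only builds an xfsdump command line using the standard options -l (dump level), -L (session label), -M (media label), and -f (destination). The Sunday full-dump schedule, the label format, and the example paths are assumptions, not the site's actual procedure.

```python
import datetime


def dump_level(day):
    """Assumed schedule: full (level 0) dump on Sunday, level 1 otherwise."""
    return 0 if day.weekday() == 6 else 1


def xfsdump_command(filesystem, destination, day):
    """Build (but do not run) an xfsdump command line for one file system."""
    level = dump_level(day)
    # Hypothetical label format: <filesystem>-<date>-L<level>
    name = filesystem.strip("/").replace("/", "_")
    label = f"{name}-{day.isoformat()}-L{level}"
    return [
        "xfsdump",
        "-l", str(level),   # dump level
        "-L", label,        # session label
        "-M", label,        # media label
        "-f", destination,  # where the dump is written
        filesystem,
    ]
```

Returning the argument list rather than executing it keeps the sketch safe to run anywhere; a wrapper could pass the list to `subprocess.run` on a host that has xfsdump installed.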

DMF Systems
Lou
SGI 2400, 16p/8GB, 2.5 TB RAID
500 TB media (dual copies)
/staff (140 GB), /u1 (540), /u2 (540), /bkups (400), /hsp (260), /chuck (400), /scott (400)
20 M files & directories, ~1600 users
Helios1
SGI 2400, 8p/2GB, 1.8 TB RAID
275 TB media (dual copies)
/silo4 (800 GB), /silo5 (540), /bkups (270), /chuck (115), /scott (115)
3 M files & directories, ~160 users
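As a quick consistency check, the listed file system sizes can be summed and compared with the stated RAID capacities. The sketch assumes every size is in GB (only the first entry on each system carries a unit) and that 1 TB = 1000 GB.

```python
# File system sizes in GB, as listed on the slide (unit assumed for all entries).
LOU_GB = {"/staff": 140, "/u1": 540, "/u2": 540, "/bkups": 400,
          "/hsp": 260, "/chuck": 400, "/scott": 400}
HELIOS1_GB = {"/silo4": 800, "/silo5": 540, "/bkups": 270,
              "/chuck": 115, "/scott": 115}


def total_tb(filesystems_gb):
    """Sum file system sizes and convert GB to TB (assuming 1 TB = 1000 GB)."""
    return sum(filesystems_gb.values()) / 1000.0


lou_total = total_tb(LOU_GB)          # 2.68
helios1_total = total_tb(HELIOS1_GB)  # 1.84
```

Under these assumptions Helios1's file systems total 1.84 TB, in line with the stated 1.8 TB of RAID. Lou's total 2.68 TB, somewhat above the stated 2.5 TB; the slide does not explain the difference.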

DMF Systems
Lou Transports
10 STK 9840 (5 local, 5 remote), files 1 MB-100 MB
10 STK 9940 (5 local, 5 remote), files >100 MB
12 STK Redwood (6 local, 6 remote)
13 STK 9490 (12 local, 1 remote)
30-Day Average
Transport writes: 737.9 GB/day
Transport reads: 79.5 GB/day
Files written: 5531/day
Files read: 324/day
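The slide pairs the 9840 transports with files of 1 MB to 100 MB and the 9940 transports with files over 100 MB. A minimal sketch of that size-based selection follows. It is not DMF configuration; the decimal megabyte, the treatment of exactly 100 MB, and returning None for files under 1 MB (which the slide does not cover) are assumptions.

```python
MB = 1_000_000  # assumed decimal megabyte


def drive_class(size_bytes):
    """Pick a transport class by file size, following the ranges on the slide."""
    if size_bytes < 1 * MB:
        return None  # the slide does not say where files under 1 MB go
    if size_bytes <= 100 * MB:
        return "STK 9840"
    return "STK 9940"
```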

DMF Systems
Helios1 Transports
10 STK 9840 (5 local, 5 remote), files 1 MB-100 MB
10 STK 9940 (5 local, 5 remote), files >100 MB
30-Day Average
Transport writes: 691.2 GB/day
Transport reads: 100.5 GB/day
Files written: 6234/day
Files read: 900/day
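The 30-day averages for Lou and Helios1 imply an average size per file moved to or from tape. The sketch below derives it, assuming 1 GB = 1000 MB and that the transport volume and file counts cover the same set of transfers (the slides do not say whether dual copies are counted).

```python
# 30-day averages from the slides: (GB per day, files per day)
AVERAGES = {
    "Lou": {"writes": (737.9, 5531), "reads": (79.5, 324)},
    "Helios1": {"writes": (691.2, 6234), "reads": (100.5, 900)},
}


def average_file_mb(gb_per_day, files_per_day):
    """Average size in MB of one file, assuming 1 GB = 1000 MB."""
    return gb_per_day * 1000.0 / files_per_day


def summarize(averages):
    """Return {system: {operation: average MB per file}}."""
    return {
        system: {op: average_file_mb(gb, files) for op, (gb, files) in ops.items()}
        for system, ops in averages.items()
    }


summary = summarize(AVERAGES)
```

Under these assumptions the averages come to roughly 133 MB per file written and 245 MB per file read on Lou, and roughly 111 MB written and 112 MB read on Helios1.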

File & Data Profile
Lou
Files smaller than 30 MB make up 97% of the file count and 10% of the data.
Files smaller than 1 MB make up 86% of the file count and total 1.2 TB.
Helios1
Files smaller than 50 MB make up 90% of the file count and 10% of the data.
Files smaller than 1 MB make up 53% of the file count and total 186 GB.
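A profile of this kind reports, for a size threshold, the share of files below it and the share of data those files hold. The slides do not say how the figures were produced; the following is a minimal sketch of the calculation, with a hypothetical function name and made-up sample sizes.

```python
def profile_below(sizes, threshold):
    """Return (share of files, share of bytes) for files smaller than threshold."""
    if not sizes:
        return 0.0, 0.0
    small = [s for s in sizes if s < threshold]
    total_bytes = sum(sizes)
    file_share = len(small) / len(sizes)
    data_share = sum(small) / total_bytes if total_bytes else 0.0
    return file_share, data_share


# Made-up sizes in MB: four small files and one large one.
sample_sizes = [1, 2, 3, 4, 90]
file_share, data_share = profile_below(sample_sizes, 30)
```

In the sample, 80% of the files fall below the threshold yet hold only 10% of the data, the same skew the slide reports: most files are small, while most of the bytes sit in a few large files.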

Tape Silo

Further Contact
http://www.nas.nasa.gov
support@nas.nasa.gov
1-800-331-USER (8737)
650-604-4444

Special Thanks
Chris Hamilton, John Pandori, Francois Montoya, Ryan Coburn, Gabe Wedekind