Storage Challenges at the San Diego Supercomputer Center


Storage Challenges at the San Diego Supercomputer Center
Richard Marciano
P.O. Box 85608, San Diego, CA 92186
Ph: (619) 534-8345 Fax: (619) 822-0906 E-mail: marciano@sdsc.edu
Presented at the THIC meeting at the Amberley Suite Hotel in Albuquerque, NM on April 21, 1998

Data Intensive & Mass Storage Systems Group Phil Andrews Systems Storage Group Joe Lopez Mike Gleicher Data Intensive Group Reagan Moore Chaitan Baru Amarnath Gupta Richard Marciano Arcot Rajasekar Wayne Schroeder Michael Wan

NPACI Program
WHO: 37 partners in 18 states led by UCSD (resource, research & education, associate, and industrial partners)
WHAT:
1. Create an HPC infrastructure with teraflops performance, petabyte archives, and discipline-specific caches
2. Increase focus on data-intensive computing to analyze terabyte data sets
3. Education, outreach, and training
4. Promote industrial partnerships

Partner Sites UCSD/SDSC Caltech U Texas UC Berkeley U Michigan UC Davis UCLA UC Santa Barbara U Houston/Keck Center U Maryland Washington U CRPC CSU/SDSU UC Santa Cruz Oregon State Scripps U Virginia U Wisconsin Montana State USC U Tennessee U Kansas LTER / U New Mexico Rutgers Salk Institute Stanford CARB EPSCoR Foundation JPL Kitt Peak Nat. Obs. Los Alamos NL LBNL / NERSC Lawrence Livermore NL PNNL / EMSL UC Irvine UC San Francisco U Massachusetts U Pennsylvania

Planned NPACI Data Caches (FY 98)

Partner Site      Cache Size (GB)  Hardware (OS)          Network
UCSD/SDSC         2000             IBM SP, Wildcat (AIX)  vBNS, CalREN-2, ESnet, AAInet
Caltech           500              IBM SP (AIX)           vBNS
UC Berkeley       50               Sun (Solaris)          CalREN-2
U Michigan        200              IBM SP (AIX)           vBNS
U Texas           100              IBM SP (AIX)           vBNS
UC Davis          250              Origin 2000 (SGI)      CalREN-2
UC Los Angeles    200              SGI (SGI)              CalREN-2
UC Santa Barbara  100              Digital (OSF)          CalREN-2
Washington U      400              Sun (SunOS)            vBNS
U Maryland        200              IBM SP (AIX)           vBNS
U Houston         200              IBM SP (AIX)           vBNS

Data Storage Vision Dramatically improve the ability of science researchers to build upon prior knowledge by: establishing domain specific data repositories enabling publication of scientific data sets and associated methods providing information discovery and data handling APIs Support integration of digital library, metacomputing, scalable tools, interaction environments

Data Cache Testbed Projects
Molecular Science: U Houston (molecular dynamics trajectory DB), U Houston Keck Center for Computational Biology (enhanced molecule image DB)
Neuroscience: Sharing brain images between data repositories at Washington U, UCLA, UCSD/SDSC
Earth System Science: UCSD (climate DB), UCLA (earth sys. DB), UC Davis (California natural resources), U Maryland (satellite land cover data repository)
Astronomy: Integration of multiple digital sky surveys to support statistical analysis on astronomical objects, Caltech (continued on next slide)

Astronomy (continued)
Federate multi-frequency surveys
2 million panels for the whole sky
15 TB integrated catalog and image DB
Computationally intensive analysis: what is the morphology of galaxies
Digital Palomar Observatory Sky Survey
optical: 2 billion source catalog / 3 TB
2-Micron All Sky Survey
infrared: 100 million source catalog / 10 TB
Sloan Digital Sky Survey
radio: NVSS (2 million sources) + FIRST (1 million)

Software Architectures Computational Grid Distributed execution of applications across heterogeneous compute platforms linked by the Web InterLib Uniform information discovery and data publication environment integrating access to heterogeneous data resources on the Web

Storage Environment
HPSS
2 IBM 3494 Magstar Tape Libraries which hold 2,000 tapes each, with 14 3590 tape drives
IBM Scalable POWERparallel System (SP2), 23 nodes
90 HPSS servers (32 movers), 13 classes of service based on 23 storage classes, automatic CoS selection
Current Storage 50 TB, 65 TB with Cornell & Pittsburgh
STK 11 TB
IBM RS6000/590
IBM SSA disk subsystem, 1 TB
260 GB Maximum Strategy disk

SDSC HPSS Production System
4 million files at present
50 TB of data
Movers average up to 4,000 to 5,000 requests
Peaks at 10,000
20 concurrent users on average
700 tape mounts / day on average
Transfer rates: 8 MB / s over networks, 1 TB / day in & out of HPSS
2 TB / month growth, 25 TB burst from PSC & CTC

CTC HPSS System
13 TB from CTC on RS/6000 (2 x 2,500 tapes)
3 GB of HPSS metadata received over the vBNS
bandwidth of > 120 MB / s
1200 3590 tapes & 1200 3480 tapes
2 manual 3480 tape drives & access to 4 3590 tape drives in STK Library (retrofit)
Migrating CTC user data into SDSC production system as a background job within the year
CTC users access data in read-only mode

PSC Data Migration
All data over vBNS (OC-3 link at 155 Mb / s), 10 MB / s
Transfers on a voluntary basis
Transfers directly into HPSS tape
Fully automated process
About 12 TB to transfer total (5 more likely)
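The figures on the PSC migration slide imply a transfer time that is easy to work out. The sketch below is only an illustrative calculation: it assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and a constant 10 MB/s sustained rate with no pauses, neither of which the slide states. The function and variable names are made up for this example.

```python
# Decimal units are an assumption: 1 MB = 10**6 bytes, 1 TB = 10**12 bytes.
MB = 10**6
TB = 10**12
SECONDS_PER_DAY = 86_400


def transfer_days(total_bytes, rate_bytes_per_s):
    """Days needed to move total_bytes at a constant sustained rate."""
    return total_bytes / rate_bytes_per_s / SECONDS_PER_DAY


# OC-3 line rate from the slide: 155 Mb/s, converted to bytes per second.
oc3_line_rate = 155e6 / 8
# Sustained rate quoted on the slide.
sustained_rate = 10 * MB
# Fraction of the raw line rate that the quoted rate represents.
link_utilization = sustained_rate / oc3_line_rate

# The slide gives 12 TB as the total and 5 TB as the more likely amount.
days_for_12_tb = transfer_days(12 * TB, sustained_rate)
days_for_5_tb = transfer_days(5 * TB, sustained_rate)

print(f"link utilization: {link_utilization:.0%}")
print(f"12 TB at 10 MB/s: {days_for_12_tb:.1f} days")
print(f" 5 TB at 10 MB/s: {days_for_5_tb:.1f} days")
```

Under these assumptions the full 12 TB takes roughly two weeks of continuous transfer, which is consistent with running the migration as an automated background process.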

HPSS User Interfaces
FTP/PFTP
Passwordless access
Automatic CoS selection
HSI (DCE and non-DCE)
DB2 Universal Database
Storage Resource Broker (SRB)
http://www.npaci.edu/dice

Software Architecture of the SRB
[Diagram labels: SRB Client, Privileged APIs, Object SID, SRB, MCAT, UniTree, HPSS, DB2, Oracle, Unix]

The Storage Resource Broker is Middleware
Application (SRB client)
SRB Server, MCAT
DB2, Oracle, Illustra, ObjectStore; HPSS, UniTree; UNIX, ftp
Distributed Storage Resources (database systems, archival storage systems, file systems, ftp)

The SRB Process Model
[Diagram labels: Application, SRB Master (host, port), SRB agents (port), MCAT]

Federated SRB Operation
[Diagram: an application, two SRB servers each with an SRB agent, and the MCAT, with interactions numbered 1 through 6]

SRB Space
[Diagram: SRB servers interconnecting data repositories, digital libraries, meta catalogs, and client processes]
Legend: DR - Data Repository; DL - Digital Library; MC - Meta Catalog; CP - Comp Process / SRB Client

SRB V1.0 Features
Multi-platform (clients and servers): SunOS/Solaris, AIX, Cray C90, DEC OSF
API and command line interfaces
Low-level and high-level APIs
Storage systems supported: DB2, Illustra, UniTree, HPSS, UNIX
Support for federated servers
Released early September, 1997

SRB V1.1 Features
In beta in DOCT. Released in January, 1998
Ported to additional platforms: SGI, Cray T3E
Incorporates the SDSC Encryption and Authentication (SEA) Library
Ticket-based access control
Graphical user interface: SRBTool
Additional storage systems supported: Oracle, ObjectStore, ftp, http
Oracle-based MCAT
Support for proxy operations, e.g. move, copy, replicate
Data replication using Logical Storage Resource

MCAT: Metadata Catalog Stores metadata about Users, Data sets, Resources, Methods Provides collection abstraction Stores detailed access control information Maintains audit trail information on data sets Implemented as a relational database with referential integrity constraints (currently uses DB2, ported to Oracle)
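To make the idea of a relational catalog with referential integrity concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the MCAT schema: every table and column name below is hypothetical, the "Methods" category from the slide is omitted for brevity, and SQLite stands in for DB2 or Oracle. It only illustrates how foreign-key constraints keep data set entries tied to known users, collections, and resources.

```python
import sqlite3

# In-memory database standing in for the catalog; SQLite is an assumption,
# the slide names DB2 and Oracle as the actual back ends.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: users, resources, collections, data sets,
# access control entries, and an audit trail.
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE resources (resource_id INTEGER PRIMARY KEY,
                        name TEXT NOT NULL UNIQUE, kind TEXT NOT NULL);
CREATE TABLE collections (collection_id INTEGER PRIMARY KEY,
                          name TEXT NOT NULL UNIQUE);
CREATE TABLE datasets (
    dataset_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    collection_id INTEGER NOT NULL REFERENCES collections(collection_id),
    resource_id INTEGER NOT NULL REFERENCES resources(resource_id),
    owner_id INTEGER NOT NULL REFERENCES users(user_id)
);
CREATE TABLE access_control (
    dataset_id INTEGER NOT NULL REFERENCES datasets(dataset_id),
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    permission TEXT NOT NULL CHECK (permission IN ('read', 'write')),
    PRIMARY KEY (dataset_id, user_id)
);
CREATE TABLE audit_trail (
    event_id INTEGER PRIMARY KEY,
    dataset_id INTEGER NOT NULL REFERENCES datasets(dataset_id),
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    action TEXT NOT NULL
);
""")

# Register one user, one resource, one collection, and one data set.
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO resources VALUES (1, 'hpss-sdsc', 'archive')")
conn.execute("INSERT INTO collections VALUES (1, 'clouds')")
conn.execute("INSERT INTO datasets VALUES (1, 'cell-0001', 1, 1, 1)")
conn.execute("INSERT INTO access_control VALUES (1, 1, 'read')")
conn.execute("INSERT INTO audit_trail VALUES (1, 1, 1, 'create')")


def orphan_insert_rejected(connection):
    """Try to register a data set in a collection that does not exist."""
    try:
        connection.execute(
            "INSERT INTO datasets VALUES (2, 'orphan', 99, 1, 1)")
    except sqlite3.IntegrityError:
        return True
    return False


rejected = orphan_insert_rejected(conn)
dataset_count = conn.execute("SELECT COUNT(*) FROM datasets").fetchone()[0]
print("orphan rejected:", rejected, "| data sets registered:", dataset_count)
```

The collection abstraction here is just a foreign key from data sets to a collections table; a real catalog would likely need hierarchy and richer attributes.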

SRB API Programmatic API High-level API Low-level API SRB Manager API Command Level Interface - Scommands Graphical User Interface - SRBTool Web Utilities

SRB API Interface
[Diagram labels: Application, MCAT, SRB Master]

High & Low-level API
Low-level API
  talks to resource drivers
  no registration of data sets in MCAT
  no authentication through MCAT
  User provides all information
High-level API
  Uses low-level API to access resources
  Registers data management information in MCAT
  Uses MCAT for authentication and meta information
  Uses MCAT for resource and data discovery
  Access/store data in remote SRB
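The layering described on this slide can be sketched in a few lines of Python. This is not the SRB API: the class and method names are hypothetical, plain dictionaries stand in for the resource drivers, and a small in-memory object stands in for the MCAT. The sketch only shows the division of labor, with the low-level layer trusting the caller for everything and the high-level layer adding authentication, registration, and discovery through the catalog.

```python
class LowLevelAPI:
    """Talks directly to resource drivers; the caller supplies everything."""

    def __init__(self):
        # resource name -> dict standing in for a storage driver
        self.drivers = {}

    def add_resource(self, name):
        self.drivers[name] = {}

    def write(self, resource, path, data):
        self.drivers[resource][path] = data

    def read(self, resource, path):
        return self.drivers[resource][path]


class Catalog:
    """In-memory stand-in for the metadata catalog (hypothetical)."""

    def __init__(self):
        self.users = set()
        self.datasets = {}  # logical name -> (resource, physical path)


class HighLevelAPI:
    """Uses the low-level API for I/O and the catalog for bookkeeping."""

    def __init__(self, low, catalog):
        self.low = low
        self.catalog = catalog

    def _authenticate(self, user):
        if user not in self.catalog.users:
            raise PermissionError(f"unknown user: {user}")

    def put(self, user, logical_name, data, resource):
        self._authenticate(user)
        path = f"/srb/{logical_name}"
        self.low.write(resource, path, data)
        # Register where the data set lives so it can be discovered later.
        self.catalog.datasets[logical_name] = (resource, path)

    def get(self, user, logical_name):
        self._authenticate(user)
        # Data discovery: the caller names the data set, not its location.
        resource, path = self.catalog.datasets[logical_name]
        return self.low.read(resource, path)


low = LowLevelAPI()
low.add_resource("unix-disk")
catalog = Catalog()
catalog.users.add("alice")
high = HighLevelAPI(low, catalog)

high.put("alice", "climate.dat", b"1989-1992", resource="unix-disk")
fetched = high.get("alice", "climate.dat")

# The low-level path needs the resource and physical path spelled out
# and leaves no trace in the catalog.
low.write("unix-disk", "/tmp/scratch", b"raw")
print(fetched, sorted(catalog.datasets))
```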

SRB Graphical Interface
The SrbTool is a prototype. It is a proof-of-concept demonstration for the Uniform Visualization Architecture. It incorporates designs from the Interaction Infrastructure and the Data Arbitration System.
URL: http://www.sdsc.edu/~moreland/projects/srbtool/doc/srbtool.html
[Diagram labels: USER, GUI, SRB Agent, MCAT, Proxy operation]
Invoke SRB operations
Obtain user's metadata information via SRB
Display collections and resources accessible by user

SRB Command Line Interface USER Environment File SRB shell commands: Sls, Scp, Scat, Sput, Sget,... SRB Agent MCAT Proxy operation

SRB Data Replication Support
Replication via Resource Set definition
Replication support integrated into write function
srbobjreplicate API can be used for post facto replication
Synchronous replication across all sites. Can choose any k out of n
Can choose specific replica on read operation
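One way to read "any k out of n" is that a write against a resource set succeeds once k of its n member resources hold a copy. The sketch below follows that reading, which is an assumption, and uses hypothetical names throughout (Resource, replicated_write, read_replica); it does not reproduce SRB behavior. It also does not roll back partial copies when fewer than k writes succeed.

```python
class Resource:
    """Stand-in for one storage resource in a resource set."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.objects = {}

    def write(self, key, data):
        if not self.available:
            raise OSError(f"{self.name} is unavailable")
        self.objects[key] = data


def replicated_write(resource_set, key, data, k):
    """Write synchronously until k members of the set hold a copy."""
    written = []
    for resource in resource_set:
        if len(written) == k:
            break
        try:
            resource.write(key, data)
            written.append(resource.name)
        except OSError:
            continue  # try the next member of the set
    if len(written) < k:
        raise RuntimeError(f"only {len(written)} of {k} replicas written")
    return written


def read_replica(resource_set, key, replica=None):
    """Read from a named replica, or from the first one that has the key."""
    for resource in resource_set:
        if key in resource.objects and replica in (None, resource.name):
            return resource.objects[key]
    raise KeyError(key)


# A logical resource with n = 3 members, one of them offline.
logical_resource = [
    Resource("hpss"),
    Resource("unix", available=False),
    Resource("oracle"),
]

placed = replicated_write(logical_resource, "patent-001", b"sgml", k=2)
from_oracle = read_replica(logical_resource, "patent-001", replica="oracle")
print(placed, from_oracle)
```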

Data Replication (DOCT)
[Diagram: an application accessing SRB servers and the MCAT (labels: SAIC, NWS, NCSA, SDSC, Caltech); logical resources LogRsrc1 and LogRsrc2 map onto Oracle, HPSS, DB2, and Unix storage]

SEA (SDSC Encryption & Authentication) Developed as part of DOCT Designed for Supercomputing/ MetaComputing Environment Based On RSA Public/Private Keys and RC5 Encryption Algorithm Integrated into SRB Being integrated into pftp & hsi - for Remote HPSS Access

SEA Features
Secure User/Process Authentication Across Network (TCP Sockets)
Optional Encryption As Independent Function
Simple API
Batch Support - Long-term Certificates
Adjustable Key Lengths (Speed/Security Tradeoff)
User-Adjustable Encryption Levels (Speed/Security Tradeoff)
Multiple Initial User Registration Methods (Set By Administrator): Self-Introduction, Trusted Host, Password
Available for Cray T90, C90, T3E, SunOS, Solaris, IRIX, OSF1, AIX, CS6400, NextStep
More Information: http://www.npaci.edu/dice

SRB Case Studies
Digital Libraries
  ELIB - Berkeley Digital Library (UCB)
  ADL - Alexandria Digital Library (UCSB)
  DOCT - Patent Workflow System
Environmental Archives
  International Satellite Cloud Climatology Project Data
  TIES Data Atlas - Chesapeake Bay Estuarine Studies

Digital Libraries Access to images, documents & Tools Large Number of files - Images of various resolution Documents of various types (valences) Web-based access - form and spatial queries Domain Metadata - External DB Digital Objects replicated Uses SRB web interface and low-level API

Digital Libraries and SRB
[Diagram labels: ELIB, ADL, SRB]

DOCT - Patent Workflow Archiving Applications and Office-Actions Replicated Archiving Storage of Issued Patents in multiple forms - SGML, DB Schema, HTML Access of Patents from replicated storage Controlled Access for Applications and Office-Actions Uses SRB web utilities and high-level API URL: http://www.sdsc.edu/doct

TIES Distributed Data Atlas for Cruise-Transects Data collected at Chesapeake Bay Estuarine Studies 26 transects, 3 times/year, 6 variables Reference atlas of 2-D color plots Domain metadata stored in Oracle Each Object registered in MCAT 3 partner sites - replication and staging in hierarchical storage (Unix and HPSS) Uses Scommands and srb web utilities URL: http://www.sdsc.edu/~marciano/doct/atlas/doct.html

Global Clouds Database Storing cloud information throughout the world Tabular data - made online through SRB 6,596 grid-cells over the globe 200 variables per grid-cell Data collected every 3 hours over 4 years (89-92) Small metadata - stored in Flat file Each cell dataset (4 yrs data) is stored in SRB (HPSS) Uses Scommands & srb web utilities URL: http://www.sdsc.edu/~marciano/clouds/clouds.htm
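The counts on this slide determine how many values the cloud database holds. The sketch below works that out under two assumptions not stated on the slide: that "(89-92)" means the four full calendar years 1989 through 1992, and, for the size estimate only, that each value occupies 4 bytes. The variable names are made up for this example.

```python
from datetime import date

GRID_CELLS = 6_596          # from the slide
VARIABLES_PER_CELL = 200    # from the slide
SAMPLES_PER_DAY = 24 // 3   # one sample every 3 hours

# Assumption: the period covers 1 Jan 1989 through 31 Dec 1992 inclusive.
days = (date(1993, 1, 1) - date(1989, 1, 1)).days

samples_per_cell = days * SAMPLES_PER_DAY
values_per_cell = samples_per_cell * VARIABLES_PER_CELL
total_values = values_per_cell * GRID_CELLS

# Assumption: 4 bytes per stored value, purely to get an order of magnitude.
BYTES_PER_VALUE = 4
total_gb = total_values * BYTES_PER_VALUE / 10**9

print(f"days covered:      {days}")
print(f"samples per cell:  {samples_per_cell}")
print(f"values per cell:   {values_per_cell:,}")
print(f"total values:      {total_values:,}")
print(f"approximate size:  {total_gb:.1f} GB at {BYTES_PER_VALUE} bytes/value")
```

Each per-cell data set is small under these assumptions, so the design of storing one data set per cell produces thousands of modest objects rather than a few very large ones.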

Research Activities Data Handling Infrastructure Parallel I/O Technology - ANL Active Data Repository - UMd Digital Library Infrastructure InterLib - Stanford, UCB, UCSB, CDL & others Collaborations GDE Systems NASA ASCII NLANR

Summary Storing, Publishing, Sharing & Cooperating Distributed, Replicated, Heterogeneous Data Cache Unified access to Archival Storage, Database Storage, Disk Storage Information Discovery (application-level metadata) unifying meta-catalogs (future work) Secure, encrypted controlled access & data movement integrate with other security systems (future work) Scalability and performance modeling needed (see next slide )

Capacity Planning for MSS
Workload characterization: trace analysis, get/put log files, resource monitoring, clustering
Baseline model creation: analytical models, simulation models, others (prototype, numerical/statistical, hybrid)
Prediction model creation
Examples:
  Analytical modeling: Queuing Models of Tertiary Storage (Ted Johnson)
  Analytical Performance of Hierarchical MSS (Daniel Menasce et al.)
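As a small illustration of what an analytical baseline model might look like, the sketch below treats the tape drives as servers in a standard M/M/c queue and computes the Erlang C waiting probability and mean wait. This is a textbook model chosen for illustration, not the model from either cited work. The 700 mounts per day and 14 drives come from earlier slides; the 10-minute drive holding time per mount is an assumed value, and Poisson arrivals with exponential service times are modeling assumptions.

```python
import math


def erlang_c(servers, offered_load):
    """Probability that an arriving request must wait in an M/M/c queue."""
    if offered_load >= servers:
        raise ValueError("offered load must be below the number of servers")
    below = sum(offered_load**n / math.factorial(n) for n in range(servers))
    top = (offered_load**servers / math.factorial(servers)
           * servers / (servers - offered_load))
    return top / (below + top)


def mean_wait(arrival_rate, service_time, servers):
    """Mean time a request spends queued before a server is free."""
    offered_load = arrival_rate * service_time  # in erlangs
    p_wait = erlang_c(servers, offered_load)
    return p_wait * service_time / (servers - offered_load)


# Workload figures from the slides: 700 tape mounts per day, 14 drives.
mounts_per_second = 700 / 86_400
drives = 14
# Assumption: each mount holds a drive for 10 minutes on average.
holding_time_s = 600.0

wait_now = mean_wait(mounts_per_second, holding_time_s, drives)
# What-if: the same workload on fewer drives.
wait_with_6 = mean_wait(mounts_per_second, holding_time_s, 6)

print(f"offered load: {mounts_per_second * holding_time_s:.2f} erlangs")
print(f"mean wait with {drives} drives: {wait_now:.3f} s")
print(f"mean wait with 6 drives:  {wait_with_6:.1f} s")
```

A model like this gives only a baseline; trace analysis of real get/put logs would be needed to check whether the arrival and service assumptions hold.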