Storage Supporting DOE Science


Jason Hick, jhick@lbl.gov
NERSC, LBNL
http://www.nersc.gov/nusers/systems/hpss/
http://www.nersc.gov/nusers/systems/ngf/
May 12, 2011

The Production Facility for DOE Office of Science
- Operated by UC for the DOE.
- NERSC serves a large population: approximately 4,000 users, 400 projects, 500 codes.
- Focus on unique resources:
  - High-end computing systems
  - High-end storage systems: large shared GPFS (a.k.a. NGF) and large archive (a.k.a. HPSS)
  - Interface to high-speed networking: ESnet, soon to be 100Gb (a.k.a. ANI)
- Our mission is to accelerate the pace of discovery by providing high performance computing, data, and communication services to the DOE Office of Science community.
- 2010 storage usage by area of science: Climate, Applied Math, Astrophysics, and Nuclear Physics account for 75% of the total.

Storage Systems Group: Focused on Data Needs
- NERSC is a net importer of data: since 2007, we have received more bytes than we transfer out to other sites.
- Leadership in inter-site data transfers (Data Transfer Working Group).
- We provide efficient center-wide storage solutions:
  - NGF helps minimize multiple online copies of data and reduces extraneous data movement between systems.
  - HPSS enables exponential user data growth without exponential impact to the facility.
- Partnering with ANL and ESnet to advance HPC network capabilities:
  - Magellan: ANL and NERSC cloud research infrastructure.
  - Advanced Networking Initiative with ESnet (100Gb Ethernet).

Storage, Who We Are
- Wayne Hurlbert and Nick Balthaser: HPSS system analysts
- Damian Hazen and Mike Welcome: HPSS developers
- Matt Andrews: GPFS backup developer
- Will Baird: data transfer system analyst
- Rei Lee and Greg Butler: GPFS system analysts

Center-wide File Systems
/project is for sharing and long-term residence of data on all NERSC computational systems.
- 8% monthly growth, ~100% growth per year (see the projection sketch after this list)
- Not purged; quota enforced (1TB default per project); backed up daily
- Serves 200 projects over FC4/FC8
- 1.6PB total capacity
- ~5TB average daily I/O
/global/homes provides a common login environment for users across systems.
- 15% monthly growth
- Not purged but archived; quota enforced (40GB per user); backed up daily
- Serves 4,000 users (400 per day) over Ethernet
- 50TB total capacity
- 100s of GBs average daily I/O
/global/scratch provides high-bandwidth, high-capacity storage across systems.
- Purged; quota enforced (40TB per user); not backed up
- Serves 4,000 users over FC8
- 1PB total capacity
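The growth figures above can be sanity-checked with a short projection. A minimal sketch, using the monthly rates and current capacities quoted on the slide; whether the rates are simple or compounding is not stated, so both cases are shown.

```python
# Project file-system capacity from the monthly growth rates quoted above.
# Shows simple (non-compounding) and compound growth as two bounding cases;
# rates and capacities come from the slide, the helper itself is illustrative.

def project(capacity_tb, monthly_rate, months=12):
    """Return (simple, compound) projected capacity in TB after `months`."""
    simple = capacity_tb * (1 + monthly_rate * months)
    compound = capacity_tb * (1 + monthly_rate) ** months
    return simple, compound

for name, cap_tb, rate in [("/project", 1600, 0.08), ("/global/homes", 50, 0.15)]:
    simple, compound = project(cap_tb, rate)
    print(f"{name}: {cap_tb} TB now -> {simple:.0f} TB (simple) "
          f"or {compound:.0f} TB (compound) in 12 months")
```

For /project, 8% per month gives roughly a doubling per year if the growth is simple, which matches the slide's "~100% growth per year"; compounding would imply closer to 2.5x.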

NGF Global Scratch
[Architecture diagram: /global/scratch, 920 TB (~16 GB/s); no increase planned FY11. GPFS servers attach over FC (8x4 FC8, 2x4 FC4, 12x2 FC8); clients include Franklin and Hopper (Gem*, via DVS and pnsd over IB), the DTNs, SGNs, Euclid, PDSF and PDSF planck, Carver, Magellan, and Dirac (pnsd over IB).]

NGF Global Homes
[Architecture diagram: /global/homes, 50 TB; increase planned FY12. GPFS servers serve Franklin, Hopper, the DTNs, SGNs, Euclid, PDSF, Carver, Magellan, and Dirac over FC and Ethernet (4x1 10Gb).]

NGF Project
[Architecture diagram: /project, 1.6 PB (~10 GB/s); no increase planned FY11. GPFS servers attach over FC (8x4 FC8, 20x2 FC4, 2x4 FC4, 4x2 FC4, 12x2 FC8); clients include Franklin (Sea*, via DVS) and Hopper (Gem*, via DVS and pnsd over IB), the DTNs, SGNs, PDSF and PDSF planck (Ethernet, 2x 10Gb), Carver, Magellan, Euclid, and Dirac (pnsd over IB).]

Archival Storage
User HPSS
- Single transfers: 1GB/sec read/write
- Aggregate bandwidth: 4+GB/sec
- Average daily I/O of 20TB, with peaks of 40TB (see the rate sketch after this list)
- 200TB disk cache
- 24 9840D, 48 T10KB, 16 T10KC tape drives
- Largest file: 5.5TB; oldest file: Jan 1976
Backup HPSS
- Single transfers: 1GB/sec read/write
- Aggregate bandwidth: 3+GB/sec
- Average daily I/O of 10TB, with peaks of 130TB
- 40TB disk cache
- 8 9840D and 18 T10KB tape drives
- Largest file: 3.5TB; oldest file: May 1995
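For scale, the daily volumes above translate into modest sustained rates compared with the quoted aggregate bandwidth. A minimal sketch of that conversion; the figures come from the slide, the code is only illustrative arithmetic.

```python
# Convert the daily archive I/O volumes above into average sustained rates,
# for comparison with the quoted aggregate bandwidth figures.

SECONDS_PER_DAY = 86_400

def avg_rate_gb_s(tb_per_day):
    """Average sustained rate in GB/s for a given daily volume in TB."""
    return tb_per_day * 1000 / SECONDS_PER_DAY

print(f"User HPSS,   20 TB average day: {avg_rate_gb_s(20):.2f} GB/s")   # ~0.23
print(f"User HPSS,   40 TB peak day:    {avg_rate_gb_s(40):.2f} GB/s")   # ~0.46
print(f"Backup HPSS, 130 TB peak day:   {avg_rate_gb_s(130):.2f} GB/s")  # ~1.50
```

Even the busiest day quoted (130TB on Backup HPSS) averages about 1.5 GB/s, well within the 3+ to 4+GB/sec aggregate figures; peak tape drive and cache demand, discussed on a later slide, is a separate concern.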

User HPSS Configuration
[Architecture diagram: HPSS core server (IBM p55a, AIX, HPSS 6.2) presents an HPSS key for auto-login over the auth LAN; client clusters connect via HSI, HTAR, pftp, ftp, and gridftp. 9 AIX movers (2x p750, 4x p55a, 4x 6M2, 1x 6M1) behind a Cisco 9513 switch, with IBM DS4300 and DDN 9550/9900 disk (215TB; 3ea 9550, 3ea 9900) and tape in 4x SL8500 libraries with 48x T10KB, 24x 9840D, and 16x T10KC drives.]

Backup HPSS Configuration
[Architecture diagram: HPSS core server (IBM p55a, AIX, HPSS 6.2) presents an HPSS key for auto-login over the auth LAN; client clusters connect via HSI, HTAR, pftp, ftp, and gridftp. 5 AIX movers (4x p55a, 1x 6M1) behind a Cisco 9513 switch, with IBM DS4300 and DDN 9550/9900 disk (68TB; 3ea 9550, 3ea 9900) and tape in 4x SL8500 libraries with 18x T10KB and 8x 9840D drives.]

Tape with High Frequency Read
- Monitoring in place to understand peak and average load:
  - Custom database for gathering usage statistics per transfer: bandwidth, concurrent transfers, commonly requested tapes
  - Crossroads RVA: tape drive utilization, occupancy, concurrency
- Budget tape drives to peak demand:
  - We had an average of 16 drives in use a few months ago; our peak demand was 30 drives.
- Size the HSM disk cache properly for repeat reads (see the sizing sketch after this list):
  - Analyze the cache hit ratio
  - Plan to hold at least 5 days of peak data transfer on disk
  - Study the stage requests (tape to disk)
- Identify and resolve library hotspots:
  - Ensures we have cartridges closest to drives for the quickest mount time
- Only allow select applications to do direct-to-tape I/O:
  - Our parallel file system backups and restores (10GB files, 600TB total)
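A minimal sketch of the cache-sizing rule above, using the peak daily transfer figure from the archival storage slide; the helper name and structure are illustrative, not NERSC tooling.

```python
# Sketch of the "hold at least 5 days of peak data transfer" sizing rule.
# The peak daily I/O figure comes from the archival storage slide.

def hsm_cache_size_tb(peak_daily_io_tb, days_of_peak=5):
    """Disk cache (TB) needed to hold `days_of_peak` days of peak transfers."""
    return peak_daily_io_tb * days_of_peak

# User HPSS peaks at 40 TB/day, so the rule calls for about 200 TB of disk
# cache -- consistent with the 200 TB cache listed for User HPSS.
print(hsm_cache_size_tb(40))  # 200
```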

Storage Challenges & Goals
- Move storage policies from the group/sysadmins to the users:
  - Consistent with the success of turning quotas into user allocations.
  - GPFS file system backups take significant effort to scale to the PB level and are only intended for disaster recovery. Backups probably have more value to the users, who can help identify the critical data and thereby reduce the scale of the solution.
  - Provide tools to aid users in data management (archiving their file system data with one click, backing up their data with one click).
- Balance growth across disk and tape:
  - File system growth should not exceed archive growth; otherwise we have two problems (backups and resource imbalance).
- Ensure our storage network can meet the demands of exascale computing/storage:
  - 40Gb/100Gb Ethernet demonstrations; cloud storage for DOE?
  - Fibre Channel deployment improvements, retracting our SAN
  - InfiniBand testing as a center-wide storage network
  - Consider SAS connections from storage to servers

Migrating Data from Old to New
- 40,489 cartridges to migrate as of 7/1/2009:
  - 18,058 9840A
  - 15,572 9940B
  - 6,859 T10KA
- [Chart: cartridge counts by media type, showing migration from 9840A, 9940B, and T10KA to 9840D and T10KB.]

Actual Reliability of Data on Tape
- We read all data on 40,489 tapes:
  - 6,859 T10KA (up to 2 years old)
  - 15,572 9940B (up to 8 years old)
  - 18,058 9840A (up to 12 years old)
- We found 36 tapes that had some data that couldn't be read: 24 9840A, 8 9940B, 4 T10KA.
  - One of those held 558 files and couldn't be mounted; two others had 136 and 43 unreadable files; the remainder had fewer than 6, most having only 1.
- That is roughly a 0.09% error rate for tape cartridges, or 99.91% of cartridges with 100% readable data (see the calculation after this list).
- But wait! It's not the whole cartridge: the unreadable data was contained in 850 files (of 84.6M total), representing 3.0 TB of data (of 8,056 TB total).
- That is roughly a 0.001% error rate for files, or 99.999% of files with 100% readable data.
- Unreadable data is normally confined to one or two blocks (250-500MB), with the remainder of the file readable, but we don't recover partial files unless the user requests it.
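The error rates above follow directly from the counts on the slide; a minimal sketch of the arithmetic (the counts come from the slide, the code is only illustrative):

```python
# Recompute the tape-reliability percentages from the counts on the slide.

bad_tapes, total_tapes = 36, 40_489
bad_files, total_files = 850, 84.6e6
bad_tb, total_tb = 3.0, 8_056

print(f"cartridge error rate: {100 * bad_tapes / total_tapes:.3f}%")  # ~0.089%
print(f"file error rate:      {100 * bad_files / total_files:.4f}%")  # ~0.0010%
print(f"data error rate:      {100 * bad_tb / total_tb:.4f}%")        # ~0.0372%
```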

Our Experience Shows Enterprise Tape is Reliable
- The data migration (7/09 to 12/10) involved reading 22,065,763 m of tape, roughly the distance of flying from San Francisco to Tokyo to Paris to Nova Scotia.
- The unreadable data resided in at least one block of 850 files. Those files represent 178 m of tape, approximately the length of two Boeing 777 jets (70 m each) or half the length of most cruise ships (350 m).