
Experiences using a multi-tiered GPFS file system at Mount Sinai
Bhupender Thakur, Patricia Kovatch, Francesca Tartagliogne, Dansha Jiang

Outline
1. Storage summary
2. Planning and Migration
3. Challenges
4. Conclusions/Feedback

Storage

Work-flow: Core-hours by department. 61,206,676 core-hours used by 28,672,552 jobs from 89 departments.

Work-flow: GPFS Project storage (/sc/orga/). 5.6 PB used by 222 projects.

Job mix: mostly serial, pipeline-based workflows; few MPI and multi-node jobs.

Storage structure
- Single filesystem available as project directories
- Single IB fabric across two datacenters ~0.5 km apart
- Flash tier for metadata
- Storage charged per Sinai policy: charge is based on usage rather than quota
[Diagram of storage building blocks and timeline (2014, 2018, 2020): ESS-BE, 3x GL6 building blocks, 5.6 PB; GSS, 4x GL6 building blocks, 3 PB; ESS-LE, 1x GL6S building block, 4 PB; FLASH, 8x FS820, 150 TB; FLASH, 2x GSS2S, 260 TB; DDN SFA10k, 1.4 PB; DDN SFA12k, 3.9 PB]
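For context on how a flash metadata tier is typically expressed in Spectrum Scale, the sketch below shows an NSD stanza file with a metadata-only system pool on flash and a data-only pool on capacity storage. This is a minimal, generic illustration: the NSD, device, server, and pool names are placeholders, and on ESS/GSS building blocks the NSDs are actually GNR vdisks rather than raw devices, so the real stanzas at Sinai would differ.

    # Hypothetical stanza file (nsd.stanza); all names are placeholders.
    %nsd:
      nsd=flash01
      device=/dev/mapper/flash_lun01
      servers=nsd-srv01,nsd-srv02
      usage=metadataOnly
      failureGroup=1
      pool=system

    %nsd:
      nsd=ess01
      device=/dev/mapper/ess_lun01
      servers=nsd-srv03,nsd-srv04
      usage=dataOnly
      failureGroup=2
      pool=ess

    # The disks would then be added to an existing filesystem with, for example:
    #   mmadddisk <filesystem> -F nsd.stanza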

Storage structure
The GSS and Flash sub-systems have been our primary workhorses.

Storage subsystem                                  | Avg. read ops/day | Avg. write ops/day | Avg. data read/day | Avg. data written/day
IBM GSS: data subsystem (3.5 PB)                   | ~650 million      | ~385 million       | ~300 TB            | ~100 TB
IBM FLASH: small-file/metadata subsystem (150 TB)  | ~1050 million     | ~600 million       | -                  | -
Total (including DDN subsystems)                   | ~1.8 billion      | ~1 billion         | ~500 TB            | ~175 TB

Storage structure
The GSS and Flash sub-systems have been our primary workhorses. Organic growth despite the recent charge-by-use model.

Migration plan
Pool   | Total (PB) | Used (PB) | % used
DDN10k | 1.4        | 0.8       | 61.02%
DDN12k | 3.9        | 1.8       | 47.78%
GSS    | 3.1        | 2.7       | 85.83%
ESS-BE | 4.1        |           |

Migration (Plan A):
1. Install the ESS at a lower ESS/GPFS (4.1) code level to be compatible with the existing GSS (max v3.5.25 allowed for Lenovo GSS).
2. Split the ESS-BE recovery group disks between two pools: the new ess pool and the old gss pool.
3. Migrate GSS data within the gss pool: suspend the GSS disks, then restripe and rebalance.
4. Migrate data from the gss pool to the ess pool.
5. Repeat for ddn10k and add the remaining ess disks to the new ess pool.
6. Upgrade the GPFS code from 4.1 to 4.2; update the release version and filesystem version.
A command-level sketch of steps 3 and 4 follows below.
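A minimal sketch of what steps 3 and 4 could look like at the command level, assuming a filesystem named gpfs01; the disk names and policy file are placeholders for illustration, not the actual Sinai configuration.

    # Step 3 (illustrative): suspend the old GSS NSDs so no new data is placed
    # on them, then move data off the suspended disks and rebalance.
    mmchdisk gpfs01 suspend -d "gss_nsd01;gss_nsd02;gss_nsd03"
    mmrestripefs gpfs01 -r    # migrate data off suspended disks
    mmrestripefs gpfs01 -b    # rebalance across the remaining disks

    # Step 4 (illustrative): move file data between storage pools with a policy rule.
    # Contents of /tmp/migrate_gss_to_ess.pol:
    #   RULE 'gss_to_ess' MIGRATE FROM POOL 'gss' TO POOL 'ess'
    mmapplypolicy gpfs01 -P /tmp/migrate_gss_to_ess.pol -I yes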

Migration plan
(Same pool usage table as on the previous slide.)
Migration (Plan A, take 2):
1. Remove the ESS-BE; rebuild the test cluster with ESS v5.2.0 and downgrade GPFS to 4.1.1.15.
2. Test/debug the new filesystem (see the sketch below):
   - Debug IB issues (to bond or not to bond)
   - Test compatible client OFED levels
   - Upgrade FLASH/DDN GPFS levels
3. Remove the disks from the test cluster and add them to a new pool in the production cluster.
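The IB questions in step 2 (bonded vs. unbonded ports, compatible client OFED levels) are typically worked through with checks like the ones sketched below; the node class name is a placeholder and the port layout is an assumption for illustration, not the configuration chosen at Sinai.

    # Report the installed Mellanox OFED level on a node.
    ofed_info -s

    # Inspect the current RDMA settings in Spectrum Scale.
    mmlsconfig verbsRdma
    mmlsconfig verbsPorts

    # Illustrative: enable verbs RDMA on a node class and use both HCA ports
    # individually rather than a bonded interface.
    mmchconfig verbsRdma=enable -N essNodes
    mmchconfig verbsPorts="mlx5_0/1 mlx5_1/1" -N essNodes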

Migration plan
(Same pool usage table as on the previous slide.)
Migration (Plan B):
1. Squeeze allocations: migrate project data from 52 known projects.
2. Migrate as much as possible to ddn12k:
   - from ddn10k to ddn12k
   - from gss to ddn12k
3. Migrate (*):
   - from ddn10k to ess
   - from gss to ess
4. Cleanup (see the sketch below):
   - Change the default pool from gss to ess
   - Remove the ddn10k and gss disks
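A sketch of what the cleanup step could look like, again assuming a filesystem named gpfs01; the placement policy file and NSD names are placeholders, not the actual Sinai configuration.

    # Illustrative: make 'ess' the default placement pool for newly created files.
    # Contents of /tmp/placement.pol:
    #   RULE 'default' SET POOL 'ess'
    mmchpolicy gpfs01 /tmp/placement.pol -I yes

    # Removing the retired disks drains their data before deleting them.
    mmdeldisk gpfs01 "ddn10k_nsd01;ddn10k_nsd02"
    mmdeldisk gpfs01 "gss_nsd01;gss_nsd02"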

Policy moves (pool: ddn10k)
Pool    | Total (PB) | Used (PB) before | Used (PB) after | Before (%) | After (%)
ddn10k1 | 1.4        | 0.8              | 0               | 57.13%     | 3.24%
ddn12k1 | 3.9        | 2.3              | 2.3             | 59.22%     | 59.15%
ess     | 4.1        | 0.2              | 0.9             | 4.21%      | 22.41%
gss     | 3.1        | 2.3              | 2.3             | 73.61%     | 73.51%

Policy run log excerpt:
Evaluating policy rules with CURRENT_TIMESTAMP = 2018-01-08@15:39:23 UTC
[I] 2018-01-08@16:49:52.650 Directory entries scanned: 1888732606.
[I] Directories scan: 1457460636 files, 271560756 directories, 159711214 other objects
[I] 2018-01-08@17:05:23.719 Statistical candidate file choosing. 498694530 records processed.
[I] Summary of Rule Applicability and File Choices:
 Rule#  Hit_Cnt    KB_Hit        Chosen     KB_Chosen     KB_Ill  Rule
 0      498694530  741777261952  498694530  741777261952  0       RULE 'migrate10ktoess.1' MIGRATE FROM POOL 'ddn10k1' TO POOL 'ess'
[I] A total of 498694530 files have been migrated, deleted or processed by an EXTERNAL EXEC/script; 8336739 'skipped' files and/or errors.
[I] 2018-01-21@00:16:50.361 Exiting. RC=0.

Roughly 700 TB moved at ~50 TB/day over 2 weeks.
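The log above is the kind of output mmapplypolicy produces for a pool-to-pool migration; a minimal sketch of a rule file and invocation that would drive such a run is shown below. The node class and work directory are assumptions for illustration and are not taken from the slides.

    # Contents of /tmp/migrate10ktoess.pol (rule name taken from the log above):
    #   RULE 'migrate10ktoess.1' MIGRATE FROM POOL 'ddn10k1' TO POOL 'ess'

    # Run the parallel scan and migration across a set of helper nodes.
    mmapplypolicy gpfs01 -P /tmp/migrate10ktoess.pol \
        -N nsdNodes -g /gpfs01/.policytmp -I yes -L 1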

Policy moves (pool: gss)
Pool    | Total (PB) | Used (PB) before | Used (PB) after | Before (%) | After (%)
ddn10k1 | 1.38       | 0                | 0               | 0.02%      | 0.02%
ddn12k1 | 3.87       | 2.75             | 2.75            | 71.20%     | 71.20%
ess     | 4.07       | 0.9              | 2.73            | 22.17%     | 66.90%
gss     | 3.12       | 1.84             | 0.01            | 58.90%     | 0.45%

Policy run log excerpt:
Evaluating policy rules with CURRENT_TIMESTAMP = 2018-02-05@16:35:39 UTC
[I] 2018-02-05@18:05:04.305 Statistical candidate file choosing. 2813187 records processed.
[I] Summary of Rule Applicability and File Choices:
 Rule#  Hit_Cnt  KB_Hit         Chosen   KB_Chosen      KB_Ill  Rule
 0      2813187  1822704400896  2813187  1822704400896  0       RULE 'migrategssktoess.1' MIGRATE FROM POOL 'gss' TO POOL 'ess' WHERE(.)
[I] 2018-02-16@14:12:49.891 Executing file list:
[E] Error on writeeor to work file localhost:42624: Broken pipe
[I] A total of 2503335 files have been migrated, deleted or processed by an EXTERNAL EXEC/script; 0 'skipped' files and/or errors.
[I] 2018-03-20@15:44:28.062 Policy execution. 2503335 files dispatched.

Roughly 2 PB moved at ~35 TB/day over 7 weeks.
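The broken-pipe error on a work file in the log above is the sort of failure long, distributed policy runs can hit while helper processes exchange candidate lists. One mitigation, offered here only as an assumption (it is not stated on the slide), is to point the scan's local and global work directories at storage every helper node can reach, for example:

    # Illustrative: per-node scratch on local disk (-s), shared work files on a
    # directory inside the filesystem itself (-g) so all helper nodes see them.
    mmapplypolicy gpfs01 -P /tmp/migrategsstoess.pol \
        -s /var/tmp -g /gpfs01/.policytmp -N nsdNodes -I yes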

Policy moves: things to consider
- Running multiple policies at the same time (such as purging) can be problematic (see the note below).
- On a single fabric, HCA/switch firmware upgrades can cause several hard IB sweeps.
- GNR/OFED parameters needed to be carefully checked.
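One way to keep background policy runs from competing with user I/O, once the cluster is at Spectrum Scale 4.2 or later, is the built-in QoS throttling of maintenance commands; this is offered here as a general technique, not something described on the slides, and the IOPS figure is a placeholder.

    # Illustrative: cap the 'maintenance' QoS class (policy scans, restripes, etc.)
    # while leaving normal user I/O unthrottled.
    mmchqos gpfs01 --enable pool=*,maintenance=500IOPS,other=unlimited

    # Review the QoS settings and recent throughput.
    mmlsqos gpfs01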

Archival storage

Archival storage history
Tapes used per month:
Year        | 1   | 2   | 3   | 4   | 5   | 6   | 7  | 8   | 9  | 10  | 11  | 12  | Grand total
2016        | 203 | 118 | 186 | 139 | 171 | 199 | 99 | 154 | 82 | 132 | 151 | 147 | 1,781
2017        | 140 | 77  | 235 | 293 | 288 |     |    |     |    |     |     |     | 1,033
Grand total | 343 | 195 | 421 | 432 | 459 | 199 | 99 | 154 | 82 | 132 | 151 | 147 | 2,814
[Chart: number of tapes used per month]

Archival storage history
Statistics for the last 365 days:
- Number of newly registered users = 340
- Number of users that did archives = 78
- Number of archives = 23,771
- Number of retrieves = 7,065
Current storage usage:
- Archived data in TB = 1,495
- Primary storage pool in TB = 4,495
- Number of tapes used = 5,750
- Number of tapes onsite = 2,968
- Number of tapes offsite = 2,782

Challenges
- Sinai growth: ten-fold increase in users/storage since our first GPFS install in 2013.
- Sinai future directions: fully HIPAA-compliant cluster.
- Every new ESS install is starting to require a lot more thought and planning:
  - xCAT support outside of the regular Ethernet network integration: VLAN and switch considerations
  - High-speed network integration
  - Data migration and planning
- GNR configurations present an added complexity for upgrades.
- ESS recovery group stability.

Imminent challenges
- Integrate the ESS-BE and ESS-LE clusters:
  - Lack of official support for merging xCAT networks
  - GS8052 and Mellanox 8831-S52 trunking
  - xCAT table updates and configuration changes (name servers)
  - HMC
- Migrating the metadata servers

Conclusions/Feedback
- It is more challenging for admins to maintain a unified filesystem, but it is easier for customers.
- Long time from Deploy status (code 20) to actual production use.
- We would like to thank the IBM CE, Lab Services, and development teams, who have been very patient in handling our issues.