Is it a good idea to create an HSM without tapes?


Previous Data Requirements and Assumptions
- Started life as a scientist, so I still try to think like one
- Biologists generally have modest IT needs, except for BioInformatics and BioImaging
- Previously at MIT, managing one BioImaging lab
  - 3 optical microscopes, 1 electron microscope
  - Detectors collected ~10-100 MB/s
  - 1 PI and 20-30 users resulted in ~100 TB / 5 years
- Coming to Singapore, managing 2 research centres
  - 30+ optical microscopes, 4 electron microscopes
  - Detectors can collect 1+ GB/s
  - 25 PIs and 300 users
  - Planned for 10x growth over the MIT lab
  - Anticipated 1 PB / 5 years
  - SMALL server room, 2-3 racks of storage

How to design a PB storage system
Basic storage design criteria
- Try to balance the following: capacity, performance, cost
- Gating factors: ease of management; can't lose any user data
- Use storage tiering to maximize capacity, maximize performance, and minimize cost

How to design a PB storage system
Non-tape-based tiered storage solution

           Performance   Capacity   Cost
  Tier 1   Maximum       Minimum    High
  Tier 2   Moderate      Moderate   Moderate
  Tier 3   Minimum       High       Low

An HSM manages data movement between the tiers.

How to design a PB storage system
2010-2014 HDD characteristics
- Performance
  - SAS: excellent for all types of I/O: sequential, random, large, small
  - SATA: excellent for sequential, large I/O
  - NL-SAS: excellent for sequential, large I/O; good for random, small I/O
- Capacity
  - SATA, NL-SAS: 2-3 TB
- Cost
  - Tier 1 (SAS): $1000+ / TB
  - Tier 2 (SATA, NL-SAS): ~$500-700 / TB
  - Est. cost for 500 TB = ~$500K per lab: fit in my 5-year budgets!
  - By 2014: 4 TB enterprise NL-SAS < $100/TB
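The budget arithmetic above is easy to sanity-check. A minimal sketch, assuming the slide's ~$500K figure implies a blended, delivered cost of roughly $1000/TB (arrays and enclosures on top of bare tier-2 drive prices):

```shell
# Back-of-envelope check of the ~$500K estimate for 500 TB
# (assumed blended cost of ~$1000/TB delivered; bare tier-2 drives were $500-700/TB)
capacity_tb=500
usd_per_tb=1000
total=$((capacity_tb * usd_per_tb))
echo "Estimated cost: \$${total} (~\$$((total / 1000))K)"
```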

How to design a PB storage system
[Chart: HDD capacity time trend, with the LTO-1 through LTO-7 tape generations for comparison]

How to design a PB storage system
DMF characteristics, 2009
- Tier 1: XFS filesystem, needs the DMAPI mount option (future?)
- Tier 2: MSP -> DCM MSP; XFS filesystem needs DMAPI
- Tier 3: FTP / tape / COPAN MAID / cloud (201?)
- Is DMF compatible with NAS/NFS-based storage devices?
2012 SGI DMF 5.6 Admin Guide (007-5484-010), p. 11: "DMF interoperates with the following: standard data export services such as Network File System (NFS) and File Transfer Protocol (FTP); XFS filesystems; CXFS (clustered XFS) filesystems"
- NO mention of NFS as a Tier 3 target; still TRUE in 2015 with DMF v6
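For reference, the DMAPI requirement mentioned for Tiers 1 and 2 shows up as mount options on the managed XFS filesystem. A hypothetical fstab entry (the device path is an example, not the site's actual volume name):

```
# /etc/fstab entry for a DMF-managed XFS filesystem (hypothetical device path);
# dmapi enables the Data Management API and mtpt must name the mount point
/dev/lxvm/images  /mnt/mbi/images  xfs  dmapi,mtpt=/mnt/mbi/images  0 0
```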

MBI system layout
[Diagram: core switch on the NUS private network (1 Gb Ethernet); user workstations and microscopes on the MBI private network (1 Gb Ethernet); compute servers and storage servers on a 10 Gb Ethernet MBI private network; storage servers attached to the storage network over 8 Gb Fibre Channel; building levels L5, L9, L10, L11]

Storage System Evolution 2010 to 2014
- 2010: MBI, 1 tier + NAS (63-0-24 TB), using 1x DDN 6600 (2 TB SATA) with SSD for XFS inodes
- 2010: CBIS, 2 tiers (31-110-0 TB) with DMF, using 1x DDN 6600
- 2011: MBI capacity expansion (123-0-24 TB), DDN 6600 expansion enclosure
- 2012: CBIS capacity expansion (28-219-0 TB), 2nd DDN 6600
- 2013: MBI, 2nd tier and capacity expansion (28-214-48 TB) with NetApp 5500
- 2014: MBI, 3rd tier using a cluster of storage servers, 4x 64 TB (256 TB) using BeeGFS
  (evaluated CephFS, thought about Lustre, dismissed GlusterFS and GPFS)
- 2014: CBIS, 3rd tier with an SGI MIS box (224 TB)

Storage System Evolution 2010 to 2014
Total Storage Usage: MBI and CBIS (TB; Total row shows used / max)

          Tier 1     Tier 2      Tier 3      Total (TB)
  MBI     23         168         160         351
  CBIS    24         197         156         377
  Total   47 / 55    365 / 433   316 / 480   728 / 968

Storage System Evolution 2010 to 2014
Tier Performance

          Tier 1: R5 SAS (4+1)   Tier 2: R6 NL-SAS (8+2)   Tier 3: NFS
  MBI     1-2* GB/s              0.5-1* GB/s               0.2-0.3 GB/s (15 TB/day)
  CBIS    1-2 GB/s               0.5-1 GB/s                0.2-0.5 GB/s (20 TB/day)

  * Upgrading to DDN 7700 = 6-8 GB/s

Storage System Usage
MBI/CBIS storage usage, 3 tiers: Images (28+28 TB), Tier 2 (214+219 TB), Tier 3 (256+224 TB)
[Chart: monthly MBI data increase (TB/month), 2011 through early 2015, plotting max and used capacity for total storage and for each tier (Images, Tier 2, Tier 3), with annotated growth rates of 11, 18 and 32 TB/month]
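Annualized, the chart's annotated growth rates (assuming 11, 18 and 32 TB/month describe successive periods) can be checked quickly; the most recent, ~32 TB/month, is consistent with the 300-400 TB/yr projected later:

```shell
# Annualize the growth rates read off the usage chart (11, 18, 32 TB/month)
for rate in 11 18 32; do
  echo "${rate} TB/month = $((rate * 12)) TB/year"
done
```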

File-size distribution (FS: images)
[Chart: number of files per file-size range on the Komodo and Satay servers; y-axis up to ~2.0e7 files]

Current DMF configuration (with thanks to Susheel Gokahle)
I didn't lose any data during this process!
Based on the DMF sample file dmf.conf.dcm:

define daemon
        TYPE                     dmdaemon
        MIGRATION_LEVEL          auto
        # The following parameter must not be changed while DMF is running!
        MSP_NAMES                dsk1 dsk1_t3l dsk1_t3a    # * dmcheck complains
        TASK_GROUPS              daemon_tasks dump_tasks
        MOVE_FS                  /dmf/spare
        EXPORT_QUEUE             ON
        MESSAGE_LEVEL            2
        # Turn off partial access to files being recalled when used with CXFS
        # (see the DMF Admin Guide)
        RECALL_NOTIFICATION_RATE 0
enddef

Current DMF configuration - Tier 1

define /mnt/mbi/images
        TYPE                 filesystem
        MIGRATION_LEVEL      auto
        POLICIES             space_policy msp_policy
        USE_UNIFIED_BUFFER   OFF
        BUFFERED_IO_SIZE     1048576
        DUMP_MIGRATE_FIRST   OFF
        MESSAGE_LEVEL        2
enddef

define space_policy
        TYPE                 policy
        FREE_SPACE_MINIMUM   10
        FREE_SPACE_TARGET    20
        MIGRATION_TARGET     95
        FREE_DUALSTATE_FIRST OFF
        AGE_WEIGHT           -1 0    when uid in (root mbiftp mbitest)
        AGE_WEIGHT           1 1
        SPACE_WEIGHT         -1 0    when size <= 65536
        SPACE_WEIGHT         0 .00000001
enddef
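To make the weight rules concrete, here is a toy re-implementation (my own sketch, not DMF code), assuming a file's migration priority is an age term plus a size term, the first matching rule supplies each term, and a negative total means "never migrate"; see the DMF Administrator's Guide for the real semantics:

```shell
# Toy model of space_policy above (assumptions: priority = age term + size term,
# first matching rule wins, negative total => never migrate; the real config's
# tiny fractional weight is rounded to 0 here since shell arithmetic is integer)
score() {  # usage: score <user> <age_days> <size_bytes>
  local user=$1 age=$2 size=$3 age_term size_term
  case "$user" in
    root|mbiftp|mbitest) age_term=$(( -1 * age )) ;;  # AGE_WEIGHT -1: pin these users' files
    *)                   age_term=$age ;;             # AGE_WEIGHT 1: older = higher priority
  esac
  if [ "$size" -le 65536 ]; then
    size_term=$(( -1 * size ))                        # SPACE_WEIGHT -1: skip tiny files
  else
    size_term=0                                       # ~0 weight for larger files
  fi
  echo $(( age_term + size_term ))
}
score root 100 1000000    # negative: never migrated
score alice 100 1000000   # positive: migration candidate
```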

Current DMF configuration - Tier 2 DCM

# Define the dcm_msp as a disk MSP. Keep the same name "dsk1"
# as before in the 2-tier config.
define dsk1
        TYPE                 msp
        COMMAND              dmdskmsp
        MIGRATION_LEVEL      auto
        POLICIES             dcm_space_policy
        TASK_GROUPS          dcm_tasks
        STORE_DIRECTORY      /dmf/store
        BUFFERED_IO_SIZE     1048576
        CHILD_MAXIMUM        12
        GUARANTEED_GETS      4
        NAME_FORMAT          %u/%y/%m/%d/%b
        MESSAGE_LEVEL        2
        DUMP_FLUSH_DCM_FIRST OFF
        WRITE_CHECKSUM       ON
enddef

Current DMF configuration - Tier 2 DCM

# Define how the cache will be managed, writing 2 copies to archive storage.
define dcm_space_policy
        TYPE                    policy
        FREE_SPACE_MINIMUM      5
        FREE_SPACE_TARGET       10
        DUALRESIDENCE_TARGET    40
        FREE_DUALRESIDENT_FIRST ON
        CACHE_AGE_WEIGHT        -1 0      when age < 180
        CACHE_AGE_WEIGHT        1 1
        CACHE_SPACE_WEIGHT      -1 0      when size <= 262144
        CACHE_SPACE_WEIGHT      0.000000001
        SELECT_LOWER_VG         none      when softdeleted = true
        SELECT_LOWER_VG         dsk1_mgl  when size < 1     # * dmcheck fix
        SELECT_LOWER_VG         dsk1_mgr                    # * dmcheck fix
enddef

Current DMF configuration - Tier 3 NFS

# Define a migrate group that is comprised of the 2 disk MSPs.
define dsk1_mgl
        TYPE              migrategroup
        GROUP_MEMBERS     dsk1_t3l
        ROTATION_STRATEGY SEQUENTIAL
enddef

# For duplicate copies at the remote-site directory on the SGI MIS at CBIS.
define dsk1_mgr
        TYPE              migrategroup
        GROUP_MEMBERS     dsk1_t3a
        ROTATION_STRATEGY SEQUENTIAL
enddef

Current DMF configuration - Tier 3 NFS

# Define the dsk1 tier-3 MSPs. You must modify the STORE_DIRECTORY parameter
# to a value appropriate for your site. The remote sites are NFS-mounted
# directories.
define dsk1_t3l
        TYPE                 msp
        COMMAND              dmdskmsp
        STORE_DIRECTORY      /dmf/store_tier3l
        FULL_THRESHOLD_BYTES 255750000000000
        CHILD_MAXIMUM        8
        NAME_FORMAT          %u/%y/%m/%d/%b
        MESSAGE_LEVEL        2
        WRITE_CHECKSUM       ON
enddef

Current DMF configuration - Tier 3 NFS

# For duplicate copies at the remote-site directory on the SGI MIS at CBIS.
define dsk1_t3a
        TYPE                 msp
        COMMAND              dmdskmsp
        STORE_DIRECTORY      /dmf/store_tier3a
        FULL_THRESHOLD_BYTES 223900000000000
        CHILD_MAXIMUM        8
        NAME_FORMAT          %u/%y/%m/%d/%b
        MESSAGE_LEVEL        2
        WRITE_CHECKSUM       ON
enddef
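As a sanity check, the two FULL_THRESHOLD_BYTES values decode to roughly the tier-3 capacities quoted earlier (256 TB at MBI, 224 TB at CBIS):

```shell
# Convert the FULL_THRESHOLD_BYTES settings to decimal TB
for bytes in 255750000000000 223900000000000; do
  echo "$((bytes / 1000000000000)) TB"
done
```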

Current DMF configuration - dmcheck

Checking DMF installation.
  Linux satay 3.0.101-0.35-default #1 SMP Wed Jul 9 11:43:04 UTC 2014 (c36987d) x86_64 GNU/Linux
  SuSE-release: SUSE Linux Enterprise Server 11 (x86_64), VERSION = 11, PATCHLEVEL = 3
  sgi-issp-release: SGI InfiniteStorage Software Platform, version 3.2, Build 710r1.sles11sp3-1407021914
  sgi-foundation-release: SGI Foundation Software 2.10, Build 710r16.sles11sp3-1404092103
  lsb-release: LSB_VERSION="core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64"
  DMF version 6.2.0 (rpm dmf-6.2.0-sgi320rp32.sles11sp3) installed.
Checking DMF config file /etc/dmf/dmf.conf
  Scanning for non-comment lines outside define/enddef pairs
  Scanning for DMF parameters without values
  Checking all objects for invalid names
Checking base
Checking DMF license
  Total bytes managed        505105182990336 (505 TB)
  Total charged to license   349955270135890 (349 TB)
  DMF license capacity       500000000000000 (500 TB)
  Percent of license capacity: 69

Current DMF configuration - dmcheck

Checking daemon
Checking policy dcm_space_policy
Checking policy msp_policy
  WARNING: Some files will only have one copy made when migrated.
Checking policy space_policy
Checking filesystem /mnt/mbi/images
Checking filesystem /dmf/archive
Checking MSP dsk1 (DCM-mode)
  WARNING: dsk1 follows LS dsk1_t3l (containing VG dsk1_t3l) on the
  LS_NAMES/MSP_NAMES parameter; recalls will bypass it    * dmcheck issue?
Checking MSP dsk1_t3l
Checking MSP dsk1_t3a
Checking Migrate Group dsk1_mgl
Checking Migrate Group dsk1_mgr
Checking selection rules in policy space_policy.
Checking selection rules in policy msp_policy.
Checking selection rules in policy dcm_space_policy.

Current DMF configuration - dmcheck

Checking services
Checking Task Group daemon_tasks
Checking Task Group dump_tasks
Checking Task Group dcm_tasks
Checking for unreferenced objects
Checking other daemons.
Checking chkconfig
  WARNING: dmf is disabled by chkconfig
No Errors found. 3 warnings found.

Note: fixed a dmcheck bug where it was unable to find XVM local volumes (e.g. /dev/lxvm/dmfstore) when CXFS is running; DMF itself does NOT complain.

Future Storage Growth
- 2010 through 2014 resulted in 728 TB; for 2015-2019, projecting 300-400 TB/yr
- Do the original assumptions still hold?
- Can we continue with HDD-based tier 3s to 2, 3, 5 PB?
- Storage vendors project HDDs of 50+ TB in a few years; do we want to use drives that BIG?
  - Need more intelligent drive rebuilds: standard RAID 5 or 6 is inadequate; RAID pools for rebuilds
  - How will that affect tuning and I/O optimizations?
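A rough projection from these numbers (my own arithmetic, using the slide's 728 TB starting point and the 300-400 TB/yr growth range):

```shell
# Years until ~2 PB at the projected growth rates (integer ceiling division)
used_tb=728
target_tb=2000
for rate in 300 400; do
  remaining=$((target_tb - used_tb))
  years=$(( (remaining + rate - 1) / rate ))
  echo "At ${rate} TB/yr: ~${years} years to reach 2 PB"
done
```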

Future Storage Growth
The future of storage drives, systems and software
- Convergence of storage servers and storage arrays
  - Need to reduce the movement of data between tiers
  - Combine the functionality of storage servers and storage arrays
  - Put storage filesystem software onto the storage arrays
- HSM tiering
  - Can we converge some number of tiers (e.g. 2 and 3)?
  - Can we use HDDs for archival tiers of 10, 20, 50 PB capacity?
  - Can they provide the reliability, DR to a remote site, restoration times, etc.?
- Singapore is purchasing a 1-3 PFLOP supercomputer with 10-20 PB of storage
  - Storage vendors may be proposing an all-HDD-based solution

DMF Without Tapes: Was it a good idea?
Comments? Questions?
Al Davis aldavis@nus.edu.sg