Proof of Concept TRANSPARENT CLOUD TIERING WITH IBM SPECTRUM SCALE

ATS Innovation Center, Malvern PA
Joshua Kwedar, The ATS Group
October - November 2017

INTRODUCTION

With the release of IBM Spectrum Scale 4.2.1, IBM offers a hybrid cloud solution that uses cloud object storage as an additional tier within the ILM (Information Lifecycle Management) engine. As an IBM Business Partner, we've stood up several IBM Spectrum Scale environments that use the ILM engine for data placement and migration on on-prem hardware, including near-line SAS, Flash, and Spectrum Archive (LTFS). Our customers have expressed interest in an additional off-prem tier for cold data, because their rate of data ingest and their available rack space do not leave them enough time or room to react to immediate capacity needs. To enhance our offering, we've conducted a proof of concept of IBM Spectrum Scale TCT (Transparent Cloud Tiering) in our Innovation Center.

Our PoC environment for IBM Spectrum Scale TCT:

- Power 8 S822
- PowerVM
- IBM V7000
- Red Hat Enterprise Linux 7.3
- IBM Spectrum Scale 4.2.3 Advanced Edition
- Amazon S3 object storage

The goals of the PoC were the following:

- Create a 4-node IBM Spectrum Scale cluster consisting of three NSD servers and one protocol node to be used for TCT.
- Identify and execute the steps needed to define an object cloud tier within IBM Spectrum Scale.
- Manually push file(s) to S3.
- Manually recall file(s) from S3.
- Create an ILM policy to automatically tier data based on access time and file size when lowdiskspace or nodiskspace callback events occur.

An IBM Spectrum Scale 4.2.3 cluster using Transparent Cloud Tiering was configured using the following high-level steps:

1. Install the IBM Spectrum Scale packages via the GUI installer on 3 NSD servers and 1 protocol node (NSD servers on Power 8, protocol server on x86).
2. Create the cluster via the GUI.
3. Define four NSDs of 50GB each.
4. Create the filesystem /s3gpfs (200GB total).
5. Create a Cloudtier nodeclass and add the protocol server to the class.
6. Install the Transparent Cloud Tiering rpm on the protocol server.
7. Enable the cloud gateway, start the TCT services, and assign filesystem /s3gpfs to the TCT configuration.
8. Configure the cloud gateway to authenticate with the Amazon S3 store.
9. Use an ILM policy and callback to migrate data when lowdiskspace and nodiskspace events are triggered.

[Figure: mmlscluster output]

[Figure: mmlsnsd output]

After the cluster was built, gvics3gpfsprot1 was added to a node class named Cloudtier and the TCT server RPM was installed:

Cloud tiering must be enabled on the server performing the TCT function. After running mmchnode --cloudgateway-enable against the Cloudtier nodeclass, our server shows up in the list of cloud-enabled nodes.

Start the cloud tiering service by running mmcloudgateway service start -N Cloudtier. To verify the service is running, execute the following:

At this point, we are ready to define a filesystem for Transparent Cloud Tiering. Only one filesystem can be associated with the TCT node class (Cloudtier). Several of our customers, especially those with SaaS offerings backed by IBM Spectrum Scale, provision multiple GPFS filesystems in a one-per-customer model. Given the single-filesystem limitation, such users may need to get creative in how they design for cloud tiering.

One option would be to define a single GPFS filesystem used solely to copy data for archive (/gpfs/archive, for example). The filesystem would need to be backed by NSDs, and data previously segregated across multiple namespaces would then be mixed together under the same namespace.

Another option would be to create multiple nodeclasses that perform TCT functions. A minimum of two nodes running TCT services is recommended for HA purposes for each filesystem tiering to the cloud. Depending on the number of filesystems you're looking to tier to the cloud, the total number of additional TCT servers could grow quite large. If you are using SMB in combination with any other protocols, you can configure at most 16 protocol nodes. This is a hard limit, and SMB cannot be enabled if there are more protocol nodes. If only NFS and Object are enabled, you can have 32 nodes configured as protocol nodes. There does not appear to be a documented limit on the number of nodes solely running TCT services.

Define the filesystem to the TCT nodeclass and verify using the following commands:

Before defining the S3 account to the configuration, IBM offers a pre-test function that reports IOPS for puts/gets and estimated throughput, and verifies the credentials provided.
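Returning to the node-count reasoning above (two TCT nodes per tiered filesystem, and the 16/32 protocol node ceilings), the arithmetic can be sketched in a few lines of shell. This is a minimal planning aid under stated assumptions: the function names are hypothetical and are not part of IBM Spectrum Scale, and the sketch assumes each tiered filesystem gets its own dedicated nodeclass.

```shell
# Hypothetical planning helpers (not part of IBM Spectrum Scale).

# TCT nodes needed when each tiered filesystem gets its own nodeclass
# with the recommended minimum of two nodes for HA.
tct_nodes_needed() {
  echo $(( $1 * 2 ))
}

# Protocol node ceiling quoted in the text:
# 16 when SMB is enabled, 32 when only NFS and Object are enabled.
protocol_node_limit() {
  if [ "$1" = "smb" ]; then echo 16; else echo 32; fi
}

tct_nodes_needed 10       # ten per-customer filesystems -> prints 20
protocol_node_limit smb   # prints 16
```

With ten per-customer filesystems, the second option already calls for twenty TCT nodes, which is why the single archive filesystem may be the more practical design for some environments.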

This is a good opportunity to compare your expected WAN throughput against the throughput numbers provided below. Testing described in later sections of this document shows this estimate to be accurate in terms of throughput:

Define the AWS account to be used for TCT and verify:

There is no need to manually create any S3 buckets. Within the AWS console, the following S3 buckets appear, having been created automatically:

Now that the AWS/S3 account and buckets have been created, we can proceed with migrating files to the cloud. In the following screenshots, a 1GB file named 0.txt is migrated to S3 and its state changes from Resident (exists locally) to Non-Resident (exists only in S3). The same file is then recalled, and its state changes to Co-Resident (exists locally AND in S3). Included are throughput stats to verify the estimated values provided during the authentication pre-test.
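To make the comparison between expected WAN throughput and the pre-test estimate concrete, a rough calculation along the following lines can help. This is only a sketch: the 400 Mbit/s link speed is a hypothetical figure rather than a measurement from this PoC, the function names are illustrative, and integer shell arithmetic ignores protocol overhead.

```shell
# Convert a WAN link speed in Mbit/s to MB/s (8 bits per byte, integer math).
wan_mb_per_s() {
  echo $(( $1 / 8 ))
}

# Seconds to move SIZE_MB at RATE_MB_PER_S, rounded up.
# RATE_MB_PER_S must be greater than zero.
transfer_seconds() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

rate=$(wan_mb_per_s 400)        # hypothetical 400 Mbit/s link -> 50 MB/s
transfer_seconds 1024 "$rate"   # 1GB (1024 MB) file -> prints 21
```

If the migrate or recall of a 1GB file takes far longer than such an estimate, the gap between the WAN rate and the pre-test figure is worth investigating before relying on automated tiering.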

Once recall is completed:

When selecting multiple files to migrate to the cloud in a filesystem consisting of several directories and subdirectories, it is important to note that wildcards cannot be provided. To migrate all files under a directory (for example, /s3gpfs/test), including subdirectories, the following usage is required:

find /s3gpfs/test -type f -exec mmcloudgateway files migrate {} +

To change a file from a Co-Resident state to a Resident state, a policy included under /opt/ibm/mcstore/samples named coresidenttoresident.template can be applied using the mmapplypolicy command. The policy calls a policy helper function that invokes an mcstore binary not intended for users to execute. It removes the extended attributes of the file and puts it in a Resident state.
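Before migrating a whole directory tree, it can be useful to preview exactly which files find would hand to the migrate command. The following is a minimal dry-run sketch: migrate_tree and the MIGRATE_CMD variable are hypothetical names introduced here for illustration, and by default the command is only echoed, so nothing is tiered.

```shell
# Dry-run sketch: show what would be passed to the migrate command.
# MIGRATE_CMD is a hypothetical override; by default the command is only echoed.
migrate_tree() {
  # -type f skips directories; "{} +" batches many paths per invocation.
  # MIGRATE_CMD is left unquoted on purpose so it splits into command and arguments.
  find "$1" -type f -exec ${MIGRATE_CMD:-echo mmcloudgateway files migrate} {} +
}

# Preview:  migrate_tree /s3gpfs/test
# Execute:  MIGRATE_CMD="mmcloudgateway files migrate" migrate_tree /s3gpfs/test
```

Because find passes each path as its own argument, file names containing spaces are handled without extra quoting.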

During the mmapplypolicy run, the following message is posted, indicating that mmcloudgateway files reconcile must be executed so that the changed attribute of the file is updated in the local cloud directory database.

In the next example, a default policy and callback are defined to perform the following functions:

- Exclude the contents of the internal TCT configuration from being eligible for migration to the cloud (in our example, /s3gpfs/.mcstore and /s3gpfs/.mcstore.bak).
- Define attributes for the access age of a file, its size in MB, and a weight expression based on the age and size of a file.
- Define an external pool for the cloud tier.
- Define a migration rule to migrate eligible files to the cloud tier when the filesystem is 95% full, until local filesystem utilization is back down to 90%.
- Define rules to automatically recall files during user-executed read and write operations.

A callback is configured to execute mmapplypolicy and migrate eligible files to the cloud tier when a lowdiskspace (95% as defined in the policy) or nodiskspace (100%) event is logged.
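The 95%/90% thresholds described above imply a simple calculation of how much data must leave the local tier once migration starts. The sketch below illustrates that arithmetic in plain shell. It is not how the policy engine is implemented; the function names are hypothetical, and the 200GB size is the /s3gpfs filesystem from this PoC.

```shell
HIGH=95   # start migrating at this utilization (%)
LOW=90    # stop once utilization is back down to this (%)

# Succeeds (exit 0) when utilization has reached the high-water mark.
should_migrate() {
  [ "$1" -ge "$HIGH" ]
}

# GB to move off the local tier: gb_to_free FS_SIZE_GB USED_PERCENT
gb_to_free() {
  if [ "$2" -le "$LOW" ]; then
    echo 0
  else
    echo $(( $1 * ($2 - $LOW) / 100 ))
  fi
}

should_migrate 96 && gb_to_free 200 96   # prints 12
```

On the 200GB test filesystem, each percentage point corresponds to 2GB, so a run that starts at 96% needs to move roughly 12GB to reach the 90% mark.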

The following policy was configured using the mmchpolicy command against filesystem s3gpfs:

The following callback was configured using the command:

mmaddcallback thresholdmigration --event lowdiskspace,nodiskspace --command /usr/lpp/mmfs/bin/mmapplypolicy --parms %fsname -N Cloudtier -g /s3gpfs/test/ --single-instance

To test the newly created callback, generate test files until a lowdiskspace event is triggered.

When the filesystem reaches the defined lowdiskspace threshold, the following entries are logged in /var/adm/ras/mmfs.log.latest on the filesystem manager node, and mmapplypolicy is executed as specified in the thresholdmigration callback we defined.

For those using other IBM Spectrum Scale features (AFM, LTFS, HSM, FPO, etc.), it is important to note the following restrictions and limitations as they relate to Transparent Cloud Tiering:

- Spectrum Archive (LTFS) and Transparent Cloud Tiering cannot be configured for the same filesystem. Both solutions are intended to serve the same purpose (archive), so be sure to plan for one up front.
- TCT cannot be used to tier snapshots.
- TCT is not supported in a multi-cluster setup.
- There is a limit of one filesystem per TCT nodeclass.
- TCT cannot be run on AFM gateway nodes or configured for use on AFM or AFM DR filesets.
- There is no mixed cluster support if using Windows or System z nodes. TCT can co-exist in a Linux cluster composed of x86 and Power nodes.
- TCT is not a replacement for a backup or DR solution.
- An average file size of 1MB is recommended when tiering to the cloud. Smaller file sizes are supported, but performance may suffer due to I/O overhead.

THEATSGROUP.COM/COMPANY/CONTACT

About the ATS Group

Since our founding in 2001, the ATS Group has consulted on thousands of system implementations, upgrades, backups and recoveries. We also support customers by providing managed services, performance analysis and capacity planning. With over 60 industry-certified professionals, we support SMBs, Fortune 500 companies, and government agencies. As experts in IBM, VMware, Oracle and other top vendors, we are experienced in virtualization, storage area networks (SANs), high availability, performance tuning, SDS, enterprise backup and other evolving technologies that operate mission-critical systems on premises, in the cloud, or in a hybrid environment.