AUTOMATING IBM SPECTRUM SCALE CLUSTER BUILDS IN AWS PROOF OF CONCEPT


By Joshua Kwedar, Sr. Systems Engineer
By Steve Horan, Cloud Architect
ATS Innovation Center, Malvern, PA
Dates: October to December 2017

INTRODUCTION

As an IBM premier business partner, we are always looking for creative ways to meet our customers' needs. The demand for low-cost, quickly deployed solutions is ever increasing in today's cloud-first climate. Many of our customers have invested a significant amount of money in HPC solutions with node counts that can reach into the thousands, and now feel locked in to a specific vendor with no quick way to grow or shrink elastically with the demand of their workloads.

In terms of hybrid growth, we've researched IBM Spectrum Scale's Transparent Cloud Tiering feature, which allows an S3 object store to serve as a storage tier within the same Scale namespace. While customers see the value in growing their storage footprint on demand, a subset of them view this feature as a stopgap on the way to their ultimate goal of moving their storage as well as their compute into the cloud. We have built Scale clusters across multiple cloud providers (Google Cloud Platform, Amazon Web Services), but have focused on AWS since IBM's Spectrum Scale on AWS Quick Start trial evaluation was released in September 2017.

THE GOAL OF THE POC WAS THE FOLLOWING:

- Review the AWS / Spectrum Scale architecture layout.
- Create a Spectrum Scale cluster using IBM's Quick Start guide.
- Review CloudFormation templates and scripts to better understand the steps needed to automate a cluster build in AWS.
- Execute failure scenarios and observe behavior.
- Determine any restrictions in the trial release.
- Improve CloudFormation templates, scripts, and restrictions based on observations.
- Create customized AMIs with updated Spectrum Scale versions.

A Spectrum Scale cluster deployed in AWS consists of the following components:

1. A VPC (Virtual Private Cloud) is defined within AWS. All instances exist in the VPC. Deployment templates allow a cluster to be deployed within an existing VPC, or for a new VPC to be created. The VPC defaults to span two availability zones.
2. A single Bastion host is created within a single availability zone in a public subnet. Think of a Bastion host as a jump server or an admin server. It serves no function within the Spectrum Scale cluster other than providing a means to SSH into cluster nodes from outside of the AWS VPC.
   a. Note that one of the Bastion hosts is greyed out in the architecture diagram. The Bastion stack has autoscaling configured by default to ensure there is always at least one Bastion host up and running. Without an accessible Bastion host, you would be unable to SSH to any cluster nodes.
3. NAT gateways are defined in the public subnet (Bastion stack) to allow outbound internet access for nodes in the private subnet (Server and Compute instances).
4. An AWS Identity and Access Management (IAM) role and security groups are automatically created to allow ports for SSH and the Spectrum Scale daemon.

5. The IBM Quick Start allows users to specify the EC2 instance types and quantities for the Server (NSD Server) and Compute nodes to be configured as part of the cluster.
   a. NSD Servers
      i. Minimum = 2
      ii. Maximum = 64
   b. Compute nodes
      i. Minimum = 1
      ii. Maximum = 64
6. One 100GB disk is allocated per EC2 instance to be used as the root volume. Users can specify the size, quantity, and type of EBS volumes to be used as NSDs (Network Shared Disks). The template currently supports disk sizes from 10GB to 16384GB. EBS volume types for NSD use can only be gp2 (general purpose), io1 (high performance SSD) or standard (HDD). If a user specifies 2 NSD servers and 1 compute node, a 5GB EBS volume is allocated to the compute node to account for quorum.
7. A filesystem name, block size (all supported Spectrum Scale block sizes can be specified) and number of replicas (max 2) must be provided within the template. The filesystem default number of replicas and NSD failure group definitions are automatically configured based on user inputs.
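The limits listed above lend themselves to a quick pre-flight check before a template is submitted. The following is a minimal sketch, not part of IBM's Quick Start: the class and field names are hypothetical, and only the numeric ranges and volume types are taken from the text.

```python
from dataclasses import dataclass
from typing import List

# Limits quoted in the component list above. All names below are
# hypothetical and do not correspond to actual template parameters.
NSD_SERVER_RANGE = (2, 64)
COMPUTE_NODE_RANGE = (1, 64)
EBS_SIZE_GB_RANGE = (10, 16384)
EBS_VOLUME_TYPES = {"gp2", "io1", "standard"}
MAX_REPLICAS = 2


@dataclass
class ClusterRequest:
    nsd_servers: int
    compute_nodes: int
    ebs_size_gb: int
    ebs_type: str
    replicas: int


def validate(req: ClusterRequest) -> List[str]:
    """Return a list of problems; an empty list means the request fits the limits."""
    problems = []
    if not NSD_SERVER_RANGE[0] <= req.nsd_servers <= NSD_SERVER_RANGE[1]:
        problems.append("NSD server count must be between 2 and 64")
    if not COMPUTE_NODE_RANGE[0] <= req.compute_nodes <= COMPUTE_NODE_RANGE[1]:
        problems.append("compute node count must be between 1 and 64")
    if not EBS_SIZE_GB_RANGE[0] <= req.ebs_size_gb <= EBS_SIZE_GB_RANGE[1]:
        problems.append("EBS volume size must be between 10GB and 16384GB")
    if req.ebs_type not in EBS_VOLUME_TYPES:
        problems.append("EBS volume type must be gp2, io1 or standard")
    if not 1 <= req.replicas <= MAX_REPLICAS:
        problems.append("replicas must be 1 or 2")
    return problems


if __name__ == "__main__":
    # Mirrors the small test cluster used later in this document.
    request = ClusterRequest(nsd_servers=2, compute_nodes=2,
                             ebs_size_gb=10, ebs_type="gp2", replicas=2)
    print(validate(request) or "request is within the template limits")
```

Returning a list of problems rather than raising on the first one lets a user correct every out-of-range input in a single pass.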

Notes Regarding Architecture

- The template creates a synchronous, highly available Spectrum Scale cluster across two availability zones, but does not account for third-site quorum. With this architecture, one site always holds the majority of quorum nodes when an odd number of quorum nodes is specified.
- Only a single EBS volume type can be specified.
- Spectrum Scale allows users to separate metadata from data, and metadata is often placed on faster volumes (io1) for quick response during metadata lookups. The cloud template places metadata and data on the same volumes and sets the maximum/default replicas to 1 or 2 based on user input.
- The maximum number of replicas for Spectrum Scale filesystems is 3. The template only allows 2, as only two availability zones can be specified during cluster creation.
- Autoscaling groups are created for the Bastion, Server and Compute stacks, but need additional configuration to take any action beyond maintaining the minimum number of nodes in each stack. For example, a CPU utilization threshold needs to be defined by the user.
- There is no input for the GPFS cluster name.
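The quorum note can be made concrete with a little arithmetic. The sketch below models node quorum as a simple majority of quorum nodes and ignores tiebreaker disks. It is an illustration under that assumption, and the zone labels are hypothetical.

```python
from typing import Dict


def quorum_survives_zone_loss(quorum_nodes_per_zone: Dict[str, int]) -> Dict[str, bool]:
    """For each zone, report whether a node-quorum majority remains if that zone is lost.

    Assumes simple majority quorum: more than half of all quorum nodes must be up.
    """
    total = sum(quorum_nodes_per_zone.values())
    needed = total // 2 + 1
    return {
        zone: (total - count) >= needed
        for zone, count in quorum_nodes_per_zone.items()
    }


if __name__ == "__main__":
    # Two zones, three quorum nodes: losing the zone that holds two is fatal.
    print(quorum_survives_zone_loss({"zone-a": 2, "zone-b": 1}))
    # A third site holding one quorum node lets the cluster survive any single loss.
    print(quorum_survives_zone_loss({"zone-a": 1, "zone-b": 1, "zone-c": 1}))
```

With only two zones and an odd quorum count, exactly one zone is always a single point of failure for quorum, which is why a third site is worth considering.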

The Following Inputs Were Provided to The Template in Order to Create a Test Cluster:

- A filesystem block size of 16M is selected. Allowable values are: 256k, 512k, 1M, 2M, 4M, 8M, 16M.
- The minimum number of NSD Servers (two) and two Compute nodes are selected for testing purposes.

VPC, Private and Public CIDR block entries in the Network section of the form are prepopulated but can be changed if desired. A user must select at least 2 availability zones and define an External CIDR block. For the purposes of testing, we specified 0.0.0.0/0 to allow all public traffic. In an actual implementation, you would specify a corporate network CIDR block range.

A key pair name, S3 bucket and operator email must be supplied. Other values can be modified but are prepopulated. While the Bastion instance type can be changed, it is simply a jump/admin server and does not need to be configured with any substantial amount of resources.
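Because 0.0.0.0/0 is convenient for testing but inappropriate for production, it can be worth guarding against it in any wrapper that submits the template. A minimal sketch using Python's standard `ipaddress` module follows. The function name is hypothetical, and 203.0.113.0/24 is a documentation-only range standing in for a corporate network.

```python
import ipaddress


def is_open_to_world(cidr: str) -> bool:
    """Return True if the CIDR block covers every address (prefix length 0).

    Raises ValueError if the string is not a valid network, for example
    when host bits are set.
    """
    network = ipaddress.ip_network(cidr, strict=True)
    return network.prefixlen == 0


if __name__ == "__main__":
    for candidate in ("0.0.0.0/0", "203.0.113.0/24"):
        verdict = "open to all traffic" if is_open_to_world(candidate) else "restricted"
        print(candidate, "->", verdict)
```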

The remaining options were left at their defaults. After reviewing the user-supplied inputs, the overall stack can be created. The progress of each individual stack, with timestamps, can be monitored within the AWS console.
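Stack progress can also be checked from a script rather than the console. The sketch below is a hypothetical helper. It takes a dictionary shaped like the response of CloudFormation's DescribeStacks call (a "Stacks" list whose entries carry "StackName" and "StackStatus") and reports the stacks that have not yet reached CREATE_COMPLETE. The stack names in the sample are invented for illustration.

```python
from typing import Dict


def unfinished_stacks(describe_stacks_response: dict) -> Dict[str, str]:
    """Map stack name to status for every stack that is not CREATE_COMPLETE."""
    return {
        stack["StackName"]: stack["StackStatus"]
        for stack in describe_stacks_response.get("Stacks", [])
        if stack["StackStatus"] != "CREATE_COMPLETE"
    }


if __name__ == "__main__":
    # In practice the response would come from an AWS SDK call such as
    # boto3.client("cloudformation").describe_stacks(); a literal is used
    # here so the example runs without AWS credentials.
    sample = {
        "Stacks": [
            {"StackName": "example-bastion", "StackStatus": "CREATE_COMPLETE"},
            {"StackName": "example-server", "StackStatus": "CREATE_IN_PROGRESS"},
        ]
    }
    print(unfinished_stacks(sample))
```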

Once the status of every stack is CREATE_COMPLETE, the EC2 instances hosting a fully functioning Spectrum Scale filesystem are accessible. Creation time of each stack depends on the user-supplied inputs. In our example, the entire process took ~10 minutes. Instance/cluster creation time increases as the number of nodes and disks increases.

EC2 instance information can be viewed by accessing the AWS console -> EC2 Instances. In the instance list for our cluster, the Public IP column shows that only the LinuxBastion host has been assigned a public address. The Bastion host can be accessed using the ec2-user account and passing your AWS key to the Bastion IP. From there, you can SSH to any Server/Compute node.

From a cluster node, we can view the output of mmlscluster and mmlsnsd to review the Cluster Name, Repository Type, Node names, Node designation, NSD names, NSD to Filesystem allocations and NSD servers per NSD.

Based on this output, we can determine that one NSD server and one Compute node are placed in the 10.0.1.X private network (Availability Zone us-east-2a) and the other NSD server and Compute node are placed in the 10.0.3.X private network (Availability Zone us-east-2b). Note that two NSD servers and a single compute node are designated as quorum nodes. A single 5GB descOnly disk is allocated to compute node ip-10-0-1-132.us-east-2.compute.internal to serve as a quorum disk. However, if availability zone us-east-2a were to go offline, the filesystem would be inaccessible due to loss of quorum, because two of the three quorum nodes reside in that zone.

The script defines NSDs with a naming convention that includes the availability zone, making it much easier for a user to determine the zone in which an NSD resides. In this example, nsd_2a_1_0 is served out by NSD server ip-10-0-1-208.us-east-2.compute.internal from availability zone us-east-2a.
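The zone token in the NSD name makes it straightforward to group disks by availability zone in a script. The sketch below assumes only that names follow the nsd_<zone>_... pattern seen in nsd_2a_1_0. The meaning of the trailing numbers is not stated in this document and is left uninterpreted, and the second NSD name in the sample is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def zone_token(nsd_name: str) -> str:
    """Extract the zone token (e.g. '2a') from a name such as 'nsd_2a_1_0'."""
    parts = nsd_name.split("_")
    if len(parts) < 2 or parts[0] != "nsd" or not parts[1]:
        raise ValueError(f"unexpected NSD name format: {nsd_name!r}")
    return parts[1]


def group_by_zone(nsd_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group NSD names by the zone token embedded in each name."""
    groups = defaultdict(list)
    for name in nsd_names:
        groups[zone_token(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # "nsd_2b_1_0" is an assumed name for the disk in the second zone.
    print(group_by_zone(["nsd_2a_1_0", "nsd_2b_1_0"]))
```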

Total filesystem size is 20G (2 x 10G disks, one per NSD Server) with a replication factor of 2 for data and metadata, so roughly half of that raw capacity is available for unique data. The filesystem is not created with the maximum number of replicas (3) for data and metadata allowed by Spectrum Scale, since the template limits replicas to 2.

Autoscaling groups can be viewed by navigating to Auto Scaling -> Auto Scaling Groups within the EC2 console. In our example, three autoscaling groups are created, one each for the Bastion, Server and Compute stacks. Each stack has a minimum, desired and maximum definition derived from the user inputs supplied in the CloudFormation template. Other autoscaling rules, tied to CPU/memory utilization for example, can be manually defined.

A basic test of terminating a compute node demonstrates the functionality of the autoscaling groups that were created by the template. A compute node is terminated:

A new compute instance spins up to satisfy the rules of the Compute stack autoscaling group, and the new instance becomes ready for use. We can now SSH to the server and verify that it has just been built and that the AMI containing the Spectrum Scale packages was used to build the instance. However, the GPFS daemon is not running.

Attempts to run mmlscluster on the newly created compute node show that it does not belong to a Spectrum Scale cluster. The same command run on an existing NSD Server node verifies that the new node has not been added to the cluster. The terminated compute node (ip-10-0-3-222.us-east-2.compute.internal) is still a member of the cluster configuration.

At this time, there is no functionality built into the template/stacks to automatically add newly generated instances to, or remove terminated instances from, the cluster configuration. Nodes would need to be manually added and removed using the mmaddnode and mmdelnode commands. In the event of quorum node loss, new quorum nodes would need to be designated. In the event of NSD server loss, steps would need to be taken to reestablish optimal striping of data across newly created NSDs (mmrestripefs). Users may want to automate some of the functions listed above (such as automatically adding new nodes to the cluster), while other administrative tasks (mmrestripefs) may be better run manually during a maintenance window, for example.

With Regard to The Current Version of The AWS Spectrum Scale Trial Cloud Formation Template, The Following Restrictions Exist:

- Protocol support, including the use of Cluster Export Services (CES) nodes and protocol access such as Network File System (NFS), Object, and Server Message Block (SMB).
- Active File Management (AFM).
- Transparent Cloud Tiering (TCT).
- Compression.
- Encryption.
- Data Management API (DMAPI) support, including Hierarchical Storage Management (HSM) to tape.
- Hadoop Distributed File System (HDFS) connector support.
- Multi-cluster support (exporting an IBM Spectrum Scale file system from one Spectrum Scale cluster to another IBM Spectrum Scale cluster).

- GUI.
- User name space management and quota management.
- Snapshots and clones.
- Replication is restricted to only 1X (IBM Spectrum Scale makes a single copy of all data and metadata) and 2X (IBM Spectrum Scale makes two copies of all data and metadata).

Additional Limitations

- Using EBS volume encryption for IBM Spectrum Scale file systems is not supported.
- The archiving and restoring of IBM Spectrum Scale data through the use of AWS services is not supported.

Many of the limitations above are a result of the edition packaged within the AMI (Standard). ATS has created custom CloudFormation templates to reflect many of the design considerations called out in this document (third-site quorum, maximum data/metadata replicas, splitting of metadata/data volumes, among others), in addition to creating our own AMI images using the Advanced/Data Management edition of Spectrum Scale. Features such as encryption, compression, and AFM are included in these editions.

As a next step, we are looking to implement Protocol nodes within AWS, using Active Directory replication as a means of authentication. Implementing an S3 archive tier using Transparent Cloud Tiering would serve as an attractive option for those looking to leverage cheaper storage for archive purposes, all within AWS. We look forward to partnering with our customers to come up with the next generation of Spectrum Scale cluster implementations in the cloud.
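The manual membership maintenance described in the autoscaling test (mmaddnode and mmdelnode after an instance is replaced) is a natural candidate for scripting. The sketch below is one possible starting point, not the ATS implementation. It only compares two host lists and prints the commands an administrator might run, without executing anything. The function name is hypothetical, as is the replacement hostname in the sample.

```python
from typing import Iterable, List


def plan_membership_changes(cluster_nodes: Iterable[str],
                            running_instances: Iterable[str]) -> List[str]:
    """Return mmdelnode/mmaddnode command strings that would reconcile the
    cluster configuration with the instances that are actually running."""
    cluster = set(cluster_nodes)
    running = set(running_instances)
    stale = sorted(cluster - running)    # in the cluster, but terminated
    missing = sorted(running - cluster)  # running, but never joined
    commands = []
    if stale:
        commands.append("mmdelnode -N " + ",".join(stale))
    if missing:
        commands.append("mmaddnode -N " + ",".join(missing))
    return commands


if __name__ == "__main__":
    cluster_nodes = [
        "ip-10-0-1-132.us-east-2.compute.internal",
        "ip-10-0-1-208.us-east-2.compute.internal",
        "ip-10-0-3-222.us-east-2.compute.internal",  # terminated in the test
    ]
    running_instances = [
        "ip-10-0-1-132.us-east-2.compute.internal",
        "ip-10-0-1-208.us-east-2.compute.internal",
        "ip-10-0-3-57.us-east-2.compute.internal",   # hypothetical replacement
    ]
    for command in plan_membership_changes(cluster_nodes, running_instances):
        print(command)
```

Printing a plan instead of running it keeps a human in the loop, which matters when the stale node is a quorum node or an NSD server and extra steps are required. A newly added node would also typically need a license designation (mmchlicense) and a daemon start (mmstartup) before it can mount the filesystem.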

THEATSGROUP.COM/COMPANY/CONTACT

About the ATS Group

Since our founding in 2001, the ATS Group has consulted on thousands of system implementations, upgrades, backups and recoveries. We also support customers by providing managed services, performance analysis and capacity planning. With over 60 industry-certified professionals, we support SMBs, Fortune 500 companies, and government agencies. As experts in IBM, VMware, Oracle and other top vendors, we are experienced in virtualization, storage area networks (SANs), high availability, performance tuning, SDS, enterprise backup and other evolving technologies that operate mission-critical systems on premises, in the cloud, or in a hybrid environment.