Accelerate with IBM Storage FlashSystem A9000 / A9000R Technical Update Craig Gordon, Lisa Martinez, Brian Sherman IBM Storage ATS -Washington Systems Center-Storage Steve Solewin Corporate Solutions Architect
Accelerate with IBM Storage Webinars
The Free IBM Storage Technical Webinar Series Continues...
Washington Systems Center Storage experts cover a variety of technical topics.
Audience: Clients who have or are considering acquiring IBM Storage solutions. Business Partners and IBMers are also welcome.
How to sign up? To automatically receive announcements of upcoming Accelerate with IBM Storage webinars, Clients, Business Partners and IBMers are welcome to send an email request to accelerate-join@hursley.ibm.com.
Information, schedules, and archives are located in the Accelerate with IBM Storage blog: https://www.ibm.com/developerworks/mydeveloperworks/blogs/accelerate/?lang=en
Upcoming webinars:
- June 29th: Accelerate with IBM Storage: DS8880 Thin Provisioning
- July 20th: Accelerate with IBM Storage: Spectrum Virtualize Encryption
- July 24th: Accelerate with IBM Storage: Ransomware Protection Using IBM Storage
2
Washington Systems Center Storage
Customer demonstrations, proofs of concept, performance benchmarks
Customer workshops:
- FlashSystem A9000/A9000R, XIV & Spectrum Accelerate Customer Workshop (Chicago, IL, August 30-31)
- DS8000 Advanced Functions Customer Seminar (Bethesda, MD, August 16-17)
3
Agenda
- Overview and what's new
- Internal encryption key option
- HyperSwap
- Demo of HSM 5.2 HyperSwap
4
FlashSystem A9000 / A9000R What's New 5
Spectrum Accelerate Software Defined Storage (SDS)
Intelligent, feature-rich and field-proven storage software
Specifically for A9000/A9000R:
- Data reduction: pattern removal, deduplication and compression
- Cross-platform management with Hyper-Scale Manager
Deployment models: XIV Gen3 and FlashSystem A9000/A9000R (performance optimized, on premises); storage-rich servers (capacity optimized / software defined, on premises or at a cloud provider, client choice of x86 hardware); converged / hyperconverged infrastructure
Data services:
- Snapshots
- Sync/async replication, 3-site replication (XIV)
- Thin provisioning
- Hyper-Scale Manager, Mobility and Consistency
- Encryption
- Multi-tenancy
- QoS per host, domain, pool and volume
- Data migration
Integrations: VMware, Microsoft Hyper-V (XIV), OpenStack Cinder
Note: Not all features are available in each deployment model
6
FlashSystem A9000 and FlashSystem A9000R

FlashSystem A9000
- 8U, complete offering
- Composed of 3 grid controllers and 1 flash enclosure
- Scalability with Hyper-Scale Manager (HSM)
- Capacity options:

  Flash enclosure      Effective capacity (1)   FlashSystem 900 usable capacity
  Flash Enclosure-60    60 TB                   12 TB
  Flash Enclosure-150  150 TB                   29 TB
  Flash Enclosure-300  300 TB                   57 TB

FlashSystem A9000R
- Integrated 42U rack offering
- Composed of 2-6 grid elements
- Additional scalability with HSM: up to 144 Spectrum Accelerate instances
- Capacity options:

                    29 TB FS900                      57 TB FS900
                    Usable   Effective capacity (1)  Usable   Effective capacity (1)
  2 Grid Elements    58 TB    300 TB                 114 TB    600 TB
  3 Grid Elements    87 TB    450 TB                 171 TB    900 TB
  4 Grid Elements   116 TB    600 TB                 228 TB   1200 TB
  5 Grid Elements   145 TB    750 TB                 285 TB   1500 TB
  6 Grid Elements   174 TB    900 TB                 342 TB   1800 TB

(1) Assuming up to a 5.26-to-1 data reduction ratio
7
CVD available for A9000 in a VersaStack
Converged infrastructure: a software-defined method of deploying a storage infrastructure
- Field-proven enterprise storage software
- Repurpose x86 infrastructure at will
- Co-locate compute and storage for a converged solution
- Scale-out private cloud with full enterprise storage capability
All-flash converged infrastructure with VersaStack, from entry to cloud service providers: Storwize V5030F, Storwize V7000F, SVC, FlashSystem V9000, FlashSystem A9000
www.ibm.com/versastack
www.cisco.com/go/versastack
http://www.cisco.com/c/en/us/solutions/data-center-virtualization/versastack-solution-cisco-ibm/index.html#~cvds
8
IBM Epic Relationship Update
Epic
- Epic makes software for mid-size and large medical groups, hospitals and integrated healthcare organizations, working with customers that include community hospitals, academic facilities, children's organizations, safety-net providers and multi-hospital systems
- Epic develops all applications in house, based on Caché from InterSystems
Epic and IBM maintain a close technical relationship
- Thorough understanding of each other's products: what works, what doesn't
- Epic uses IBM Benchmark Centers or on-site IBM hardware to collect performance data
- Continual exchange of information via regular meetings, roadmaps, etc.
- Deep collaboration in resolution of customer issues
- IBM has a Best Practices Guide for Epic installations
Epic is vendor agnostic and does not certify or support vendor hardware technology
As of June 5, 2017, Epic included the FlashSystem A9000R in the SpaTS (Storage Performance and Technology Sales Guide), produced by Epic. This means that the A9000R is now an accepted storage platform by Epic for broad use by Epic's customers.
9
Orchestration made easy
- REST API
- Python library
- CLI
- OpenStack Cinder
- VMware and Microsoft integration
- IBM Spectrum Control Base
10
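As a flavor of REST-based orchestration, the sketch below builds an authenticated request for a volume inventory using only the Python standard library. The endpoint path (`/api/v1/volumes`) and the `X-Auth-Token` header name are illustrative assumptions; consult the Hyper-Scale Manager REST API reference for the actual resource URIs and authentication scheme.

```python
import json
import urllib.request

def build_volume_request(base_url: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the volume inventory.

    NOTE: the endpoint path and header names here are hypothetical stand-ins
    for whatever the real management REST API defines.
    """
    req = urllib.request.Request(base_url.rstrip("/") + "/api/v1/volumes")
    req.add_header("X-Auth-Token", token)
    req.add_header("Accept", "application/json")
    return req

def list_volumes(base_url: str, token: str) -> list:
    """Fetch and decode the volume list from the (assumed) JSON response."""
    with urllib.request.urlopen(build_volume_request(base_url, token)) as resp:
        return json.load(resp)["data"]
```

The same inventory could equally be scripted through the CLI or the Python library; the REST route is shown because it needs no extra packages.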
IBM Flash Portfolio and VMware Integration
VMware features across the portfolio (DS8000, XIV, FlashSystem A9000R, FlashSystem A9000, FlashSystem V9000, Storwize V7000F, Storwize V5030F, FlashSystem 900):
- VMware support: VMware Web Client (VWC); SRA for Site Recovery Manager (storage-based replication)
- VAAI (hardware accelerated): Block Zero, Full Copy (SCSI XCOPY), ATS, Unmap (planned on some platforms)
- VASA 2.0 VVol (planned on some platforms)
- vRealize Suite (planned on some platforms)
11
A9000 / A9000R What's New
R12.0.3
- Non-disruptive addition of grid elements (MES) for FlashSystem A9000R
- Quality and performance enhancements
- Introduced Projected System Capacity in HSM 5.1.1
R12.1
- HyperSwap support
- Encryption with either local key or external key
- Quality and performance improvements
- HSM 5.2 enhancements
R12.1 requires Hyper-Scale Manager R5.2+ and Spectrum Control Base Edition R3.2.0+ for VMware integration
Knowledge Center links:
A9000: http://www.ibm.com/support/knowledgecenter/stjkmm/landing/ibm_flashsystem_a9000_welcome_page.html
A9000R: http://www.ibm.com/support/knowledgecenter/stjkn5/landing/ibm_flashsystem_a9000r_welcome_page.html
12
Hyper-Scale Manager R5.2 What's New
- HyperSwap: GUI support for setup, monitoring and managing
- Command logging enhanced: use the GUI and log the CLI commands executed, making it easier to write your automation scripts
- Port statistics: monitor performance of any host or system port individually
- Right-click is here: get faster access to any action
- Efficiency displays improved: changes to Projected System Capacity; volume data reduction values displayed
- Mapping enhancements: initialize mapping from your Volume, Host or Cluster views; select multiple volumes and add them to a Consistency Group
IBM Hyper-Scale Manager downloads:
A9000 HSM R5.2: https://ibm.co/2swt4dq
A9000R HSM R5.2: https://ibm.co/2rt42rf
XIV HSM R5.2: https://ibm.co/2scaqf2
13
FlashSystem A9000 / A9000R Encryption 14
A9000 / A9000R Data at Rest Encryption (pre R12.1)
- Data is encrypted by each MicroLatency module using AES-256 in XTS mode; encryption happens after data reduction occurs
- Vault devices are also encrypted, accomplished inside each grid controller
- Hot encryption activation and re-key
- External key manager required:
  - SKLM 2.6+
  - Other key managers that support KMIP: KeySecure, Thales; other KMIP-compliant key managers available via RPQ
- All encryption key transactions are managed by the key manager and the FlashSystem A9000/R system
http://www.redbooks.ibm.com/redpapers/pdfs/redp5402.pdf
15
A9000 / A9000R Data at Rest Encryption (R12.1)
- Data is encrypted by each MicroLatency module using AES-256 in XTS mode; encryption happens after data reduction occurs
- Vault devices are also encrypted, accomplished inside each grid controller
- Hot encryption activation and re-key
- Option for internal or external key manager
- With internal key management:
  - Eliminates the need for purchasing, deploying, and managing a dedicated, independent key management system
  - A new Hardware Wrapper Key (HWK) and Random Wrapper Key (RWK) replace the External Key (ESK)
  - Keys are stored on local boot devices on each controller rather than in the external key manager
- The recommendation is to utilize an external key manager: best security practice is to separate the encrypted data from the encryption key
16
Conversion between key management methods
Conversion from external key management to local key management
- Conversion is effectively similar to rekeying
- A Random Wrapper Key (RWK) is created (replacing the External Key (ESK)) and is used to encrypt the Primary Key (XMK)
- The XMK copy that was encrypted with the no-longer-relevant ESK is deleted
- A new command was defined: encrypt_change_key_scheme
Conversion from local key management to external key management
- Requires encryption to be first disabled and then enabled again (encrypt_disable and then encrypt_enable)
- This means that all volumes must be deleted
17
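The external-to-local conversion amounts to re-wrapping the primary key under a new wrapper. The toy model below mirrors that idea: a fresh Random Wrapper Key re-encrypts the XMK, and the ESK-wrapped copy is simply discarded. XOR stands in for real key wrapping, and everything beyond the key names on the slide (XMK, ESK, RWK) is an illustrative assumption, not the product's actual cryptography.

```python
import secrets

def xor_wrap(key: bytes, wrapper: bytes) -> bytes:
    """Toy symmetric wrap/unwrap: XOR the key with a wrapper key of the
    same length.  Applying it twice with the same wrapper recovers the key."""
    return bytes(a ^ b for a, b in zip(key, wrapper))

def change_key_scheme(xmk: bytes) -> tuple:
    """Model of the external -> local conversion: generate a new Random
    Wrapper Key (RWK) and re-wrap the Primary Key (XMK) with it.  The old
    ESK-wrapped copy is not returned, i.e. it is deleted."""
    rwk = secrets.token_bytes(len(xmk))
    return rwk, xor_wrap(xmk, rwk)
```

The reverse direction has no such shortcut in the product: encryption must be disabled and re-enabled, which is why all volumes must first be deleted.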
Hyper-Scale Manager R5.2 Additional Information 18
HyperSwap 28
HyperSwap What is HyperSwap Key components of the A9000/A9000R HyperSwap solution HyperSwap details 29
What is HyperSwap
- IBM has long understood the critical requirement to always have access to data, and first introduced HyperSwap technology on IBM Storage for zSeries in 2002
- With the introduction of HyperSwap to the A9000/A9000R family, IBM now provides a continuous-availability storage system design across the entire IBM block storage portfolio
- HyperSwap on the A9000/A9000R is an integrated software feature that allows automatic and non-disruptive failover across synchronous mirroring distances between active-active devices, with the assistance of a network-attached quorum witness
- Based on A9000/A9000R synchronous mirroring
- The host connects to a HyperSwap volume, which actually resides on both systems
- HyperSwap relies on Asymmetric Logical Unit Access (ALUA) support to inform the host about the optimized paths to the storage system and minimize I/O latency
30
Key components
- Two IBM FlashSystem A9000 and/or A9000R systems on 12.1 code or higher, interconnected for synchronous replication via Fibre Channel
  - Code GA date: 9 June
  - No restriction based on machine type
  - Synchronous mirroring distances between systems (up to 100 km)
- Quorum witness (QW) at a third site
  - Free download from Fix Central
  - At least one separate failure domain
  - A highly available virtual machine is strongly suggested; it could be in the cloud, as only IP connectivity is required
- Host(s) with Fibre Channel or iSCSI connectivity
  - Appropriate code and patch levels, with multipath drivers (see SSIC)
  - HAK is required (version 2.8 or greater); a portable version of the HAK is available
31
Simple configuration (diagram): Host A connects to the Primary in data center A and the Secondary in data center B; the Quorum Witness runs at site C
32
HyperSwap relationships
- Every HyperSwap volume or Consistency Group resides on two peer systems that are connected via Fibre Channel
- A HyperSwap volume exists on both systems as an active-active pair of volumes
  - One of those volumes is initially designated as Primary and the other as Secondary
  - The purpose of the Primary/Secondary designation is to optimize latency: the Primary volume should be set at the protection domain that generates most of the I/O
- Volumes can be resized while in a HyperSwap relationship
- The actual roles performed by the two peers at any given moment may differ from their designations, as a result of a manual role change or an automatic failover
- Multiple HyperSwap volumes can exist on each pair of peer systems
- An individual volume can be in a synchronous, asynchronous, or HyperSwap relationship
- Existing synchronous mirrors can be converted to HyperSwap, and HyperSwap relationships can be non-disruptively converted back to synchronous mirroring
33
HyperSwap Consistency Groups
To be eligible for addition to a highly available Consistency Group, a volume must conform to the following requirements:
- Volume state matches the Consistency Group state
- Volume is highly available
- Volume and the Consistency Group are associated with the same Quorum Witness
- Peer volume and the peer Consistency Group match the original volume and Consistency Group
Any change to the Consistency Group, whether automatic or manual, affects all the volumes it contains
Automatic failover is carried out only if all the volumes in the Consistency Group are available for the automatic failover
34
Specifications
- Number of targets: 10
- Number of relations: 1536 (similar to synchronous mirroring); 1536 is the total number of HyperSwap, sync and async relations
- One active quorum witness allowed per system
- Number of systems per QW: up to 40 systems are certified for 12.1
35
HyperSwap minimal prerequisites
- Two A9000/A9000R systems with manufacturing-issued PKCS12 certificates installed
- 4/8/16 Gb FC infrastructure for mirror connectivity
- At least two FC paths for each machine (target & initiator) for mirror connectivity
- Network: all management ports (three per machine; total of six) configured and connected
- TCP ports 8460, 8461 and 7778 must be open
36
Quorum Witness requirements
- Dual vCPU, 4 GB memory, 40 GB HDD space; will support 20 A9000/A9000R systems
- Red Hat 6.x/7.x or CentOS 6.x/7.x
- A highly available physical host or virtual machine (VM) is strongly suggested
37
Active-active configuration
- A single volume SCSI identity is presented through both storage systems, keeping size, lock state and reservations; the same information is presented by the peer volumes
- Volumes must have the same name
- The host connects to the volume through both systems using multipathing, as if connecting to a single volume
- SCSI ALUA (Asymmetric Logical Unit Access) is required to present port group states to the host and signal which paths should be used
- The storage systems use the SCSI Unit Attention mechanism to notify the host of any changes in port group states
- The host experiences a failover as a change in the state of its paths, with no impact to applications and no manual intervention
38
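The path-state behavior above can be sketched as a tiny selection routine, modeling how a host multipath driver might treat ALUA states for a HyperSwap volume. The state names mirror the standard ALUA access states; the path identifiers and the fallback rule are illustrative assumptions, not taken from any specific multipath driver.

```python
# SCSI ALUA access states relevant to a HyperSwap volume.
ACTIVE_OPTIMIZED = "active/optimized"
ACTIVE_NONOPTIMIZED = "active/non-optimized"
UNAVAILABLE = "unavailable"

def select_paths(path_states: dict) -> list:
    """Return the paths I/O should use: all active/optimized paths if any
    exist, otherwise fall back to the active/non-optimized paths (the ones
    leading to the peer system)."""
    chosen = [p for p, s in path_states.items() if s == ACTIVE_OPTIMIZED]
    if not chosen:
        chosen = [p for p, s in path_states.items() if s == ACTIVE_NONOPTIMIZED]
    return chosen
```

After a failover, the systems raise a Unit Attention, the driver re-reads the port group states, and the same selection logic now picks the formerly non-optimized paths; the application never sees more than a path-state change.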
Active-active I/O
- All write I/Os are handled by the Primary storage system
- Writes received on the Secondary system are redirected to the Primary system for handling
- Reads received on the Secondary system are handled locally
- When HA is activated, the Secondary system can perform I/O operations. At this point, there may not yet be a synchronized data copy, so read requests are redirected to the Primary system until the data is synchronized
39
Normal configuration (diagram)
- Hosts use active/optimized paths to send I/O to the volume
- Hosts can use non-optimized paths, but I/O on them is expected only rarely
- The Primary replicates the data to the Secondary
- The Secondary sends writes received on Secondary ports to the Primary
Host A and Host B connect to the Primary in data center A and the Secondary in data center B; the Quorum Witness runs at site C
40
Demonstrating HA logic: Primary failure (diagram)
1. Primary fails: all Primary preferred paths are down, the mirror connection fails, and the QW connection fails
2. Automatic change_role: the Secondary becomes Primary, and non-preferred paths are now preferred
3. Host A will have high latency (if data center B is remote)
4. HA is disabled
41
Demonstrating HA logic: mirror failure (diagram)
1. Mirror connection fails
2. Secondary becomes unavailable
3. Non-preferred paths are now unavailable
4. HA is disabled
42
HA logic: Primary QW & mirror connection failures (diagram)
1. Primary to Quorum Witness connection fails
2. Mirror connection fails
3. Primary becomes unavailable: all Primary preferred paths are down
4. Secondary becomes Primary (change_role); non-preferred paths are now preferred
5. HA is disabled
6. Host A has high latency (if data center B is remote)
43
HA logic: host failure (diagram)
1. Host A fails
2. VM1 and VM2 are recovered on Host B
3. Host B has high latency working on the Primary
4. HA remains enabled
5. In case of a long-term situation, consider doing switch_role to improve performance
44
Integrated VMware SRM solution with the IBM Spectrum Accelerate family Storage Replication Adapter (SRA)
- HyperSwap and SRM stretched cluster: non-disruptive workload mobility
- vCenter A / SRM A / IBM A9000 SRA / ESXi 1 at the protected (primary) site; vCenter B / SRM B / IBM A9000 SRA / ESXi 2 at the recovery (secondary) site
- Virtual machines fail over to the recovery site and fail back to the protected site
- SAN / iSCSI networks at each site; remote mirroring through FC SAN between the IBM A9000/A9000R systems; Quorum Witness at site C
When certified, the package will be available for download from http://www.vmware.com/go/download-srm
45
Snapshot support
1. Local snapshots
- Snapshot operations are carried out locally at the relevant peer system
- Snapshot names are local
- A snapshot can be created on a Secondary volume/CG even if the HyperSwap relation is unsynchronized
2. HyperSwap snapshots
- A single command creates snapshots at the Primary and Secondary containing the same point-in-time
- The snapshots can have the same user-assigned name
- Once created, the snapshots are independent
3. HyperSwap snapshots are not stretch snapshots
- The snapshot volumes' SCSI identities are different
- If one of the snapshots is unlocked, changes are not updated on the other snapshot
- Implication: any applications that use snapshots, e.g. VSS, will not recover automatically after failover
46
More points
1. Distance between systems affects the expected latency; the recommendation is equivalent to sync mirror (up to 100 km)
- The normal I/O path of HyperSwap is equivalent to the sync mirror I/O path. Only under some failures, such as loss of host connectivity to the Primary, should there be I/O through the Secondary peer
- When the I/O path goes through the Secondary, the expected latency is almost double that of sync mirror
2. Connectivity between the storage systems is FC
3. QoS is defined separately in each system according to the local system resources
- Note that QoS limits host I/O, and therefore the Secondary QoS will not limit the Primary performance
47
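The "almost double" claim follows from counting inter-site round trips, which the back-of-the-envelope model below makes explicit. It assumes the host is local to the system it writes to, takes `link_ms` as the one-way inter-site latency, and ignores local service times; all of that is simplification for illustration, not a performance specification.

```python
def write_latency_ms(link_ms: float, via_secondary: bool = False) -> float:
    """Rough write-latency model for a HyperSwap volume.

    Normal path: the Primary syncs the write to the Secondary and waits for
    the ack, i.e. one round trip over the inter-site link.
    Via the Secondary: the write data is first forwarded to the Primary and
    acknowledged back (one extra round trip) before the normal sync happens.
    """
    normal = 2 * link_ms
    if not via_secondary:
        return normal
    return 2 * link_ms + normal
```

With a 1 ms one-way link, the normal path costs about 2 ms of link latency while the Secondary path costs about 4 ms, matching the "almost double" rule of thumb above.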
Writes received on the Secondary (diagram: Host A, Primary in data center A, Secondary in data center B)
1. Host writes to the Secondary
2. Secondary sends the write data to the Primary
3. Primary syncs the write to the Secondary
4. Secondary sends an ack to the Primary
5. Primary sends an ack to the Secondary
6. Secondary sends an ack to the host
48
Quorum Witness (QW)
1. The QW is an application which should be installed in a separate site, or more precisely a separate failure domain
2. QW use:
- The QW assists in determining that a peer has failed, using keepalive messages sent by the systems
- The QW is used as a tie-breaker when the two peer systems both believe they should be the Primary for the relation
- It is required in order to determine which system should own the Primary volume and which should own the Secondary volume upon failure, to avoid split-brain scenarios
3. The QW is critical for automatic recovery actions; without it, automatic failover is not carried out
4. HA of the Quorum Witness application itself can be achieved by installing it in a highly available VM
49
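The keepalive and tie-breaker behavior above can be sketched as a small model. The timeout value, class and method names are assumptions chosen for illustration; the real protocol between the systems and the QW is internal to the product.

```python
KEEPALIVE_TIMEOUT = 5.0  # seconds; illustrative, not the product's value

class QuorumWitnessModel:
    """Toy model of the tie-breaker role a Quorum Witness plays."""

    def __init__(self):
        self.last_seen = {}  # system name -> time of last keepalive

    def keepalive(self, system: str, now: float) -> None:
        """Record a keepalive message received from a system."""
        self.last_seen[system] = now

    def alive(self, system: str, now: float) -> bool:
        """A system is considered alive if it sent a keepalive recently."""
        seen = self.last_seen.get(system)
        return seen is not None and now - seen <= KEEPALIVE_TIMEOUT

    def pick_primary(self, primary: str, secondary: str, now: float):
        """Break the tie: keep the designated Primary if it is still alive,
        otherwise fail over to the Secondary.  With neither alive, no
        automatic recovery is possible."""
        if self.alive(primary, now):
            return primary
        if self.alive(secondary, now):
            return secondary
        return None
```

The model also illustrates point 3: if the QW itself is unreachable, neither system can win the tie-break, so no automatic failover takes place.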
HyperSwap and the HAK
- HAK version 2.8.0 is required for HyperSwap
- Run xiv_devlist -o all to show the new HyperSwap fields
50
HyperSwap xcli commands 51
HyperSwap properly configured: shown in GUI 52
Questions 54
References
- Redbook: HyperSwap draft out soon
- Andrew Greenfield HSM videos:
  - IBM Hyper Scale Manager v5.2 Getting Started
  - IBM Hyper Scale Manager v5.2 HyperSwap and Mirroring
  - IBM-XIV Storage Next YouTube Channel
55