UltraPath Technical White Paper


HUAWEI OceanStor Enterprise Unified Storage System
Issue 01
Date 2014-04-02
HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2014. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

Huawei and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China
Website: http://enterprise.huawei.com

Contents

1 List of Abbreviations
2 Multipathing Software
  2.1 Single Points of Failure and Multipathing
  2.2 Overview of Mainstream Multipathing Software
  2.3 Integrating Multipathing Software with Operating Systems
3 Working Principle and Functions of UltraPath
  3.1 Typical Networking Diagrams
    3.1.1 Direct-Connection Networking
    3.1.2 Dual-Switch Connection Networking
  3.2 Working Principle
    3.2.1 Integrating UltraPath with Operating Systems
    3.2.2 How to Identify and Manage Paths
    3.2.3 How to Identify and Manage Path Groups
  3.3 Functions of UltraPath
    3.3.1 Failover and Failback
    3.3.2 Path Test
    3.3.3 Load Balancing
4 Highlights of UltraPath
  4.1.1 All Paths Down Protection
  4.1.2 Isolation of Intermittently Faulty Paths
  4.1.3 Isolation of Links That Have Bit Errors
  4.1.4 Path Exception Alarming
  4.1.5 Automatic Host Registration
  4.1.6 Path Disabling
  4.1.7 Path Performance Monitoring and Status Statistics

1 List of Abbreviations

MPIO (Multipath Input/Output): The native multipath framework of mainstream operating systems. MPIO allows storage vendors to develop multipath solutions by inserting adaptation modules that optimize compatibility with their storage arrays.
PCM (Path Control Module): Adaptation module of the MPIO framework in AIX.
DSM (Device-Specific Module): Adaptation module of the MPIO framework in Windows.
PSA (Pluggable Storage Architecture): Multipath framework of the VMware ESX platform.
MPP (Multipath Plug-In): Multipath plug-in of the VMware ESX platform.
NMP (Native Multipath Plug-In): Native multipath plug-in of the VMware ESX platform. NMP is a scalable multipath architecture that integrates the SATP and PSP plug-ins.
SATP (Storage Array Type Plug-In): Adapts to arrays of different vendors or models and masks the differences between arrays from the NMP.
PSP (Path Selection Plug-In): Implements different path selection algorithms.

2 Multipathing Software

2.1 Single Points of Failure and Multipathing

A single point of failure is a fault at a single point in a network that can cause the entire network to break down. In the following figure, a single point of failure may occur at:

1. The path between the external network and the application server
2. The application server
3. The path between the server and the storage controller
4. The storage controller
5. The path between the storage controller and the disks
6. The disks

Figure 2-1 Single-path network

To prevent single points of failure, high-reliability systems implement redundant backup for devices that may suffer single points of failure and adopt a cross-connection method to achieve optimal reliability. Meanwhile, redundant paths also help achieve high performance, as shown in the following figure.

[Figure: server connected through two switches to both controllers of a storage array]

Multipathing software ensures the reliability of redundant paths. If a path fails or cannot meet performance requirements, multipathing software automatically and transparently delivers I/Os to other available paths to ensure that I/Os are transmitted effectively and reliably. As shown in the following figure, multipathing software can handle host bus adapter (HBA) faults, link faults, and controller faults.

2.2 Overview of Mainstream Multipathing Software

Currently, the multipath solutions provided by storage vendors fall into the following types:

1. Self-developed multipathing software, for example EMC PowerPath, HDS HDLM, and Huawei UltraPath.
2. Storage adaptation plug-ins based on the multipath framework of operating systems, for example from IBM and HP.
3. Multipathing software built into operating systems (generally, A-A arrays or A-A/A arrays that support ALUA can use the OS-native multipathing software).

The following table lists mainstream multipathing software. The mainstream operating systems of x86 servers, midrange computers, and virtualization platforms are Windows and Linux, AIX, and VMware ESX, respectively.

Item | Windows | Linux | AIX | VMware ESX
Multipathing software built into operating systems | MPIO | DM-Multipath | MPIO | NMP
Huawei UltraPath | UltraPath | UltraPath | UltraPath PCM (based on MPIO) | UltraPath
EMC PowerPath | PowerPath | PowerPath | PowerPath | PowerPath
IBM (1) | SDDDSM (based on MPIO); RDAC | DM-Multipath; RDAC | SDDPCM (based on MPIO); RDAC | NMP
HP (2) | SecurePath; DSM (based on MPIO) | SecurePath; HP-DM | SecurePath; HP-PCM (based on MPIO) | NMP
HDS | HDLM | HDLM | HDLM | HDLM
NetApp | DSM (based on MPIO) | DM-Multipath | MPIO | NMP
Veritas | DMP | DMP | DMP | DMP

(1) RDAC is gradually being replaced by SDD (an operating system framework plug-in).
(2) SecurePath is gradually being replaced by operating system framework plug-ins.

Multipathing software built into operating systems (often called MPIO) supports failover and load balancing and can cope with scenarios that have moderate reliability requirements. Multipathing software developed by storage vendors is more specialized and delivers better reliability, performance, maintainability, and storage adaptation.

2.3 Integrating Multipathing Software with Operating Systems

Multipathing software is filter driver software running in host kernel mode. It can intercept and process native disk creation/deletion and the I/O delivery of the operating system. The following figure shows the layer at which the multipathing software resides.

[Figure: I/O stack, from top to bottom: Application, File system, LVM, Disk, Multipath, SCSI layer, HBA driver]

Note: Generally, multipathing software works under the disk driver layer to provide virtual disks for upper-layer applications. Sometimes the software works above the disk driver layer but under the LVM layer (for example, the mainstream multipathing software provided by the Linux platform).

The following table lists the advantages and disadvantages of the modes in which multipathing software can integrate with operating systems.

Method 1: Mask native disks and create virtual SCSI disks.

Advantages:
1. Secure. Users cannot use the native redundant disks, preventing data corruption caused by misoperations.
2. Transparent to upper-layer applications. Disks remain unchanged before and after the multipathing software is deployed, so the configurations of upper-layer applications do not need to be modified.
3. Better compatibility. Some applications can identify only standard SCSI disks.

Disadvantage: Non-native multipathing software needs the operating system driver stack to support the filter driver mechanism in order to mask native disks. Otherwise, some operating system kernel functions must be hooked, which complicates coexistence with third-party drivers.

Method 2: Do not mask native disks; create additional virtual disks.

Advantage: Better compatibility with third-party drivers.

Disadvantages:
1. Users can still use the native redundant disks, so data corruption may be caused by misoperations.
2. The configurations of upper-layer applications must be updated.
3. Incompatibility occurs with applications that identify only standard SCSI disks.

UltraPath masks native disks to ensure data security. The following table lists the disks created by the two implementation methods on the Linux platform.

Scenario | Disks visible on the host
Single-path network without multipathing software | /dev/sdb
Dual-path network without multipathing software | /dev/sdb, /dev/sdc (sdb and sdc correspond to the same LUN, so concurrent access may cause data corruption)
Dual-path network with DM-Multipath | /dev/sdb, /dev/sdc, /dev/mpatha (it is safe to use mpatha, but users may still use sdb and sdc)
Dual-path network with UltraPath | /dev/sdb (the same as on a single-path network)

3 Working Principle and Functions of UltraPath

3.1 Typical Networking Diagrams

3.1.1 Direct-Connection Networking

3.1.2 Dual-Switch Connection Networking

UltraPath allows application servers to connect to storage arrays through Fibre Channel, FCoE, and iSCSI ports.

3.2 Working Principle

3.2.1 Integrating UltraPath with Operating Systems

UltraPath is filter driver software running in host kernel mode. It can intercept and process native disk creation/deletion and the I/O delivery of the operating system. The following figure shows the layer at which the UltraPath driver resides in Windows, Linux, and Solaris.

[Figure: I/O stack, from top to bottom: Application, File system, LVM, Disk, Multipath, SCSI layer, HBA driver]

On the AIX and VMware ESX platforms, UltraPath is implemented based on the multipath framework of the operating system.

UltraPath for AIX is a kernel driver developed based on the AIX MPIO mechanism. MPIO is a device path management framework introduced in AIX 5.2 TL04, AIX 5.3, and later versions to support multipath connections between a storage system and a host. MPIO is presented as a device on the host. MPIO employs Path Control Modules (PCMs) to implement multipath management, such as path adding and deleting, I/O path selection, path detection, and failover. The following figure shows the relationship among MPIO, the PCM, and the system.

Figure 3-1 Relationship among the AIX MPIO, PCM, and system
[Figure: Application, File system, Volume manager, SCSI driver, MPIO, HBA driver; the UltraPath PCM kernel extension handles device adding/deletion, path adding/deletion, and path selection]

UltraPath for VMware is a Multipath Plug-In (MPP) implemented based on the Pluggable Storage Architecture (PSA) in VMware vSphere/ESXi. UltraPath senses path creation or deletion through functions registered with the PSA and manages the paths.

UltraPath invokes functions provided by the PSA to create or delete virtual logical devices (that is, devices into which multiple paths are aggregated). UltraPath receives I/Os from upper-layer applications through functions registered with the PSA and invokes functions provided by the PSA to send the I/Os.

[Figure: ESX Server running VMs on the VMkernel; the Pluggable Storage Architecture hosts PowerPath, UltraPath, and NMP (with SATP AA, SATP ALUA, and SATP HDLM)]

On the VMware ESX platform, UltraPath can work together with NMP and other PSA-based multipathing software (such as EMC PowerPath).

3.2.2 How to Identify and Manage Paths

Path

Each SCSI device has a SCSI address, which consists of an initiator ID (or host ID), bus ID, target ID, and LUN ID. In an actual network, the initiator ID refers to the HBA port ID, and the target ID refers to the controller port ID of a storage array (the bus ID is a legacy of parallel SCSI buses and is always 0 on a SAN). As shown in the following figure, two HBA ports are connected to four controller ports, and two LUNs of the storage array are mapped to the host. Therefore, eight SCSI devices are generated on the host, with the following SCSI addresses:

I:1 - B:0 - T:1 - L:1 (that is, initiator ID = 1, bus ID = 0, target ID = 1, and LUN ID = 1)
I:1 - B:0 - T:1 - L:2
I:1 - B:0 - T:3 - L:1
I:1 - B:0 - T:3 - L:2
I:2 - B:0 - T:2 - L:1
I:2 - B:0 - T:2 - L:2
I:2 - B:0 - T:4 - L:1
I:2 - B:0 - T:4 - L:2
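The enumeration above follows directly from the topology: one SCSI device per (I_T connection, LUN) pair. The following few lines of Python are an illustrative sketch of that arithmetic (not part of UltraPath); the connection list mirrors the figure.

```python
from itertools import product

# Physical I_T connections from the figure: HBA 1 reaches controller
# ports 1 and 3; HBA 2 reaches controller ports 2 and 4.
connections = [(1, 1), (1, 3), (2, 2), (2, 4)]  # (initiator ID, target ID)
luns = [1, 2]                                   # LUNs mapped to the host

# One SCSI device per (I_T connection, LUN) pair; bus ID is always 0 on a SAN.
scsi_devices = [
    f"I:{i} - B:0 - T:{t} - L:{l}" for (i, t), l in product(connections, luns)
]

print(len(scsi_devices))   # 8 SCSI devices, matching the figure
print(scsi_devices[0])     # I:1 - B:0 - T:1 - L:1
```

Two I_T connections per HBA and two LUNs yield 4 × 2 = 8 SCSI devices, as listed above.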

[Figure: server with HBA 1 and HBA 2 connected through two switches to ports 1–4 on controllers A and B of a storage array containing LUN 1 and LUN 2]

Different SCSI devices mean using different paths to access different logical storage units. Therefore, UltraPath abstracts paths based on SCSI devices; that is, one path points to one SCSI device.

Note: Because the paths managed by UltraPath are associated with LUNs, UltraPath can handle both link-down failures between the host and the storage array and failures to access a specific LUN through a controller (for example, the storage array can use controller B instead of controller A to access LUN 1).

Physical path

SCSI devices with the same initiator ID and target ID use the same I_T connection to access a storage array. Therefore, UltraPath abstracts physical paths to associate these SCSI devices. On the preceding network, four physical paths can be virtualized:

I:1 - T:1
I:1 - T:3
I:2 - T:2
I:2 - T:4

Mapping between virtual disks and paths

Each LUN has a unique World Wide Name (WWN). UltraPath uses WWNs to determine whether SCSI devices are different LUNs or different paths to the same LUN. UltraPath creates virtual disks that correspond to LUNs, provides the virtual disks to upper-layer applications, and maintains the mapping between the virtual disks and paths. As shown in the following figure, two paths exist between the storage array and the host, three LUNs are mapped to the host, and six SCSI devices are generated on the host. Using the WWNs, UltraPath determines that the six SCSI devices come from three LUNs and creates three virtual disks, each of which has two paths.
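The WWN-based aggregation described above can be sketched as follows. This is a simplified illustration with made-up device names and WWNs, not UltraPath's actual data structures.

```python
from collections import defaultdict

# Six SCSI devices seen over two paths; devices sharing a WWN are
# different paths to the same LUN (names and WWNs are made up).
scsi_devices = [
    ("sdb", "wwn-A"), ("sdc", "wwn-B"), ("sdd", "wwn-C"),  # via path 1
    ("sde", "wwn-A"), ("sdf", "wwn-B"), ("sdg", "wwn-C"),  # via path 2
]

# Aggregate devices by WWN: one virtual disk per LUN.
virtual_disks = defaultdict(list)
for dev, wwn in scsi_devices:
    virtual_disks[wwn].append(dev)

print(len(virtual_disks))        # 3 virtual disks
print(virtual_disks["wwn-A"])    # ['sdb', 'sde'] — two paths to one LUN
```

Six SCSI devices collapse into three virtual disks, each backed by two paths, exactly as in the figure.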

3.2.3 How to Identify and Manage Path Groups

Path group

Because a service processing unit of a storage array contains one or more controllers (also called storage processors), and the paths connected to the same controller share its state, UltraPath abstracts path groups to manage paths by controller. As shown in the following figure, paths 1 and 2 belong to controller A and paths 3 and 4 belong to controller B, forming two path groups.

[Figure: server with HBA 1 and HBA 2 connected through two switches to controllers A and B; paths 1 and 2 (controller A) form path group 1, and paths 3 and 4 (controller B) form path group 2]

Active-active/asymmetric arrays and path group priority

On an active-active/asymmetric array, each LUN can be accessed through any controller, but the access efficiency differs between controllers. Generally, each LUN has a preferred controller, and access through the preferred controller is the most efficient. Therefore, UltraPath identifies the preferred controller of each LUN and accesses the LUN through that controller. As shown in the following figure, the preferred controller of LUN 1 is controller A, so UltraPath selects paths 1 and 2 in path group 1 to access LUN 1.

[Figure: same dual-switch network; paths 1 and 2 (path group 1, controller A) carry I/O to LUN 1, whose preferred controller is controller A; paths 3 and 4 form path group 2]

3.3 Functions of UltraPath

3.3.1 Failover and Failback

Failover

UltraPath fails I/Os over from a faulty path to a functional path. The following figure shows the failover process:

1. An application delivers I/Os to the virtual disk created by UltraPath.
2. UltraPath forwards the I/Os to path 1 (a SCSI device).
3. I/Os on path 1 fail due to a path fault.
4. UltraPath delivers the I/Os to path 2.
5. Path 2 returns an I/O success acknowledgement.
6. UltraPath returns an I/O success acknowledgement to the application.
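The six-step failover flow above amounts to trying paths in order until one succeeds. The following toy model illustrates that logic only; it is not UltraPath code, and the path names and health states are made up.

```python
# Toy failover model: send the I/O down the first healthy path.
# Path health below is made up for illustration; path 1 has just failed.
path_healthy = {"path1": False, "path2": True}

def deliver_io(paths: list[str]) -> str:
    """Try each path in order; return on the first success acknowledgement."""
    for p in paths:
        if path_healthy[p]:
            return f"I/O succeeded on {p}"   # steps 4-6: success via backup
    raise IOError("all paths down")          # APD: no path available at all

print(deliver_io(["path1", "path2"]))  # I/O succeeded on path2
```

With path 1 down, the I/O transparently completes on path 2; only when every path is down does the I/O fail (the APD case discussed in chapter 4).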

[Figure: failover sequence among the application, UltraPath, path 1, and path 2, numbered with steps 1–6 above]

Note: In step 3, the HBA retries the connection for a period of time after the path becomes faulty. During that period, the I/Os remain in the HBA instead of returning to UltraPath. For this reason, I/Os are blocked for a period of time during the failover.

Failback

UltraPath automatically delivers I/Os to a path again after the path recovers from a fault. A path can be recovered in either of two ways:

1. On a hot-swappable system (for example, Windows), the SCSI device is deleted when the link between the host and the storage array goes down, and a new SCSI device is created when the link recovers. UltraPath therefore detects the path recovery immediately.
2. On a non-hot-swappable system (for example, AIX or earlier versions of Linux), UltraPath periodically tests the path and so detects its recovery.

3.3.2 Path Test

UltraPath routinely tests the following paths:

1. Faulty paths. UltraPath tests faulty paths at a high frequency to detect path recovery as soon as possible.
2. Available but idle paths. UltraPath tests idle paths to identify faulty paths in advance, preventing unnecessary I/O retries. The test frequency is kept low to minimize the impact on service I/Os.

3.3.3 Load Balancing

Load balancing is another critical feature of UltraPath. It enables UltraPath to use the bandwidth of multiple links to improve overall system throughput. UltraPath supports three load balancing algorithms:

Round-robin: I/Os are evenly distributed across the paths.

Least-I/O: The number of queued I/Os on each path is tracked, and new I/Os are delivered to the path with the fewest queued I/Os.

Least-block: The amount of outstanding data on each path is tracked (as in the least-I/O algorithm, but weighted by the block size of each I/O request), and new I/Os are delivered to the path with the least outstanding data.

Paths involved in load balancing

For active-active arrays, UltraPath implements load balancing among all paths. For active-active/asymmetric arrays, UltraPath implements load balancing within each path group; for example, I/Os for LUN A are delivered to paths 1 and 2, and I/Os for LUN B are delivered to paths 3 and 4.
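The least-I/O rule above can be stated in one line: pick the path with the shortest queue. A minimal sketch, with made-up queue depths (illustrative only, not UltraPath's scheduler):

```python
# Least-I/O load balancing: send each new I/O to the path with the
# fewest queued I/Os (queue depths below are made up for illustration).
queued_ios = {"path1": 7, "path2": 3, "path3": 5}

def pick_path(queues: dict[str, int]) -> str:
    """Choose the path with the smallest I/O queue."""
    return min(queues, key=queues.get)

chosen = pick_path(queued_ios)
queued_ios[chosen] += 1        # the new I/O now queues on that path
print(chosen)                  # path2 — it had the shortest queue
```

The least-block variant is the same selection with queue length replaced by the total bytes outstanding on each path.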

4 Highlights of UltraPath

4.1.1 All Paths Down Protection

On a single-path network, the HBA or iSCSI initiator retries the connection for a period of time (30s or 60s) after a link goes down. If the reconnection succeeds, I/Os continue to be sent over the link. On a multipath network, because a standby path is available, the retry period of the HBA or iSCSI initiator is shortened so that UltraPath can detect the I/O failure and fail over quickly, shortening the I/O blocking time.

[Figure: on a single-path network, the HBA reports an I/O failure to the application 60s after the link goes down; on a multipath network, the HBA reports the failure to UltraPath after 5s and UltraPath delivers the I/Os over another path]

If an All Paths Down (APD) issue occurs (usually because single points of failure exist in the networking), service I/Os may fail immediately; for example, restarting the only switch causes service interruption, as shown in the following figure.

[Figure: server connected through a single switch to both controllers of the storage array]

UltraPath implements a reconnection mechanism. If an APD issue occurs, UltraPath suspends I/Os and tries to recover a path for a period of time. If at least one path recovers during that period, the services recover.

4.1.2 Isolation of Intermittently Faulty Paths

Application scenario

Links are intermittently disconnected because of poor link quality or incorrect connections.

Behavior of UltraPath without an isolation mechanism

If a link goes down, UltraPath delivers I/Os to another link and switches the I/Os back after the faulty link recovers. The process then repeats.

Impact on services without an isolation mechanism

If I/Os fail because of a link-down failure, switching the I/Os to another path takes time, blocking the I/Os. If such blocking occurs frequently and repeatedly, the performance of upper-layer services deteriorates, and services that are sensitive to I/O latency may even fail.

UltraPath uses the following mechanisms to isolate intermittently faulty paths:

1. A path is not used immediately after recovering from a fault; instead, it stays in the standby state for testing. I/Os are not switched back to the path until it passes the tests.
2. The number of times each path fails is monitored. UltraPath sets the status of a path that has failed several times to degraded, as shown in the following figure. A degraded path is not unavailable, but it is not used preferentially; it is used only if no optimal path is available.

[Figure: within a fault statistics period, a path that fails and recovers repeatedly triggers a failover each time and services fail back after each recovery; once the path has failed n times, its status is set to degraded and services no longer fail back to it after it recovers]

4.1.3 Isolation of Links That Have Bit Errors

Application scenario

If the quality of a Fibre Channel link is poor, for example because an optical fiber is bent or damaged or the power of an optical module decreases, bit errors may occur on the link. Generally, bit errors do not bring the link down, but I/Os time out or fail with some probability.

Behavior of UltraPath without an isolation mechanism

UltraPath retries when I/O errors occur. Because I/Os on a link that has bit errors fail only with some probability, the retries can succeed, so UltraPath does not set the status of the link to faulty. Because links that have bit errors are more heavily congested, the least-I/O load balancing algorithm causes UltraPath to deliver fewer I/Os to them, reducing the impact on services. However, some I/Os are still delivered to those links.

Impact on services without an isolation mechanism

I/Os time out or fail with some probability. Although the I/Os succeed after several retries, they are blocked during the retries. As a result, the performance of upper-layer services deteriorates, and services that are sensitive to I/O latency may even fail. An I/O timeout blocks upper-layer services for 60 seconds, and occasionally failing I/Os have an unpredictable impact on upper-layer services. In the worst cases, the performance of upper-layer services drops by more than 90%.

UltraPath employs multiple mechanisms to efficiently isolate links that have bit errors:

1. UltraPath identifies unhealthy paths based on I/O latency, failure rate, and collected statistics, and sets their status to degraded. These paths are not used preferentially.
2. Because I/Os on links that have bit errors fail only with some probability, such links must be tested strictly over a long period. UltraPath constructs I/Os that are sensitive to bit errors to continuously check degraded paths, ensuring that links with bit errors stay isolated. The frequency of the bit-error test streams is limited to prevent any impact on applications.

4.1.4 Path Exception Alarming

If a path fails, UltraPath automatically switches service I/Os to another path to ensure service continuity. However, this alone is insufficient: system reliability has decreased, and a risk of single points of failure now exists. As shown in the following figure, when one switch is faulty, the other switch and the optical fiber between the host and that switch become single points of failure.

[Figure: server connected through two switches to the two controllers of the storage array; one switch has failed]

In this case, maintenance personnel must recover the faulty component as soon as possible to restore system reliability. UltraPath allows path-related alarms to be sent in-band to Huawei storage arrays. The storage arrays then manage the alarms centrally and forward them by email or short message, or notify third-party network management software.

[Figure: the server pushes alarms over an available path to the storage array, which forwards them as short messages, emails, or SNMP notifications through the ISM]

The alarm mechanism of UltraPath has the following characteristics:

1. No socket-based agents need to be installed on service hosts, so no network ports are occupied and no security vulnerabilities are introduced.
2. No SNMP traps need to be configured on service hosts. This simplifies configuration when a large number of service hosts are deployed.

UltraPath reports the following path exceptions:

1. Paths are unavailable.
2. Paths are degraded (links have bit errors or are intermittently faulty).
3. No redundant controllers are connected by the paths (including cases where the initial networking is incorrect).

The alarm function of UltraPath is designed for scenarios where system reliability decreases, not for scenarios where all paths are faulty. When all paths are faulty, services are interrupted or stopped.

4.1.5 Automatic Host Registration

After a storage array is connected to a host, host-related configuration must be performed on the array. The host information pushing function of UltraPath helps with this configuration, simplifying it and reducing configuration errors.

To manually register a host with a storage array, perform the following steps (connecting to storage through iSCSI on the Linux platform is used as an example):

1. On the ISM, manually create a logical host named webserver1 for the storage array and set the host type to Linux.
2. Log in to the service host, then query and record the iSCSI initiator IQN:

# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1996-04.de.suse:01:d9a41f41c999tt

3. On the ISM, select port iqn.1996-04.de.suse:01:d9a41f41c999tt recorded in step 2 and add it to logical host webserver1.

The procedure for automatic host registration using UltraPath is as follows:

1. UltraPath uses in-band commands to send information such as the host operating system type (Linux), host name (webserver1), IP address, and UUID to the Huawei storage array over all links.

2. Logical host webserver1 is automatically created based on the information sent by UltraPath, with the host type automatically set to Linux. All initiator ports that sent the host information are automatically added to logical host webserver1.

Comparison of the two methods:

1. The manual method is complicated, especially when a large number of hosts or ports exist, and configuration errors may occur, such as selecting the wrong host type or adding the wrong initiator port.
2. UltraPath-based automatic host registration requires no manual operation, and the resulting configuration is consistent with the physical environment.

4.1.6 Path Disabling

Application scenario

Planned replacement of components such as HBAs, optical modules, and interface modules.

Impact when manual path disabling is not available

If a component is removed directly, I/Os may fail. Although UltraPath can switch the I/Os to other paths, the I/O blocking affects upper-layer services. In addition, some HBAs can be safely removed only when they are not carrying I/O.

UltraPath supports the following kinds of path disabling:

1. Disabling a specified path group, for example when a controller is replaced or repaired.
2. Disabling a specified physical path, identified by the HBA port plus the target port ID.

4.1.7 Path Performance Monitoring and Status Statistics

UltraPath supports various path performance monitoring and status statistics functions:

1. LUN- or path-based performance monitoring
2. Monitoring of IOPS, bandwidth, and latency
3. Separate monitoring of reads and writes
4. Statistics on the numbers of queued I/Os, I/O retries, and faults
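The kinds of per-path counters listed above can be illustrated with a tiny aggregator. This is a hypothetical sketch, not UltraPath's monitoring tooling; the path names, latencies, and outcomes are made up.

```python
from statistics import mean

# (path, latency in ms, success?) records for a handful of I/Os — made up.
records = [
    ("path1", 1.2, True), ("path1", 0.9, True), ("path1", 35.0, False),
    ("path2", 1.1, True), ("path2", 1.3, True),
]

def path_stats(path: str) -> tuple[int, float, int]:
    """Return (I/O count, mean latency in ms, fault count) for one path."""
    lats = [lat for p, lat, _ in records if p == path]
    faults = sum(1 for p, _, ok in records if p == path and not ok)
    return len(lats), round(mean(lats), 2), faults

print(path_stats("path1"))  # (3, 12.37, 1)
```

Statistics of this shape (counts, latencies, faults per path) are also exactly the inputs the degraded-path and least-I/O mechanisms of chapters 3 and 4 rely on.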