Express5800/A2040c, A2020c, A2010c, A1040c PCIe Live Error Recovery User s Guide (Release 1.0)

Similar documents
NEC ESMPRO Agent Extension Installation Guide

NEC ESMPRO Agent Extension

NEC Express5800 Series. StorView Ver. 3. Installation Guide for N F. [Linux Server Edition] A

NEC Express5800/B120f-h System Configuration Guide

Red Hat JBoss Enterprise Application Platform 7.0

Red Hat JBoss Enterprise Application Platform 7.2

ExpressCluster X 2.0 for Linux

NEC EXPRESS5800/E110d-1 Configuration Guide

ExpressCluster X 3.2 for Linux

NEC ExpressUpdate Agent

Global Array Manager

ExpressCluster X 3.1 for Linux

Lifecycle Controller 2 Release 1.0 Version Readme

NEC ExpressUpdate Functions and Features. September 20, 2012 Rev. 4.0

Converged Network Adapter NDIS Miniport Driver for Windows. Table of Contents

ExpressCluster for Linux Version 3 Web Manager Reference. Revision 6us

NEC EXPRESS5800/R320a-E4 Configuration Guide

Hitachi Storage Adapter for SAP NetWeaver Landscape Virtualization Management v Release Notes

NEC Express5800/B120e-h System Configuration Guide

PRIMECLUSTER GDS Snapshot 4.3A20. Installation Guide. Linux

NEC EXPRESS5800/R140b-4 Configuration Guide

NEC Express5800/B110d Configuration Guide

NEC EXPRESS5800/R320a-E4 Configuration Guide

ETERNUS SF AdvancedCopy Manager V15.0. Quick Reference

Hitachi Compute Blade 2500 Intel LAN Driver Instruction Manual for SUSE Linux Enterprise Server

NEC Express5800/B120d-h System Configuration Guide

Hitachi Storage Adapter for SAP Landscape Virtualization Management v Release Notes

NEC EXPRESS5800/R320b-M4 Configuration Guide

EXPRESSCLUSTER X SingleServerSafe 3.3 for Linux. Installation Guide. 01/29/2016 4th Edition

ETERNUS SF Express V15.3/ Storage Cruiser V15.3/ AdvancedCopy Manager V15.3. Migration Guide

NEC EXPRESS5800/R320b-M4 Configuration Guide

Z-Drive R4 and 4500 Linux Device Driver

PowerEdge FX2 - Upgrading from 10GbE Pass-through Modules to FN410S I/O Modules

Cluster Configuration Design Guide (Linux/PRIMECLUSTER)

Managing Zone Configuration

NEC ESMPRO Manager Ver.6 Setup Guide

EXPRESSCLUSTER X 3.3 for Linux

SystemManager G 8.0 WebConsole Option

FUJITSU Storage ETERNUS SF Storage Cruiser V16.3 / AdvancedCopy Manager V16.3. Cluster Environment Setup Guide

NEC Express5800/E120b-1 Configuration Guide

Interstage Shunsaku Data Manager Using the Shunsaku Manuals

Application Guide. Connection Broker. Advanced Connection and Capacity Management For Hybrid Clouds

Notice for Express Report Service (MG)

Veritas System Recovery 18 Linux Edition: Quick Installation Guide

Parallels Containers for Windows 6.0

Veritas NetBackup Appliance Fibre Channel Guide

EXPRESSCLUSTER X SingleServerSafe 3.3 for Linux. Installation Guide. 07/03/2015 3rd Edition

ETERNUS SF Express V15.1/ Storage Cruiser V15.1/ AdvancedCopy Manager V15.1. Migration Guide

Oracle Enterprise Manager Ops Center

NEC ESMPRO Manager Ver.6 Setup Guide

MasterScope Virtual DataCenter Automation Media v3.0

Oracle Enterprise Manager Ops Center. Introduction. What You Will Need. Configure and Install Root Domains 12c Release 3 (

Key Steps for AhsayOBS to AhsayCBS Upgrade and Data Migration

Getting Started with. Agents for Unix and Linux. Version

NEC ESMPRO AlertManager User's Guide (Windows)

Lifecycle Controller Part Replacement

NEC EXPRESS5800/GT110d-S Configuration Guide

ExpressCluster for Linux Ver3.0

DSI Optimized Backup & Deduplication for VTL Installation & User Guide

Oracle Enterprise Manager Ops Center. Introduction. Creating Oracle Solaris 11 Zones 12c Release 2 ( )

Veritas System Recovery 16 Management Solution Administrator's Guide

Red Hat Enterprise Virtualization 3.6 Introduction to the User Portal

Red Hat Enterprise Virtualization 3.6

FC SAN Boot Configuration Guide

Avigilon Control Center System Integration Guide

SunDual Port 4x QDR IB Host Channel Adapter PCIe ExpressModule

FUJITSU Storage ETERNUS SF Express V16.5 / Storage Cruiser V16.5 / AdvancedCopy Manager V16.5. Migration Guide

FUJITSU Storage ETERNUS SF Express V16.3 / Storage Cruiser V16.3 / AdvancedCopy Manager V16.3. Migration Guide

ISRXM64LE93-1. NEC Storage ReplicationControl FileSystem Option on Linux Ver9.3 Installation Guide

White paper PRIMEQUEST 1000 series high availability realized by Fujitsu s quality assurance

Polycom RealPresence Resource Manager System

SECURE Gateway with Microsoft Azure Installation Guide. Version Document Revision 1.0

Veritas System Recovery 18 Management Solution Administrator's Guide

Novell Open Workgroup Suite Small Business Edition

Parallels Virtuozzo Containers 4.6 for Windows

Dell PowerVault Network Attached Storage (NAS) Systems Running Windows Storage Server 2012 Troubleshooting Guide

Overview. Implementing Fibre Channel SAN Boot with the Oracle ZFS Storage Appliance. January 2014 By Tom Hanvey; update by Peter Brouwer Version: 2.

Dual/Quad Channel 4-Port USB 3.0. PCI Express Card

Migrate Cisco Prime Collaboration Assurance

Avigilon Control Center Server User Guide

Oracle Enterprise Manager Ops Center. Overview. What You Need. Create Oracle Solaris 10 Zones 12c Release 3 ( )

Hitachi Block Storage Driver for OpenStack v Release Notes

Clearswift SECURE Gateway Installation & Getting Started Guide. Version 4.3 Document Revision 1.0

Red Hat Development Suite 2.2

Lifecycle Controller Part Replacement

Oracle Enterprise Manager Ops Center. Prerequisites. Installation. Readme 12c Release 2 ( )

Emulex Universal Multichannel

NEC EXPRESS5800/R320d-E4 Configuration Guide

NEC Express5800/GT110e Configuration Guide

Red Hat Enterprise Virtualization 3.6

Oracle Enterprise Manager Ops Center. Introduction. Creating Oracle Solaris 11 Zones Guide 12c Release 1 ( )

NEC Express5800/GT110e-S Configuration Guide

ExpressCluster X SingleServerSafe 3.2 for Linux. Installation Guide. 02/19/2014 1st Edition

FUJITSU Storage ETERNUS SF Storage Cruiser V16.5 / AdvancedCopy Manager V16.5. Cluster Environment Setup Guide

Overview 1 Preparing for installation 2

Virtuozzo Automator 6.1

Red Hat Development Suite 2.1

EXPRESSCLUSTER X 4.0 for Linux

Avigilon Control Center Server User Guide

Hitachi Storage Plug-in for VMware vcenter v Release Notes

Transcription:

Express5800/A2040c, A2020c, A2010c, A1040c PCIe Live Error Recovery User s Guide (Release 1.0) June 2015 NEC Corporation 2015 NEC Corporation 855-901079-001-A

Notes on Using This Manual No part of this manual may be reproduced in any form without the prior written permission of NEC Corporation. The contents of this manual may be revised without prior notice. The contents of this manual shall not be copied or altered without the prior written permission of NEC Corporation. Trademarks Linux is a trademark or registered trademark of Linus Torvalds in Japan and other countries. Red Hat and Red Hat Enterprise Linux are trademarks or registered trademarks of Red Hat, Inc. in the United States and other countries. Oracle is a registered trademark of Oracle Corporation or its subsidiaries, and/or its affiliates in the United States and other countries. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation in the United States and other countries. All other product, brand, or trade names used in this publication are the trademarks or registered trademarks of their respective trademark owners. Related Documents Express5800/A1040c, A2040c, A2020c, A2010c User s Guide

Contents 1. Introduction... 1 1.1 What is PCIe Live Error Recovery?... 1 1.2 Operating Environment... 1 1.3 Supported Cards... 1 1.4 Terminology... 2 1.5 Access Limitation... 2 2. Installing necpciras... 3 2.1 Installing necpciras... 3 2.2 Uninstalling necpciras... 3 2.3 Upgrading necpciras... 3 2.4 Configuration by necpciras... 5 2.5 Backup Configuration Information... 5 3. Necpciras Command Reference... 7 3.1 necpciras command line format... 7 3.2 --show option... 7 3.3 --set-ler option... 8 3.4 --set-noler option... 9 3.5 --set-threshold option... 10 3.6 --reset option... 11 3.7 --version option... 12 3.8 Usage... 12

1. Introduction 1.1 What is PCIe Live Error Recovery? PCIe Live Error Recovery is a feature to improve the I/O availability. In the event of a critical/uncorrectable failure occurs to an adapter, the feature will bring down the PCIe link associated with the failed root port within one cycle and automatically reinitialize the adapter in the case of the intermittent failure to maintain. Without this feature, if a critical I/O failure occurs to the adapter, the system will be down. This feature improves more the I/O availability by a combination of redundant I/O features such as NIC Teaming. 1.2 Operating Environment PCIe Live Error Recovery operating environment as shown below: Hardware (Server) Table 1-1 Operating Environment Express5800/A2040c Express5800/A2020c Express5800/A2010c Express5800/A1040c OS Red Hat Enterprise Linux 6.6 1.3 Supported Cards PCIe Live Error Recovery supported cards as shown below: Table 1-2 Supported Cards Network Card Fibre Channel Card 10GBASE (SFP+/2ch) -NE3304-149 Fibre Channel Controller (1ch,8G) -NE3390-159 Fibre Channel Controller (2ch,8G) -NE3390-160 Fibre Channel Controller (1ch,16G) -NE3390-157A Fibre Channel Controller (2ch,16G) -NE3390-158A 1

1.4 Terminology Terms used in Mission Critical I/O Failover as shown below: Table 1-3 Terminology Term Bonding SPS Failover LER mode NoLER mode LER / NoLER slot Web console necpciras Description Bonding is standard NIC teaming in Linux. StoragePathSavior(SPS) is a software to multiplex paths between a server and storage unit in a system with Express5800 and the NEC Storage series Disk Array Subsystem. Traffic failover to prevent connectivity loss in the event of a network component failure. LER mode is Live Error Recovery mode. Setting LER mode enables PCIe Live Error Recovery. When uncorrected error is occurred in the PCIe slot set as LER, the feature will bring down the PCIe link, automatically reinitialize the adapter in the case of the intermittent failure. Setting NoLER mode disables Mission Critical I/O Failover. When uncorrected error is occurred in the PCIe slot set as NoLER, the system will be rebooted. LER slot is the PCIe slot set as LER. NoLER slot is the PCIe slot set as NoLER. A tool used to view or configure the server via web browser provided by EXPRESSSCOPE Engine SP3. Command used for configuring LER mode. 1.5 Access Limitation Operation related to Mission Critical I/O Failover feature is allowed for the user having administrative right (Administrator account). 2

2. Installing necpciras This section describes how to install, uninstall, and upgrade necpciras command. 2.1 Installing necpciras 1. Login to the target machine as a root user. 2. Copy the file necpciras-*.x86_64.rpm to desired directory in target machine. (* represents revision number.) # rpm -ivh necpciras-2.4-1.02.el6.x86_64.rpm Preparing... ########################################### [100%] 1:necpciras ########################################### [100%] 3. Run the following command to check if neccapd package is installed correctly. # rpm -qa grep necpciras necpciras-2.4-1.02.el6.x86_64 2.2 Uninstalling necpciras 1. Login to the target machine as a root user. 2. Uninstall necpciras package by running rpm command. # rpm -e necpciras 3. Run the following command to check if neccapd package is installed correctly. # rpm -qa grep necpciras Uninstallation is completed successfully if no response is displayed against the command. Configuration by necpciras command is preserved after uninstallation. 2.3 Upgrading necpciras Upgrade necpciras as follows: Uninstall the old necpciras package according to "2.2 Uninstalling necpciras", then install the new necpciras according to "2.1 Installing necpciras". 3

4

2.4 Configuration by necpciras necpcirs command is used to display information related to PCIe Live Error Recovery feature and to set LER mode settings. See "3. Necpciras Command Reference" for details of command line of necpciras command. Some settings require to system (OS) reboot to apply the settings. Factory default setting is NoLER mode. Table 2-1 necpciras command options Option Use case Reboot --set-ler --set-noler --reset --set-threshold Use this option to set PCIe slots as LER mode. Use this option to set PCIe slots as NoLER mode. Use this option to restore factory default settings. Use this option to specify recovery threshold of uncorrected error. Required Required Required Required LER Mode must be set to supported cards only for PCIe Live Error Recovery. Tips This feature improves more the I/O availability by a combination of redundant I/O features such as NIC Teaming.. 2.5 Backup Configuration Information Information configured by necpciras is stored in hardware of the server, not in the file system of OS. If you change configuration information, be sure to backup the configuration information using web console. Reboot or shutdown the system before starting backup process. Described below is procedure to backup configuration information using web console. Refer to "Express5800/A1040c, A2040c, A2020c, A2010c User s Guide" for detailed information and operation screen images. Backup procedure 1. Reboot or shutdown the system. 2. Select the [Configuration] on web console. 5

3. Select [Save/Restore in Bulk] on web console. 4. Press the [Backup] button to download the file containing configuration information. Refer to "Express5800/A1040c, A2040c, A2020c, A2010c User s Guide" for how to restore the configuration information using the backup file obtained from web console. 6

3. Necpciras Command Reference This section describes details of necpciras command used to view or configure information related to Mission Critical I/O Failover. For how to install necpciras, see 2.1 Installing necpciras. 3.1 necpciras command line format necpciras subcommand [<options>] subcommand: --show See [3.2]. --set-ler=<pci_slot_numbers> See [3.3]. --set-noler=<pci_slot_numbers> See [3.4]. --set-threshold=<threshold> See [3.5]. --reset See [3.6]. --version See [3.7]. PCI_SLOT_NUMBERS: List the number of PCIe slots delimiting with slash. THRESHOLD: Recovery threshold 3.2 --show option Shows the current settings of PCIe Live Error Recovery feature. Suboption None Execution resultex #./necpciras --show LER Settings: LER LER Slot Status Current Next PCI1 Enable No No PCI2 N/A No No PCI3 Enable No No PCI4 N/A No No PCI5 N/A No No PCI6 N/A No No PCI7 N/A No No PCI8 N/A No No PCI9 N/A No No PCI10 N/A No No PCI11 N/A No No PCI12 N/A No No PCI13 N/A No No PCI14 N/A No No PCI15 N/A No No PCI16 N/A No No LER threshold Setting: Current Next Threshold 1 1 7

Description Item Displayed character string Table 3-1 necpciras show option Meaning Slot Status LER Current LER Next Threshold Current Threshold Next Enable N/A Yes No Yes No Value Value The PCIe slot is available. The PCIe slot is not available. The PCIe slot is set as LER mode. The PCIe slot is set as NoLER mode. Empty PCIe slot is displayed as NoLER mode. The PCIe slot will be set as LER mode on next boot or PCIe hot-add. The PCIe slot will be set as NoLER mode on next boot. Current recovery threshold is shown. Recovery will be performed until the threshold. Next recovery threshold is shown on next boot. 3.3 --set-ler option Specify LER mode of each PCIe slot. LER Mode must be set to supported cards for PCIe Live Error Recovery. Tips This feature improves more the I/O availability by a combination of redundant I/O features such as NIC Teaming. Reboot the system to apply the settings. Suboption --set-ler=<pci_slot_numbers> (Ex. set-ler=9/10) Specify the PCIe slot numbers of LER slot, by delimiting with the slash. Execution resultex When command is executed successfully, the same contents as --show option is displayed. If command fails due to an illegal argument or others, an error message or usage of necpciras is displayed. 8

#./necpciras --set-ler=9/10 LER Settings: LER LER Slot Status Current Next PCI1 Enable No No PCI2 N/A No No PCI3 Enable No No PCI4 N/A No No PCI5 N/A No No PCI6 N/A No No PCI7 N/A No No PCI8 N/A No No PCI9 Enable Yes No PCI10 Enable Yes No PCI11 N/A No No PCI12 N/A No No PCI13 N/A No No PCI14 N/A No No PCI15 N/A No No PCI16 N/A No No LER threshold Setting: Current Next Threshold 1 1 ** NOTICE ** The configuration changes have not been applied yet. You must reboot the system to apply them. Tips Empty PCIe slot is allowed to be set as LER mode. When PCIe Hot-Add, the slot will be set as LER mode. 3.4 --set-noler option Specify NoLER mode of each PCIe slot. Reboot the system to apply the settings. Suboption --set-noler=<pci_slot_numbers> (e.g. --set-ler=9/10) Specify the PCIe slot numbers of NoLER slot, by delimiting with the slash. Execution resultex When command is executed successfully, the same contents as --show option is displayed. If command fails due to an illegal argument or others, an error message or usage of necpciras is displayed. 9

#./necpciras --set-noler=9/10 LER Settings: LER LER Slot Status Current Next PCI1 Enable No No PCI2 N/A No No PCI3 Enable No No PCI4 N/A No No PCI5 N/A No No PCI6 N/A No No PCI7 N/A No No PCI8 N/A No No PCI9 Enable Yes No PCI10 Enable Yes No PCI11 N/A No No PCI12 N/A No No PCI13 N/A No No PCI14 N/A No No PCI15 N/A No No PCI16 N/A No No LER threshold Setting: Current Next Threshold 1 1 ** NOTICE ** The configuration changes have not been applied yet. You must reboot the system to apply them. 3.5 --set-threshold option Specify recovery threshold of uncorrected error. Recovery will be performed until the threshold. The number of recovery will be counted for each slot. If the number of recovery exceeded the threshold in a certain slot, the system will be down in order to prevent unsafe recovery. If recovery threshold is zero, recovery will done up to infinity. Example: If the threshold is 3 and the uncorrectable error occurs 3 times on the PCIe slot 9, recovery will be performed for 3 times in PCIe slot 9. Then if the 4 th uncorrectable occurs on the PCIe slot 9, the system will be down. Tips If in the same PCIe card error occurred repeatedly, there is a high possibility of complete failure. In this case, downing system may be safer than running. Determine the threshold according to the use environment. Default recovery. Default recovery threshold is one. Reboot the system to apply the settings. Suboption --set-threshold=<threshold> (e.g. --set-threshold=3) Specify the decimal recovery threshold Execution resultex When command is executed successfully, the same contents as --show option is displayed. If command fails due to an illegal argument or others, an error message or usage of necpciras is displayed. 10

#./necpciras --set-threshold=3 LER Settings: LER LER Slot Status Current Next PCI1 Enable No No PCI2 N/A No No PCI3 Enable No No PCI4 N/A No No PCI5 N/A No No PCI6 N/A No No PCI7 N/A No No PCI8 N/A No No PCI9 Enable Yes Yes PCI10 Enable Yes Yes PCI11 N/A No No PCI12 N/A No No PCI13 N/A No No PCI14 N/A No No PCI15 N/A No No PCI16 N/A No No LER threshold Setting: Current Next Threshold 1 3 ** NOTICE ** The configuration changes have not been applied yet. You must reboot the system to apply them. 3.6 --reset option Reset ler settings. Reboot the system to apply the settings. Suboption None Execution resultex LER settings are reset. And the same contents as --show option is displayed. If command fails due to an illegal argument or others, an error message or usage of necpciras is displayed. 11

#./necpciras --reset LER Settings: LER LER Slot Status Current Next PCI1 Enable No No PCI2 N/A No No PCI3 Enable No No PCI4 N/A No No PCI5 N/A No No PCI6 N/A No No PCI7 N/A No No PCI8 N/A No No PCI9 Enable Yes No PCI10 Enable Yes No PCI11 N/A No No PCI12 N/A No No PCI13 N/A No No PCI14 N/A No No PCI15 N/A No No PCI16 N/A No No LER threshold Setting: Current Next Threshold 3 1 ** NOTICE ** The configuration changes have not been applied yet. You must reboot the system to apply them. 3.7 --version option Shows version information of necpciras. Suboption None Execution result The version information is displayed by the following format: #./necpciras --version necpciras Version 1.3 3.8 Usage If command fails due to an illegal argument or others, usage of necpciras is displayed. Suboption None 12

Execution result #./necpciras Usage:./necpciras --show Usage:./necpciras --reset Usage:./necpciras --set-ler=<pci_slot_numbers> Usage:./necpciras --set-noler=<pci_slot_numbers> Usage:./necpciras --set-threshold=<threshold> Default value of LER Setting is "No". <PCI_SLOT_NUMBERS> is pci slot numbers separated by a slash. ex) --set-ler=1/6 : slot1 and slot6 are set to LER. <THRESHOLD> is LER threshold value of 0-255. Default value is "1". 0 : No specified threshold. PCI Error Recovery will be performed for every uncorrectable error. 1-255 : Specify threshold value. ex) --set-threshold=1 : PCI Error Recovery will be performed only for 1st uncorrectable error. ex) --set-threshold=2 : PCI Error Recovery will be performed for 1st and 2nd uncorrectable error. 13

Express5800/A2040c, A2020c, A2010c, A1040c PCIe Live Error Recovery User s Guide (Release 1.0) NEC Corporation 7-1 Shiba 5-Chome, Minato-Ku Tokyo 108-8001, Japan TEL (03) 3454-1111 (Main phone number) NEC Corporation 2015 No part of this manual may be reproduced in any form without the prior written permission of NEC Corporation. 14