IBM System Storage SAN Volume Controller Troubleshooting Guide



Note

Before using this information and the product it supports, read the information in Notices on page 309.

This edition applies to version of IBM SAN Volume Controller and to all subsequent modifications until otherwise indicated in new editions.

Copyright IBM Corporation 2003,
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Figures
Tables
About this guide
  Who should use this guide
  Emphasis
  Library and related publications
  IBM Publications Center
  Related websites
  Sending comments
  How to get information, help, and technical assistance
Chapter 1. SAN Volume Controller overview
  Systems
  Configuration node
  Configuration node addressing
  Management IP failover
  SAN fabric overview
Chapter 2. Introducing the SAN Volume Controller hardware components
  SAN Volume Controller nodes
  Optional features
  Node controls and indicators
  Node operator-information panel
  Node rear-panel indicators and connectors
  Fibre Channel port numbers and worldwide port names
  Requirements for the SAN Volume Controller environment
  Parts listing
  SAN Volume Controller 2145-SV1 parts
  SAN Volume Controller 2145-DH8 parts
  SAN Volume Controller F expansion enclosure parts
  SAN Volume Controller F expansion enclosure parts
  SAN Volume Controller F expansion enclosure parts
Chapter 3. User interfaces for servicing your system
  Management GUI interface
  When to use the management GUI
  Accessing the management GUI
  Deleting a node from a clustered system by using the management GUI
  Adding a node to a system
  Service assistant interface
  When to use the service assistant
  Accessing the service assistant
  Command-line interface
  When to use the CLI
  Accessing the system CLI
  Service command-line interface
  When to use the service CLI
  Accessing the service CLI
  USB flash drive interface
  Technician port
Chapter 4. Performing recovery actions using the SAN Volume Controller CLI
  Validating and repairing mirrored volume copies by using the CLI
  Repairing a thin-provisioned volume using the CLI
  Recovering offline volumes using the CLI
Chapter 5. Viewing the vital product data
  Downloading the vital product data using the management GUI
  Displaying the vital product data using the CLI
  Displaying node properties by using the CLI
  Displaying clustered system properties by using the CLI
  Fields for the node VPD
  Fields for the system VPD
Chapter 6. Diagnosing problems
  Starting statistics collection
  Event reporting
  Power-on self-test
  Understanding events
  Managing the event log
  Viewing the event log
  Describing the fields in the event log
  Event notifications
  Inventory information email
  Understanding the error codes
  Using the error code tables
  Event IDs
  SCSI event reporting
  Object types
  Error event IDs and error codes
  Resolving a problem with the SAN Volume Controller boot drives
  Resolving a problem with failure to boot
  Node error code overview
  Error code range
  Procedure: SAN problem determination
  Resolving a problem with SSL/TLS clients
  Procedure: Making drives support protection information
  Resolving a problem with new expansion enclosures
  Fibre Channel and 10G Ethernet link failures
  Ethernet iSCSI host-link problems
  Fibre Channel over Ethernet host-link problems
  Servicing storage systems
Chapter 7. Disaster recovery
Chapter 8. Recovery procedures
  Recover system procedure
  When to run the recover system procedure
  Fix hardware errors
  Removing system information for nodes with error code 550 or error code 578 using the service assistant
  Running system recovery by using the service assistant
  Recovering from offline volumes using the CLI
  What to check after running the system recovery
  Backing up and restoring the system configuration
  Backing up the system configuration using the CLI
  Restoring the system configuration
  Deleting backup configuration files using the CLI
  Completing the node rescue when the node boots
Chapter 9. Understanding the medium errors and bad blocks
Chapter 10. Using the maintenance analysis procedures
  MAP 5000: Start
  MAP 5040: Power SAN Volume Controller 2145-DH
  MAP 5350: Powering off a node
  Using the management GUI to power off a system
  Using the system CLI to power off a node
  Using the system power control button
  MAP 5500: Ethernet
  Defining an alternate configuration node
  MAP 5550: 10G Ethernet and Fibre Channel over Ethernet personality enabled adapter port
  MAP 5600: Fibre Channel
  MAP 5700: Repair verification
  MAP 5800: Light path
  Light path for SAN Volume Controller 2145-DH
Chapter 11. iSCSI performance analysis and tuning
Appendix A. Accessibility features for the system
Appendix B. Where to find the Statement of Limited Warranty
Notices
  Trademarks
  Product support statement
  Homologation statement
  Electromagnetic compatibility notices
  Canada Notice
  European Community and Morocco Notice
  Germany Notice
  Japan Electronics and Information Technology Industries Association (JEITA) Notice
  Japan Voluntary Control Council for Interference (VCCI) Notice
  Korea Notice
  People's Republic of China Notice
  Russia Notice
  Taiwan Notice
  United States Federal Communications Commission (FCC) Notice
Index

Copyright IBM Corp. 2003, 2017

Figures

Example of a system in a fabric
Data flow in a system
Example of a basic volume
Example of mirrored volumes
Example of stretched volumes
Example of HyperSwap volumes
Example of a standard system topology
Example of a stretched system topology
Example of a HyperSwap system topology
I/O groups with flash drives
Configuration node
SAN Volume Controller 2145-SV1 front panel
SAN Volume Controller 2145-DH8 front panel
SAN Volume Controller 2145-SV1 operator-information panel
SAN Volume Controller 2145-DH8 operator information panel
SAN Volume Controller 2145-SV1 rear-panel indicators
SAN Volume Controller 2145-DH8 rear-panel indicators
Connectors on the rear of the SAN Volume Controller 2145-SV
Power connector
SAN Volume Controller 2145-SV1 service ports
SAN Volume Controller 2145-SV1 unused Ethernet port
Fibre Channel port numbers in a typical configuration
Ethernet port numbers for iSCSI communication
Connectors on the rear of the SAN Volume Controller 2145-DH
Power connector
SAN Volume Controller 2145-DH8 service ports
SAN Volume Controller 2145-DH8 unused Ethernet port
Fibre Channel LEDs
SAN Volume Controller 2145-DH8 AC, DC, and power-error LEDs
SAN Volume Controller 2145-DH8 replaceable parts in an exploded view diagram
SV1 technician port
DH8 technician port
Example of inventory information email
Example of inventory information email
Node rescue display
SAN Volume Controller 2145-SV1 operator-information panel
SAN Volume Controller 2145-DH8 operator-information panel
SAN Volume Controller 2145-DH8 front panel
Power LED on the SAN Volume Controller 2145-DH
Power LED indicator on the rear panel of the SAN Volume Controller 2145-DH
AC, DC, and power-supply error LED indicators on the rear panel of the SAN Volume Controller 2145-DH
Power control button on the SAN Volume Controller 2145-DH8 model
Power control button and LED lights on the SAN Volume Controller 2145-SV1 model
Ethernet ports on the rear of the SAN Volume Controller 2145-DH
SAN Volume Controller 2145-DH8 operator-information panel
Press the release latch
SAN Volume Controller 2145-DH8 light path diagnostics panel
SAN Volume Controller 2145-DH8 system board LEDs


Tables

IBM websites for help, services, and information
SAN Volume Controller library
IBM documentation and related websites
IBM websites for help, services, and information
System topology and volume summary
System communications types
Optional features and models
PCI express expansion slot rules for 2145-SV1 nodes
PCI express expansion slot rules for 2145-DH8 nodes
The PCIe expansion slots in which an adapter can be used
Link status values for Fibre Channel LEDs
Input-voltage requirements
Power consumption
Physical specifications
Dimensions and weight
Additional space requirements
Maximum heat output of each SAN Volume Controller 2145-SV1 node
Input-voltage requirements
Power consumption
Physical specifications
Dimensions and weight
Additional space requirements
Maximum heat output of each 2145-DH8 node
FRUs in the SAN Volume Controller 2145-SV1 parts assembly
FRUs in the SAN Volume Controller 2145-DH8 parts assembly
FRUs to which SAN Volume Controller 2145-DH8 service procedures do not refer
FRU parts for the long-wave small form-factor pluggable (SFP) transceiver feature
Supported expansion enclosure SAS drives
Other expansion enclosure parts
Expansion enclosure field replaceable units
Drive field replaceable units
Cable field replaceable SAS units
Cable field replaceable power units
Expansion enclosure field replaceable units
Small-form factor SAS drives field replaceable units
Cable field replaceable units
Fields for the system board
Fields for the batteries
Fields for the processors
Fields for the fans
Fields that are repeated for each installed memory module
Fields that are repeated for each adapter that is installed
Fields that are repeated for each SCSI, IDE, SATA, and SAS device that is installed
Fields that are specific to the node software
Fields that are provided for the front panel assembly
Fields that are provided for the Ethernet port
Fields that are provided for the power supplies in the node
Fields that are provided for the SAS host bus adapter (HBA)
Fields that are provided for the SAS flash drive
Fields that are provided for the small form factor pluggable (SFP) transceiver
Fields that are provided for the system properties
Statistics collection for individual nodes
Statistic collection for volumes for individual nodes
Statistic collection for volumes that are used in Metro Mirror and Global Mirror relationships for individual nodes
Statistic collection for node ports
Statistic collection for nodes
Cache statistics collection for volumes and volume copies
Statistic collection for volume cache per individual nodes
XML statistics for an IP Partnership port
ODX VDisk and node level statistics
Statistics collection for cloud per cloud account id
Statistics collection for cloud per VDisk
Description of data fields for the event log
Notification levels
System notification types and corresponding syslog level codes
System values of user-defined message origin identifiers and syslog facility codes
Informational events
SCSI status
SCSI sense keys, codes, and qualifiers
Reason codes
Object types
Error event IDs and error codes
Message classification number range
Files created by the backup process
Bad block errors
Fibre Channel assemblies
System Fibre Channel adapter connection hardware
Diagnostics panel LEDs


About this guide

This guide describes how to troubleshoot the IBM SAN Volume Controller.

The chapters that follow introduce you to the SAN Volume Controller, expansion enclosure, the redundant AC-power switch, and the uninterruptible power supply. They describe how you can configure and check the status of one SAN Volume Controller node or a clustered system of nodes through the front panel, with the service assistant GUI, or with the management GUI.

The vital product data (VPD) chapter provides information about the VPD that uniquely defines each hardware and microcode element that is in the SAN Volume Controller. You can also learn how to diagnose problems using the SAN Volume Controller.

The maintenance analysis procedures (MAPs) can help you analyze failures that occur in a SAN Volume Controller. With the MAPs, you can isolate the field-replaceable units (FRUs) of the SAN Volume Controller that fail. Begin all problem determination and repair procedures from MAP 5000: Start on page 265.

Who should use this guide

This guide is intended for the system administrator or systems services representative who uses and diagnoses problems with the SAN Volume Controller, the redundant AC-power switch, and the uninterruptible power supply.

Emphasis

Different typefaces are used in this guide to show emphasis. The following typefaces are used to show emphasis.

Emphasis: Meaning
Boldface: Text in boldface represents menu items.
Bold monospace: Text in bold monospace represents command names.
Italics: Text in italics is used to emphasize a word. In command syntax, it is used for variables for which you supply actual values, such as a default directory or the name of a system.
Monospace: Text in monospace identifies the data or commands that you type, samples of command output, examples of program code or messages from the system, or names of command flags, parameters, arguments, and name-value pairs.

Library and related publications

Product manuals, other publications, and websites that contain information that is related to your system are available.

IBM Knowledge Center for SAN Volume Controller

The information collection in the IBM Knowledge Center contains all of the information that is required to install, configure, and manage the system. The information collection in the IBM Knowledge Center is updated between product releases to provide the most current documentation. The information collection is available at the following website:

SAN Volume Controller library

Unless otherwise noted, the publications in the library are available in Adobe portable document format (PDF) from a website.

ibm.com/shop/publications/order

Click Search for publications to find the online publications you are interested in, and then view or download the publication by clicking the appropriate item.

Table 1 lists websites where you can find help, services, and more information.

Table 1. IBM websites for help, services, and information

Website
Directory of worldwide contacts
Support for SAN Volume Controller (2145)
Support for IBM System Storage and IBM TotalStorage products

Each PDF publication in the Table 2 library is also available in the IBM Knowledge Center by clicking the title in the Link to PDF column:

Table 2. SAN Volume Controller library

Title: IBM SAN Volume Controller Model 2145-SV1 Hardware Installation Guide
Description: The guide provides the instructions that the IBM service representative uses to install the hardware for SAN Volume Controller model 2145-SV1.
Link to PDF file: Hardware Installation Guide [PDF]

Title: IBM SAN Volume Controller Hardware Maintenance Guide
Description: The guide provides the instructions that the IBM service representative uses to service the SAN Volume Controller hardware, including the removal and replacement of parts.
Link to PDF file: Hardware Maintenance Guide [PDF]

Table 2. SAN Volume Controller library (continued)

Title: IBM SAN Volume Controller Troubleshooting Guide
Description: The guide describes the features of each SAN Volume Controller model, explains how to use the front panel or service assistant GUI, and provides maintenance analysis procedures to help you diagnose and solve problems with the SAN Volume Controller.
Link to PDF file: Troubleshooting Guide [PDF]

Title: IBM Spectrum Virtualize for Public Cloud, IBM Spectrum Virtualize for SAN Volume Controller and Storwize Family Command-Line Interface User's Guide
Description: The guide describes the commands that you can use from the SAN Volume Controller command-line interface (CLI).
Link to PDF file: Command-Line Interface User's Guide [PDF]

IBM documentation and related websites

Table 3 lists websites that provide publications and other information about the SAN Volume Controller or related products or technologies. The IBM Redbooks publications provide positioning and value guidance, installation and implementation experiences, solution scenarios, and step-by-step procedures for various products.

Table 3. IBM documentation and related websites

Website: Address
IBM Publications Center: ibm.com/shop/publications/order
IBM Redbooks publications:

Related accessibility information

To view a PDF file, you need Adobe Reader, which can be downloaded from the Adobe website:

IBM Publications Center

The IBM Publications Center is a worldwide central repository for IBM product publications and marketing material.

The IBM Publications Center website offers customized search functions to help you find the publications that you need. You can view or download publications at no charge. Access the IBM Publications Center through the following website:

ibm.com/shop/publications/order

Related websites

The following websites provide information about the system, related products, or technologies.

Type of information: Website
SAN Volume Controller support:
Technical support for IBM storage products:
IBM Electronic Support registration: www-01.ibm.com/support/electronicsupport/

Sending comments

Your feedback is important in helping to provide the most accurate and highest quality information.

Procedure

To submit any comments about this publication or any other IBM storage product documentation, send your comments by email. Be sure to include the following information:
- Exact publication title and version
- Page, table, or illustration numbers that you are commenting on
- A detailed description of any information that should be changed

How to get information, help, and technical assistance

If you need help, service, technical assistance, or want more information about IBM products, you can find a wide variety of sources available from IBM to assist you.

Information

IBM maintains pages on the web where you can get information about IBM products and fee services, product implementation and usage assistance, break and fix service support, and the latest technical information. For more information, refer to Table 4.

Table 4. IBM websites for help, services, and information

Website
Directory of worldwide contacts
Support for SAN Volume Controller (2145)
Support for IBM System Storage and IBM TotalStorage products

Note: Available services, telephone numbers, and web links are subject to change without notice.

Help and service

Before you call for support, be sure to have your IBM Customer Number available. If you are in the US or Canada, you can call 1 (800) IBM SERV for help and service. From other parts of the world, see the directory of worldwide contacts for the number that you can call.

When you call from the US or Canada, choose the storage option. The agent decides where to route your call, to either storage software or storage hardware, depending on the nature of your problem.

If you call from somewhere other than the US or Canada, you must choose the software or hardware option when you call for assistance. Choose the software option if you are uncertain if the problem involves the SAN Volume Controller software or hardware. Choose the hardware option only if you are certain the problem solely involves the SAN Volume Controller hardware. When you call IBM to service the product, follow these guidelines for the software and hardware options:

Software option
Identify the SAN Volume Controller product as your product and supply your customer number as proof of purchase. The customer number is a 7-digit number assigned by IBM when the product is purchased. Your customer number might be on the customer information worksheet or on the invoice from your storage purchase. If asked for an operating system, use Storage.

Hardware option
Provide the serial number and appropriate 4-digit machine type. For SAN Volume Controller, the machine type is

In the US and Canada, hardware service and support can be extended to 24x7 on the same day. The base warranty is 9x5 on the next business day.

Getting help online

You can find information about products, solutions, partners, and support on the IBM website.

To find up-to-date information about products, services, and partners, visit the IBM website.

Before you call

Make sure that you take steps to try to solve the problem yourself before you call.

Some suggestions for resolving the problem before you call IBM Support include:
- Check all cables to make sure that they are connected.
- Check all power switches to make sure that the system and optional devices are turned on.
- Use the troubleshooting information in your system documentation. The troubleshooting section of the knowledge center contains procedures to help you diagnose problems.
- Go to the IBM Support website to check for technical information, hints, tips, and new device drivers or to submit a request for information.

Using the documentation

Information about your IBM storage system is available in the documentation that comes with the product.

That documentation includes printed documents, online documents, readme files, and help files in addition to the knowledge center. See the troubleshooting information for diagnostic instructions. The troubleshooting procedure might require you to download updated device drivers or software. IBM maintains pages on the web where you can get the latest technical information and download device drivers and updates. To access this information, go to the support website and follow the instructions. Also, some documents are available through the IBM Publications Center.

Sign up for the Support Line Offering

If you have questions about how to use and configure the machine, sign up for the IBM Support Line offering to get a professional answer.

The maintenance that is supplied with the system provides support when there is a problem with a hardware component or a fault in the system machine code. At times, you might need expert advice about using a function that is provided by the system or about how to configure the system. Purchasing the IBM Support Line offering gives you access to this professional advice for your system, and in the future.

Contact your local IBM sales representative or your support group for availability and purchase information.

Chapter 1. SAN Volume Controller overview

The SAN Volume Controller system combines software and hardware into a comprehensive, modular appliance that uses symmetric virtualization.

Symmetric virtualization is achieved by creating a pool of managed disks (MDisks) from the attached storage systems. Those storage systems are then mapped to a set of volumes for use by attached host systems. System administrators can view and access a common pool of storage on the storage area network (SAN). This functionality helps administrators to use storage resources more efficiently and provides a common base for advanced functions.

A SAN is a high-speed Fibre Channel network that connects host systems and storage devices. In a SAN, a host system can be connected to a storage device across the network. The connections are made through units such as routers and switches. The area of the network that contains these units is known as the fabric of the network.

IBM Real-time Compression software

IBM SAN Volume Controller system is built with IBM Spectrum Virtualize software, which is part of the IBM Spectrum Storage family.

IBM Spectrum Virtualize is a key member of the IBM Spectrum Storage portfolio. It is a highly flexible storage solution that enables rapid deployment of block storage services for new and traditional workloads, on-premises, off-premises, and in a combination of both. Designed to help enable cloud environments, it is based on proven technology. For more information about the IBM Spectrum Storage portfolio, see the following website.
The software provides these functions for the host systems that attach to the system:
- Creates a single pool of storage
- Provides logical unit virtualization
- Manages logical volumes
- Mirrors logical volumes

The system also provides the following functions:
- Large scalable cache
- Copy Services:
  - IBM FlashCopy (point-in-time copy) function, including thin-provisioned FlashCopy to make multiple targets affordable
  - IBM HyperSwap (active-active copy) function
  - Metro Mirror (synchronous copy)
  - Global Mirror (asynchronous copy)
  - Data migration
- Space management:
  - IBM Easy Tier function to migrate the most frequently used data to higher-performance storage
  - Metering of service quality when combined with IBM Spectrum Control Base Edition. For information, refer to the IBM Spectrum Control Base Edition documentation.
  - Thin-provisioned logical volumes
  - Compressed volumes to consolidate storage

Figure 1 shows hosts, system nodes, and RAID storage systems connected to a SAN fabric. The redundant SAN fabric comprises a fault-tolerant arrangement of two or more counterpart SANs that provide alternative paths for each SAN-attached device.

Figure 1. Example of a system in a fabric

Volumes

System nodes present volumes to the hosts. Most of the advanced system functions are defined on volumes. These volumes are created from managed disks (MDisks) that are presented by the RAID storage systems. The volumes can also be created by arrays that are provided by flash drives in an expansion enclosure. All data transfer occurs through the system node, which is described as symmetric virtualization.

Figure 2 on page 3 shows the data flow across the fabric.
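The relationship between managed disks, the storage pool, and volumes can be pictured with a small model. The following Python sketch is illustrative only: the class name StoragePool, its methods, and the capacities are hypothetical and are not part of the SAN Volume Controller software or CLI. It shows only the idea that volumes are carved from pooled MDisk capacity rather than from one particular back-end disk.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """Hypothetical model of a pool of managed disks (MDisks)."""

    mdisks: dict = field(default_factory=dict)   # MDisk name -> capacity in GiB
    volumes: dict = field(default_factory=dict)  # volume name -> size in GiB

    def add_mdisk(self, name: str, capacity_gib: int) -> None:
        """Add capacity that an attached storage system presents as an MDisk."""
        self.mdisks[name] = capacity_gib

    @property
    def free_gib(self) -> int:
        """Pool capacity that is not yet allocated to any volume."""
        return sum(self.mdisks.values()) - sum(self.volumes.values())

    def create_volume(self, name: str, size_gib: int) -> None:
        """Carve a volume out of the pooled capacity, not out of one MDisk."""
        if size_gib > self.free_gib:
            raise ValueError(f"pool has only {self.free_gib} GiB free")
        self.volumes[name] = size_gib


pool = StoragePool()
pool.add_mdisk("mdisk0", 500)
pool.add_mdisk("mdisk1", 500)
pool.create_volume("vol0", 800)  # larger than either MDisk on its own
print(pool.free_gib)  # 200
```

In this sketch a host would see only vol0; which MDisks back it is a detail of the pool.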

Figure 2. Data flow in a system (hosts send I/O to volumes; the nodes send I/O to managed disks)

The nodes in a system are arranged into pairs that are known as I/O groups. A single pair is responsible for serving I/O on a volume. Because a volume is served by two nodes, no loss of availability occurs if one node fails or is taken offline. The Asymmetric Logical Unit Access (ALUA) features of SCSI are used to disable the I/O for a node before it is taken offline or when a volume cannot be accessed via that node.

Volume types

You can create the following types of volumes on the system:
- Basic volumes, where a single copy of the volume is cached in one I/O group. Basic volumes can be established in any system topology; however, Figure 3 on page 4 shows a standard system topology.

  Figure 3. Example of a basic volume

- Mirrored volumes, where copies of the volume can either be in the same storage pool or in different storage pools. As Figure 4 shows, the volume is cached in a single I/O group. Typically, mirrored volumes are established in a standard system topology.

  Figure 4. Example of mirrored volumes

- Stretched volumes, where copies of a single volume are in different storage pools at different sites. The volume is cached in one I/O group. Stretched volumes are only available in stretched topology systems.

  Figure 5. Example of stretched volumes

- HyperSwap volumes, where copies of a single volume are in different storage pools that are on different sites. The volume is cached in two I/O groups that are on different sites. These volumes can be created only when the system topology is HyperSwap.

  Figure 6. Example of HyperSwap volumes

System topology

The topology property of a system can be set to one of the following states:

Note: You cannot mix I/O groups of different topologies in the same system.

- Standard topology, where all nodes in the system are at the same site.

  Figure 7. Example of a standard system topology

- Stretched topology, where each node of an I/O group is at a different site. When one site is not available, access to a volume can continue but with reduced performance.

  Figure 8. Example of a stretched system topology

- HyperSwap topology, where the system consists of at least two I/O groups. Each I/O group is at a different site. Both nodes of an I/O group are at the same site. A volume can be active on two I/O groups so that it can immediately be accessed by the other site when a site is not available.

  Figure 9. Example of a HyperSwap system topology

Summary of system topology and volumes

Table 5 summarizes the types of volumes that can be associated with each system topology.

Table 5. System topology and volume summary

            Volume type
Topology    Basic   Mirrored   Stretched   HyperSwap   Custom
Standard    X       X                                  X
Stretched   X                  X                       X
HyperSwap   X                              X           X

System management

A system is composed of individual nodes that present a single point of control for system management and service. System management and error reporting are provided through an Ethernet interface to one of the nodes in the system, which is called the configuration node. The configuration node runs a web server and provides a command-line interface (CLI). Any node in the system can be the configuration node. If the current configuration node fails, a new configuration node is selected from the remaining nodes. Each node also provides a command-line interface and web interface for initiating hardware service actions.
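The topology and volume summary in Table 5 lends itself to a simple lookup. The Python sketch below is a minimal illustration, not product code: the dictionary and function names are hypothetical, and the placement of the X marks is an assumption. It is inferred from the volume-type descriptions (basic volumes in any topology, mirrored volumes typically in a standard topology, stretched and HyperSwap volumes only in their own topologies), because the table columns are not recoverable with certainty from this transcription.

```python
# Assumed reading of Table 5: topology -> volume types that can be created.
VOLUME_TYPES_BY_TOPOLOGY = {
    "standard": {"basic", "mirrored", "custom"},
    "stretched": {"basic", "stretched", "custom"},
    "hyperswap": {"basic", "hyperswap", "custom"},
}


def volume_type_allowed(topology: str, volume_type: str) -> bool:
    """Return True if the volume type appears under the topology in the table."""
    key = topology.lower()
    if key not in VOLUME_TYPES_BY_TOPOLOGY:
        raise ValueError(f"unknown topology: {topology!r}")
    return volume_type.lower() in VOLUME_TYPES_BY_TOPOLOGY[key]


print(volume_type_allowed("standard", "mirrored"))
print(volume_type_allowed("standard", "HyperSwap"))
```

A check of this kind could be run before a volume is requested, so that a mismatch between topology and volume type is reported early.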

Fabric types

I/O operations between hosts and system nodes and between the nodes and arrays use the SCSI standard. The nodes communicate with each other through private SCSI commands.

All nodes that run system software version 6.4 or later can support Fibre Channel over Ethernet (FCoE) connectivity.

Table 6 shows the fabric type that can be used for communicating between hosts, nodes, and RAID storage systems. These fabric types can be used at the same time.

Table 6. System communications types

Communications type                        Host to        System nodes to   System nodes to
                                           system nodes   storage system    system nodes
Fibre Channel SAN                          Yes            Yes               Yes
iSCSI (1 Gbps Ethernet or 10 Gbps
  Ethernet, depending on the node)         Yes            Yes               No
Fibre Channel over Ethernet SAN
  (10 Gbps Ethernet)                       Yes            Yes               Yes

Flash drives

Some system nodes are attached to expansion enclosures that contain flash drives. These flash drives can be used to create RAID-managed disks (MDisks) that in turn can be used to create volumes. Flash drives are in an expansion enclosure that is connected to both sides of an I/O group.

Flash drives provide host servers with a pool of high-performance storage for critical applications. Figure 10 shows this configuration. MDisks on flash drives can also be placed in a storage pool with MDisks from regular RAID storage systems. IBM Easy Tier performs automatic data placement within that storage pool by moving high-activity data onto better-performing storage.

Figure 10. I/O groups with flash drives (hosts send I/O to volumes, which are mapped to internal solid-state drives)
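The communications matrix in Table 6 can be queried in the same way. The following Python sketch is a hedged illustration: the names LINKS, FABRIC_SUPPORT, and fabrics_for are hypothetical, and the Yes and No values are read from the flattened table in this transcription (iSCSI being the one type that is not used between system nodes).

```python
# Link kinds, in the column order of Table 6.
LINKS = ("host_to_node", "node_to_storage", "node_to_node")

# Communications type -> support per link kind, as read from Table 6.
FABRIC_SUPPORT = {
    "Fibre Channel SAN": (True, True, True),
    "iSCSI": (True, True, False),
    "Fibre Channel over Ethernet SAN": (True, True, True),
}


def fabrics_for(link: str) -> list:
    """List the communications types that can carry the given kind of link."""
    index = LINKS.index(link)  # raises ValueError for an unknown link kind
    return [name for name, row in FABRIC_SUPPORT.items() if row[index]]


print(fabrics_for("node_to_node"))
```

For node-to-node traffic the sketch returns only the two Fibre Channel based types, which mirrors the single No entry in the table.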

SAN Volume Controller nodes

Each node is an individual server in a SAN Volume Controller clustered system on which the SAN Volume Controller software runs.

The nodes are always installed in pairs; a minimum of one pair and a maximum of four pairs of nodes constitute a system. Each pair of nodes is known as an I/O group.

I/O groups take the storage that is presented to the SAN by the storage systems as MDisks and transform the storage into logical disks (volumes) that are used by applications on the hosts. A node is in only one I/O group and provides access to the volumes in that I/O group.

SAN Volume Controller 2145-SV1 node features

The SAN Volume Controller 2145-SV1 system has the following features:
- A 19-inch rack-mounted enclosure
- Two 8-core processors
- 64 GB base memory per processor. Optionally, by adding 64 GB of memory, the processor can support 128 GB, 192 GB, or 256 GB of memory.
- Eight small form factor (SFF) drive bays at the front of the control enclosure
- Support for various optional host adapters, including:
  - 4-port 16 Gbps Fibre Channel adapters
  - 4-port 10 Gbps Fibre Channel over Ethernet (FCoE) adapters for host attachment
  - 4-port 12 Gbps SAS cards to attach to expansion enclosures
- Support for iSCSI host attachment (10 Gbps Ethernet)
- Support for expansion enclosures to support more drives:
  - SAN Volume Controller F expansion enclosure houses up to 92 flash drives (SFF or LFF drives) and two secondary expander modules
  - SAN Volume Controller F houses up to 24 SFF flash drives
  - SAN Volume Controller F houses up to 12 large form factor (LFF) HDD or flash drives
- Support for optional Compression Accelerator cards for IBM Real-time Compression
- Dual redundant power supplies
- Dual redundant batteries
- A dedicated technician port to initialize or service the system

SAN Volume Controller 2147-SV1 node features

The SAN Volume Controller 2147-SV1 system includes all of the features of the SAN Volume Controller 2145-SV1 system plus Enterprise Class Support and a three-year warranty.
SAN Volume Controller 2145-DH8 node features

The SAN Volume Controller 2145-DH8 node has the following features:
• A 19-inch rack-mounted enclosure
• At least one Fibre Channel adapter or one 10 Gbps Ethernet adapter
• Optional second, third, and fourth Fibre Channel adapters
• 32 GB memory per processor
• One or two eight-core processors
• Dual redundant power supplies
• Dual redundant batteries for better reliability, availability, and serviceability
• SAN Volume Controller F expansion enclosure to house up to 92 flash drives (SFF or LFF drives) and two secondary expander modules
• Up to two SAN Volume Controller F expansion enclosures to house up to 24 flash drives each
• SAN Volume Controller F expansion enclosures to house up to 12 LFF HDD or flash drives
• iSCSI host attachment (1 Gbps Ethernet and optional 10 Gbps Ethernet)
• Support for optional IBM Real-time Compression
• A dedicated technician port for local access to the initialization tool or the service assistant interface

Systems

Systems are collections of nodes. A system can consist of two to eight nodes. All configuration settings are replicated across all nodes in the system. Management IP addresses are assigned to the system. Each interface accesses the system remotely through the Ethernet system-management addresses, also known as the primary and secondary system IP addresses.

Configuration node

A configuration node is a single node that manages configuration activity of the system.

If the configuration node fails, the system chooses a new configuration node. This action is called configuration node failover. The new configuration node takes over the management IP addresses. Thus, you can access the system through the same IP addresses although the original configuration node has failed. During the failover, there is a short period when you cannot use the command-line tools or management GUI.

Figure 11 shows an example of a clustered system that contains four nodes. Node 1 is the configuration node. User requests (1) are handled by node 1.

Figure 11. Configuration node
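The failover behavior described above can be modeled in a few lines. The following Python sketch is a toy model, not the system's implementation: the class name `ClusteredSystem` is hypothetical, the addresses come from documentation address ranges, and the rule that the lowest-numbered surviving node becomes the configuration node is an assumption made only for illustration (the text does not say how the new node is chosen). What the sketch does reflect from the text is that the management IP addresses stay the same after failover.

```python
class ClusteredSystem:
    """Toy model of configuration node failover."""

    def __init__(self, node_ids, management_ips):
        self.online = sorted(node_ids)
        # The management IP addresses belong to the system, not to a node.
        self.management_ips = list(management_ips)
        self.config_node = self.online[0]

    def fail_node(self, node_id):
        """Mark a node as failed and return the current configuration node."""
        self.online.remove(node_id)
        if node_id == self.config_node:
            if not self.online:
                raise RuntimeError("no node left to host the configuration interfaces")
            # Assumption: choose the lowest-numbered remaining node. The
            # real selection rule is not described in the text.
            self.config_node = self.online[0]
        return self.config_node


if __name__ == "__main__":
    system = ClusteredSystem([1, 2, 3, 4], ["192.0.2.10", "198.51.100.10"])
    print("configuration node:", system.config_node)
    print("after node 1 fails:", system.fail_node(1))
    print("management IPs unchanged:", system.management_ips)
```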

Configuration node addressing

At any given time, only one node within a SAN Volume Controller clustered system is assigned the management IP addresses. An IP address for the clustered system must be assigned to Ethernet port 1. An IP address can also be assigned to Ethernet port 2. These are the only ports that can be assigned management IP addresses.

This node then acts as the focal point for all configuration and other requests that are made from the management GUI application or the CLI. This node is known as the configuration node.

If the configuration node is stopped or fails, the remaining nodes in the system determine which node will take on the role of configuration node. The new configuration node binds the management IP addresses to its Ethernet ports. It broadcasts this new mapping so that connections to the system configuration interface can be resumed.

The new configuration node broadcasts the new IP address mapping using the Address Resolution Protocol (ARP). You must configure some switches to forward the ARP packet on to other devices on the subnetwork. Ensure that all Ethernet devices are configured to pass on unsolicited ARP packets. Otherwise, if the ARP packet is not forwarded, a device loses its connection to the SAN Volume Controller system.

If a device loses its connection to the SAN Volume Controller system, it can regenerate the address quickly if the device is on the same subnetwork as the system. However, if the device is not on the same subnetwork, it might take hours for the address resolution cache of the gateway to refresh. In this case, you can restore the connection by establishing a command-line connection to the system from a terminal that is on the same subnetwork, and then by starting a secure copy to the device that has lost its connection.

Management IP failover

If the configuration node fails, the IP addresses for the clustered system are transferred to a new node. The system services are used to manage the transfer of the management IP addresses from the failed configuration node to the new configuration node.
The following changes are performed by the system service:
• If software on the failed configuration node is still operational, the software shuts down the management IP interfaces. If the software cannot shut down the management IP interfaces, the hardware service forces the node to shut down.
• When the management IP interfaces shut down, all remaining nodes choose a new node to host the configuration interfaces.
• The new configuration node initializes the configuration daemons, including SSHD and HTTPD, and then binds the management IP interfaces to its Ethernet ports.
• The router is configured as the default gateway for the new configuration node.
• The routing tables are established on the new configuration node for the management IP addresses. The new configuration node sends five unsolicited Address Resolution Protocol (ARP) packets for each IP address to the local subnet broadcast address. The ARP packets contain the management IP and the Media Access Control (MAC) address for the new configuration node. All systems that receive ARP packets are forced to update their ARP tables. After the ARP tables are updated, these systems can connect to the new configuration node.

Note: Some Ethernet devices might not forward ARP packets. If the ARP packets are not forwarded, connectivity to the new configuration node cannot be established automatically. To avoid this problem, configure all Ethernet devices to pass unsolicited ARP packets. You can restore lost connectivity by logging in to the system and starting a secure copy to the affected system. Starting a secure copy forces an update to the ARP cache for all systems that are connected to the same switch as the affected system.

Ethernet link failures

If the Ethernet link to the system fails because of an event that is unrelated to the system, the system does not attempt to fail over the configuration node to restore management IP access. For example, the Ethernet link can fail if a cable is disconnected or an Ethernet router fails. To protect against this type of failure, the system provides the option for two Ethernet ports that each have a management IP address. If you cannot connect through one IP address, attempt to access the system through the alternative IP address.

Note: IP addresses that are used by hosts to access the system over an Ethernet connection are different from management IP addresses.

Routing considerations for event notification and Network Time Protocol

The system supports the following protocols that make outbound connections:
• Email
• Simple Network Management Protocol (SNMP)
• Syslog
• Network Time Protocol (NTP)

These protocols operate only on a port that is configured with a management IP address. When it is making outbound connections, the system uses the following routing decisions:
• If the destination IP address is in the same subnet as one of the management IP addresses, the system sends the packet immediately.
• If the destination IP address is not in the same subnet as either of the management IP addresses, the system sends the packet to the default gateway for Ethernet port 1.
• If the destination IP address is not in the same subnet as either of the management IP addresses and Ethernet port 1 is not connected to the Ethernet network, the system sends the packet to the default gateway for Ethernet port 2.

When you configure any of these protocols for event notifications, use these routing decisions to ensure that error notification works correctly if the network fails.
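The three routing decisions above can be checked against a planned network layout before you configure a notification server. The following Python sketch is an illustration under stated assumptions: the function name `choose_route`, the dictionary layout, and the example addresses (taken from documentation address ranges) are hypothetical, and the sketch models only the rules as written in the text. In particular, the text does not say what happens when the destination shares a subnet with a port whose link is down, so the sketch does not model that case.

```python
import ipaddress


def choose_route(destination, ports, port1_connected=True):
    """Return (method, ethernet_port, next_hop) for an outbound packet.

    ports maps an Ethernet port number (1 or 2) to a dict with the
    management "address" in CIDR form and the default "gateway".
    """
    dest = ipaddress.ip_address(destination)
    # Rule 1: destination is in the same subnet as a management IP address.
    for number, cfg in sorted(ports.items()):
        if dest in ipaddress.ip_interface(cfg["address"]).network:
            return ("direct", number, str(dest))
    # Rule 2: otherwise use the default gateway for Ethernet port 1.
    if port1_connected and 1 in ports:
        return ("gateway", 1, ports[1]["gateway"])
    # Rule 3: port 1 is not connected, so use the gateway for port 2.
    if 2 in ports:
        return ("gateway", 2, ports[2]["gateway"])
    return ("unreachable", None, None)


if __name__ == "__main__":
    ports = {
        1: {"address": "192.0.2.10/24", "gateway": "192.0.2.1"},
        2: {"address": "198.51.100.10/24", "gateway": "198.51.100.1"},
    }
    print(choose_route("198.51.100.77", ports))
    print(choose_route("203.0.113.5", ports))
    print(choose_route("203.0.113.5", ports, port1_connected=False))
```

Running the sketch for each notification target shows which port and gateway the target depends on, which helps when deciding where to place an email, SNMP, or syslog server so that notifications survive a single link failure.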

SAN fabric overview

The SAN fabric is an area of the network that contains routers and switches. A SAN is configured into a number of zones. A device that uses the SAN can communicate only with devices that are included in the same zones that it is in. A system requires several distinct types of zones: a system zone, host zones, and disk zones. The intersystem zone is optional.

In the host zone, the host systems can identify and address the nodes. You can have more than one host zone and more than one disk zone. Unless you are using a dual-core fabric design, the system zone contains all ports from all nodes in the system. Create one zone for each host Fibre Channel port. In a disk zone, the nodes identify the storage systems. Generally, create one zone for each external storage system. If you are using the Metro Mirror and Global Mirror feature, create a zone with at least one port from each node in each system; up to four systems are supported.

Note: Some operating systems cannot tolerate other operating systems in the same host zone, although you might have more than one host type in the SAN fabric. For example, you can have a SAN that contains one host that runs on an IBM AIX operating system and another host that runs on a Microsoft Windows operating system.

All communication between the system nodes is performed through the SAN. All of the system configuration and service commands are sent to the system through an Ethernet network.
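The zoning rule in this section (a device can communicate only with devices that share at least one zone with it) can be sketched as a set-membership test. The following Python example is a minimal sketch; the function name `can_communicate`, the zone names, and the port names are hypothetical and exist only to illustrate the rule. It does not read zoning from a real switch.

```python
def can_communicate(zones, device_a, device_b):
    """Return True if the two devices are members of at least one common zone.

    zones maps a zone name to the set of device ports in that zone.
    """
    return any(
        device_a in members and device_b in members
        for members in zones.values()
    )


if __name__ == "__main__":
    # Hypothetical layout: one system zone, one host zone, one disk zone.
    zones = {
        "system_zone": {"node1_p1", "node2_p1"},
        "host_zone_a": {"hostA_p1", "node1_p1", "node2_p1"},
        "disk_zone_1": {"node1_p1", "node2_p1", "storage1_p1"},
    }
    print(can_communicate(zones, "hostA_p1", "node1_p1"))
    print(can_communicate(zones, "hostA_p1", "storage1_p1"))
```

In this layout the host reaches the nodes through the host zone, and the nodes reach the storage system through the disk zone, but the host and the storage system share no zone and so cannot communicate directly.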

Chapter 2. Introducing the SAN Volume Controller hardware components

A SAN Volume Controller system consists of SAN Volume Controller nodes and related hardware components, such as uninterruptible power supply units and the optional redundant AC-power switches. Note that nodes and uninterruptible power supply units are installed in pairs.

SAN Volume Controller nodes

The system supports several different types of models. The following nodes are supported:
• The SAN Volume Controller 2145-DH8 node is available for purchase, with the following features:
  - At least one Fibre Channel adapter or one 10 Gbps Ethernet adapter
  - Optional second and third Fibre Channel adapters
  - Up to two SAN Volume Controller F expansion enclosures to house optional flash drives
  - iSCSI host attachment (1 Gbps Ethernet and optional 10 Gbps Ethernet)
• The SAN Volume Controller 2145-SV1 node is available for purchase, with the following features:
  - At least one Fibre Channel adapter or one 10 Gbps Ethernet adapter
  - Optional second and third Fibre Channel adapters
  - Up to two SAN Volume Controller F expansion enclosures to house optional flash drives
  - iSCSI host attachment (1 Gbps Ethernet and optional 10 Gbps Ethernet)

A label on the front of the node indicates the node type, hardware revision (if appropriate), and serial number.

Optional features

The SAN Volume Controller 2145-SV1 and SAN Volume Controller 2145-DH8 nodes support optional features, which can be installed concurrently.

Table 7 lists the optional features and models that can be installed on your SAN Volume Controller 2145-SV1 or SAN Volume Controller 2145-DH8 system. Only an IBM service support representative (SSR) can remove or install adapters on the system.

Table 7. Optional features and models

Feature or model | Description | Minimum software level required | Maximum per 2145-DH8 node | Maximum per 2145-SV1 node
F | SAN Volume Controller expansion enclosure for inch SAS drive slots

Table 7. Optional features and models (continued)

Feature or model Description F SAN Volume Controller expansion enclosure, which is required for 2.5-inch SAS drives Minimum software level required for up to two enclosures for up to two enclosures (12F, 24F, or both) Maximum per 2145-DH8 node Maximum per 2145-SV1 node F SAN Volume Controller expansion enclosure, which is required for 3.5-inch SAS drives for up to 20 enclosures for up to two enclosures for up to two enclosures (12F, 24F, or both)

AH10: 4-port 8 Gbps Fibre Channel adapter with four short-wave SFP transceivers
  Notes:
  • The maximum fiber length is 10 km when used with feature AH1T and single-mode fiber optic cable.
  • If you are installing four adapters, software level is required.
AH11: 2-port 16 Gbps Fibre Channel adapter with two short-wave SFP transceivers. The maximum fiber length is 10 km when used with feature ACHU and single-mode fiber optic cable.
AH12: 4-port 10 Gbps Ethernet adapter with four SFP transceivers
AH13: 4-port 12 Gbps SAS adapter. Required for Model 24F attachment
AH14: 4-port 16 Gbps Fibre Channel adapter with four short-wave SFP transceivers
  Note: The maximum fiber length is 5 km when used with feature ACHU and single-mode fiber optic cable.
AH1T: Two 8 Gbps Fibre Channel long-wave SFP transceivers for optional use with feature AH10
ACHU: Two 16 Gbps Fibre Channel long-wave SFP transceivers for optional use with feature AH11 or AH14
AH1A: Compression accelerator. Requires feature AH1B

for up to 20 enclosures or (Required to support two adapters.) N/A

Table 7. Optional features and models (continued)

Feature or model Description Minimum software level required Maximum per 2145-DH8 node Maximum per 2145-SV1 node AH1B Second microprocessor and 32 GB RAM (in base) AH20 AH21 AH22 AH23 AH24 AH30 AH31 AH32 AH GB 12 Gbps SAS 2.5-inch tier 0 flash drive 400 GB 12 Gbps SAS 2.5-inch tier 0 flash drive 800 GB 12 Gbps SAS 2.5-inch tier 0 flash drive 1.6 TB 12 Gbps SAS 2.5-inch tier 0 flash drive 3.2 TB 12 Gbps SAS 2.5-inch tier 0 flash drive 4.0 TB 7.2 K RPM 3.5-inch nearline hard disk drive 6.0 TB 7.2 K RPM 3.5-inch nearline hard disk drive 8.0 TB 7.2 K RPM 3.5-inch nearline hard disk drive 10.0 TB 7.2 K RPM 3.5-inch nearline hard disk drive AH GB 15 K RPM 2.5-inch hard disk drive AH GB 15 K RPM 2.5-inch hard disk drive AH GB 10 K RPM 2.5-inch hard disk drive AH TB 10 K RPM 2.5-inch hard disk drive AH TB 10 K RPM 2.5-inch hard disk drive AH TB 7.2 K RPM 2.5-inch nearline hard disk drive AH2A 1.92 TB 2.5-inch SAS tier 1 flash drive AH2B 3.84 TB 2.5-inch SAS tier 1 flash drive AH2C 7.68 TB 2.5-inch SAS tier 1 flash drive AH2D 15.3 TB 2.5-inch SAS tier 1 flash drive AH GB 15 K RPM 2.5-inch SAS disk drive AH TB 10 K RPM 2.5-inch SAS disk drive AH70 AH73 AH GB 15 K RPM SAS hard disk drive for 92F 1.2 TB 10 K RPM SAS hard disk drive for 92F 1.8 TB 10 K RPM SAS hard disk drive for 92F AH TB 10 K RPM SAS disk drive for 92F AH77 AH78 6 TB 7.2 K RPM nearline SAS hard disk drive for 92F 8 TB 7.2 K RPM nearline SAS hard disk drive for 92F

Table 7. Optional features and models (continued)

Feature or model | Description
AH79 | 10 TB 7.2 K RPM nearline SAS hard disk drive for 92F
AH7D | 1.6 TB SAS tier 0 flash drive for 92F
AH7E | 3.2 TB SAS tier 0 flash drive for 92F
AH7J | 1.92 TB SAS tier 1 flash drive for 92F
AH7K | 3.84 TB SAS tier 1 flash drive for 92F
AH7L | 7.68 TB SAS tier 1 flash drive for 92F
AH7M | 15.3 TB SAS tier 1 flash drive for 92F

2145-SV1 PCI express expansion slot rules

Table 9 on page 19 lists the optional adapters that are supported in each PCI express expansion slot.

Table 8. PCI express expansion slot rules for 2145-SV1 nodes

PCIe slot 1 None Options that are supported in the specified slot 2 12 Gbps SAS adapter port 16 Gbps Fibre Channel 10 Gbps Ethernet (see Note) 4-port 16 Gbps Fibre Channel 10 Gbps Ethernet (see Note) 12 Gbps SAS adapter Compression accelerator 4-port 16 Gbps Fibre Channel 10 Gbps Ethernet (see Note) 4-port 16 Gbps Fibre Channel 10 Gbps Ethernet (see Note) 8 Compression accelerator

Note: With software level 8.1.1, each node can support two 10 Gbps Ethernet adapters (see Table 7 on page 15). If a 10 Gbps Ethernet adapter is being used for FCoE connectivity, avoid installing the second 10 Gbps Ethernet adapter in a PCIe expansion slot that is below the first adapter. When more than one 10 Gbps Ethernet adapter is installed, only the first four Ethernet ports that are detected by the system and displayed by the lsportip command will support FCoE. The remaining ports will not support FCoE and any existing FCoE zones will break. When the node is added back to the system cluster, you must manually reconfigure the zoning to make the first 10 Gbps Ethernet adapter ports visible to the host again.

2145-DH8 PCI express expansion slot rules

Use the rules in Table 9 on page 19 to see which adapters are supported in each PCI express expansion slot.

Table 9. PCI express expansion slot rules for 2145-DH8 nodes

PCIe slot Options that are supported in the specified 2145-DH8 slot Gbps Fibre Channel 2-port 16 Gbps Fibre Channel 4-port 16 Gbps Fibre Channel 8 Gbps Fibre Channel 2-port 16 Gbps Fibre Channel 4-port 16 Gbps Fibre Channel 10 Gbps Ethernet (see note 1) 8 Gbps Fibre Channel (see note 2) 2-port 16 Gbps Fibre Channel 4-port 16 Gbps Fibre Channel 12 Gbps SAS 4 Compression accelerator (see note 3) 5 8 Gbps Fibre Channel 2-port 16 Gbps Fibre Channel 4-port 16 Gbps Fibre Channel 10 Gbps Ethernet (see note 1) 6 Compression accelerator (see note 3)

Notes:
1. With software level 8.1.1, each node can support two 10 Gbps Ethernet adapters (see Table 7 on page 15).
   Note: If a 10 Gbps Ethernet adapter is being used for FCoE connectivity, avoid installing the second 10 Gbps Ethernet adapter in a PCIe expansion slot that is below the first adapter. When more than one 10 Gbps Ethernet adapter is installed, only the first four Ethernet ports that are detected by the system and displayed by the lsportip command will support FCoE. The remaining ports will not support FCoE and any existing FCoE zones will break. When the node is added back to the system cluster, you must manually reconfigure the zoning to make the first 10 Gbps Ethernet adapter ports visible to the host again.
2. An 8 Gbps Fibre Channel adapter in slot 3 needs a minimum software level of
3. If the system has only one compression accelerator, it can be installed in either slot 4 or 6.

Node controls and indicators

The controls and indicators provide information about the system status and activity. They also help to identify the node.

SAN Volume Controller 2145-SV1 front panel controls and indicators

The controls and indicators on the front panel are used for power and to indicate information such as system activity, node failures, and node identification.

Figure 12 on page 20 shows the controls and indicators on the front panel of the SAN Volume Controller 2145-SV1.

Figure 12. SAN Volume Controller 2145-SV1 front panel
1 Boot drive activity LED
2 Boot drive status LED
3 Pullout tab showing 11S serial number
4 Power-control button and power-on LED
5 Identify LED
6 Node status LED
7 Node fault LED
8 Battery status LED
9 Operator-information panel
10 Front USB ports
11 Right side latch (releases chassis to slide out on rails)
12 Drive slot fillers (no empty slots can be used)
13 Boot drives
14 Battery fault LED
15 Batteries
16 Left side latch (releases chassis to slide out on rails)
17 Machine type and model (MTM) and serial number

Boot drive activity LED

The green drive activity LED indicates one of the following conditions.
Off: The drive is not ready for use.
Flashing: The drive is in use.
On: The drive is ready for use, but is not in use.

Boot drive status LED

The amber drive status LED indicates one of the following conditions.
Off: The drive is in a good state or has no power.
Flashing: The drive is being identified.
On: The drive failed.

Battery fault LED

The amber battery fault LED indicates one of the following conditions.
Off: The battery is functioning normally.
Flashing: The battery is being identified.
On: The battery failed.

SAN Volume Controller 2145-DH8 front panel controls and indicators

The controls and indicators on the front panel are used for power and to indicate information such as system activity, node failures, and node identification.

Figure 13 shows the controls and indicators on the front panel of the SAN Volume Controller 2145-DH8.

Figure 13. SAN Volume Controller 2145-DH8 front panel
1 Hard disk drive activity LED
2 Hard disk drive status LED
3 USB port
4 Video connector
5 Operator-information panel
6 Rack release latch
7 Node status LED
8 Node fault LED
9 Battery status LED
10 Battery fault LED
11 Batteries
12 Hard disk drives (boot drives)

Node status LED

The node status LED provides the following system activity indicators:
Off: The node is not operating as a member of a system.
On: The node is operating as a member of a system.
Slow blinking: The node is in candidate or service state.
Fast blinking: The node is dumping cache and state data to the local disk in anticipation of a system restart from a pending power-off action or other controlled restart sequence.

Node fault LED

A node fault is indicated by the amber node-fault LED.
Off: The node does not have any errors that prevent it from doing I/O or the system software is not running on the node.
On: The node has an unrecoverable node error and is not part of the system.

Battery status LED

The green battery status LED indicates one of the following battery conditions.
Off: The system software is not running on the node or the state of the system cannot be saved if power to the node is lost.
Fast blinking: Battery charge level is too low for the state of the system to be saved if power to the node is lost. Batteries are charging.
Slow blinking: Battery charge level is sufficient for the state of the system to be saved once if power to the node is lost.
On: Battery charge level is sufficient for the state of the system to be saved twice if power to the node is lost.

Battery fault LED

The amber battery fault LED indicates one of the following battery conditions.
Off: The system software is not running on the node or this battery does not have a fault.
Blinking: This battery is being identified.
On: This battery has a fault. It cannot be used to save the system state if power to the node is lost.

Hard disk drive activity LED

The green drive activity LED indicates one of the following conditions.
Off: The drive is not ready for use.
Flashing: The drive is in use.
On: The drive is ready for use, but is not in use.

Hard disk drive status LED

The amber drive status LED indicates one of the following conditions.
Off: The drive is in a good state or has no power.
Blinking: The drive is being identified.
On: The drive has failed.

Node status LED

System activity is indicated through the green node-status LED. The node status LED provides the following system activity indicators:
Off: The node is not operating as a member of a system.
On: The node is operating as a member of a system.
Slow blinking: The node is in candidate or service state.
Fast blinking: The node is dumping cache and state data to the local disk in anticipation of a system reboot from a pending power-off action or other controlled restart sequence.

Product serial number

The node contains a product serial number that is written to the system board hardware. The product serial number is also printed on the serial number label that is on the front panel.

This number is used for warranty and service entitlement checking and is included in the data that is sent with error reports.

Remember: Do not change this number during the life of the product. If the system board is replaced, you must follow the system board replacement instructions carefully and rewrite the serial number on the system board.

Node operator-information panel

The operator-information panel is located on the front panel of the node.

SAN Volume Controller 2145-SV1 operator-information panel

The operator-information panel contains buttons and indicators such as the power-control button, and LEDs that provide node information.

Figure 14 on page 24 shows the operator-information panel for the SAN Volume Controller 2145-SV1.

Figure 14. SAN Volume Controller 2145-SV1 operator-information panel
1 Power-control button and power-on LED
2 Identify LED
3 Node status LED
4 Node fault LED
5 Battery status LED

Power LED

The green power LED indicates one of the following power conditions.
Off: One or more of the following are true:
  • No power is present at the power supply input.
  • The power supply has failed.
  • The LED has failed.
On: The node is turned on.
Blinking: The node is turned off, but is still connected to a power source.

Power button

The power button turns main power on or off for the SAN Volume Controller.
• To turn on the power, press and release the power button.
• To turn off the power, press and release the power button.

For more information about what to check before you turn off the SAN Volume Controller node, see MAP 5350: Powering off a node.

Attention: When the node is operational and you press and immediately release the power button, the SAN Volume Controller writes its control data to its internal disk and then turns off. This process can take up to 5 minutes.

Identify LED

This LED blinks if the Identify button on the back of a node is pressed. The Identify LED blinks on both the front and rear panels. Use this feature to find a specific node in the data center.

After the SAN Volume Controller system is initialized and initial setup is completed, you can use the management GUI to identify a node by making the Identify LED on the node blink.

Node status LED

The green node status LED has the following states:
Off: The SAN Volume Controller software is not running or cannot communicate with this LED.
On: This node is active in a SAN Volume Controller system.
Slow blink: This node is not active. It has Candidate or Service status.
Fast blink: The node is dumping cache and state data to the local disk in anticipation of a system reboot from a pending power-off action or other controlled restart sequence.

Node fault LED

The yellow node fault LED has the following states:
Off: No warning or critical error is shown in the baseboard management controller (BMC) event log, and no fatal node error is reported by the SAN Volume Controller software.
On: The SAN Volume Controller software indicates a fatal node error.
Blinking: A warning or critical error is shown in the BMC event log.

Battery status LED

The green battery status LED has the following states:
Off: Hardened data is not saved if there is a power loss or the SAN Volume Controller software is not running.
On: Battery charge level is sufficient for the hardened data to be saved twice if power to the node is lost.
Slow blink: Battery charge level is sufficient for the hardened data to be saved once if power to the node is lost.
Fast blink: Battery charge level is too low for the hardened data to be saved if power to the node is lost. Batteries are charging.

SAN Volume Controller 2145-DH8 operator information panel

The operator-information panel indicates information such as system board errors, Ethernet activity, and power status.

Figure 15 on page 26 shows the operator-information panel for the SAN Volume Controller 2145-DH8.
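When LED observations are recorded during troubleshooting, the battery status LED states listed for the 2145-SV1 operator-information panel can be kept as a simple lookup. The following Python sketch is illustrative only: the names `BATTERY_STATUS_LED` and `saves_available` are hypothetical, and the table encodes nothing beyond the four states given in the text (how many times hardened data can be saved if power is lost).

```python
# Observed state of the green battery status LED mapped to the number of
# times hardened data can be saved if power to the node is lost.
BATTERY_STATUS_LED = {
    "off": 0,         # data is not saved, or the software is not running
    "fast blink": 0,  # charge too low; batteries are charging
    "slow blink": 1,  # sufficient charge to save once
    "on": 2,          # sufficient charge to save twice
}


def saves_available(led_state):
    """Return how many saves the observed LED state indicates."""
    key = led_state.strip().lower()
    if key not in BATTERY_STATUS_LED:
        raise ValueError(f"unknown battery status LED state: {led_state!r}")
    return BATTERY_STATUS_LED[key]


if __name__ == "__main__":
    for state in ("On", "Slow blink", "Fast blink", "Off"):
        print(f"{state}: {saves_available(state)} save(s) available")
```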

Figure 15. SAN Volume Controller 2145-DH8 operator information panel
1 Power-control button and power-on LED (green)
2 Ethernet icon
3 System-locator button and LED (blue)
4 Release latch for the light path diagnostics panel
5 Ethernet activity LEDs
6 Check log LED
7 System-error LED (yellow)

Note: If the node has more than four Ethernet ports, activity on ports five and above is not reflected on the operator-information panel Ethernet activity LEDs.

System-error LED

When it is lit, the system-error LED indicates that a system-board error has occurred.

This amber LED lights up if the hardware detects an unrecoverable error that requires a new field-replaceable unit (FRU). To isolate the faulty FRU, see MAP 5800: Light path.

Disk drive activity LED

When it is lit, the green disk drive activity LED indicates that the disk drive is in use.

Reset button

If a reset button is available on your SAN Volume Controller node, do not use it.

Attention: If you use the reset button, the node restarts immediately without the SAN Volume Controller control data being written to the disk. Service actions are then required to make the node operational again.

Power button

The power button turns main power on or off for the SAN Volume Controller.

To turn on the power, press and release the power button. You must have a pointed device, such as a pen, to press the button.

To turn off the power, press and release the power button. For more information about how to turn off the SAN Volume Controller node, see MAP 5350: Powering off a SAN Volume Controller node.

Attention: When the node is operational and you press and immediately release the power button, the SAN Volume Controller writes its control data to its internal disk and then turns off. This process can take up to 5 minutes. If you press the power button but do not release it, the node turns off immediately without the SAN Volume Controller control data being written to disk. Service actions are then required to make the SAN Volume Controller operational again. Therefore, during a power-off operation, do not press and hold the power button for more than 2 seconds.

Power LED

The green power LED indicates the power status of the system. The power LED has the following properties:
Off: One or more of the following are true:
  • No power is present at the power supply input.
  • The power supply has failed.
  • The LED has failed.
On: The node is turned on.
Blinking: The node is turned off, but is still connected to a power source.

System-information LED

When the system-information LED is lit, a noncritical event has occurred. Check the light path diagnostics panel and the event log. Light path diagnostics are described in more detail in the light path maintenance analysis procedure (MAP).

Locator LED

The SAN Volume Controller does not use the locator LED.

Ethernet-activity LED

An Ethernet-activity LED beside each Ethernet port indicates that the SAN Volume Controller node is communicating on the Ethernet network that is connected to the Ethernet port.

The operator-information panel LEDs refer to the Ethernet ports that are mounted on the system board. If you install the 10 Gbps Ethernet card on a SAN Volume Controller 2145-CG8, the port activity is not reflected on the activity LEDs.

Node rear-panel indicators and connectors

The rear-panel indicators for the node are on the back-panel assembly. The external connectors are on the node and the power supply assembly.

SAN Volume Controller 2145-SV1 rear-panel indicators

The rear-panel indicators consist of LEDs that indicate the status of the Fibre Channel ports, Ethernet connection and activity, power, and electrical current.

Figure 16 on page 28 shows the rear-panel indicators on the SAN Volume Controller 2145-SV1 back-panel assembly.

Figure 16. SAN Volume Controller 2145-SV1 rear-panel indicators
1 AC, DC, and power-supply fault LEDs
2 Identify button and LED
3 Ethernet-link LED
4 Ethernet-activity LED

SAN Volume Controller 2145-DH8 rear-panel indicators

The rear-panel indicators consist of LEDs that indicate the status of the Fibre Channel ports, Ethernet connection and activity, power, electrical current, and system-board errors.

Figure 17 shows the rear-panel indicators on the SAN Volume Controller 2145-DH8 back-panel assembly.

Figure 17. SAN Volume Controller 2145-DH8 rear-panel indicators
1 Ethernet-link LED
2 Ethernet-activity LED
3 Power, location, and system-error LEDs
4 AC, DC, and power-supply error LEDs

SAN Volume Controller 2145-SV1 connectors

The SAN Volume Controller 2145-SV1 includes multiple external connectors for data, video, and power.

Figure 18 on page 29 shows the external connectors on the SAN Volume Controller 2145-SV1 back panel assembly.

Figure 18. Connectors on the rear of the SAN Volume Controller 2145-SV1
1 Power supply 1
2 Power supply 2
3 Video port
4 Serial port (not used)
5 Rear USB port 1
6 Rear USB port 2
7 Unused Ethernet port
8 10 Gbps Ethernet port Gbps Ethernet port Gbps Ethernet port 3
11 Technician port (Ethernet)

Figure 19 shows the type of connector that is on each power-supply assembly.

Figure 19. Power connector
1 Neutral
2 Ground
3 Live

Note: Optional host interface adapters provide extra connectors for 10 Gbps Ethernet, Fibre Channel, or SAS.

SAN Volume Controller 2145-SV1 ports used during service procedures:

The SAN Volume Controller 2145-SV1 contains a number of ports that are used during service procedures.

The following figure shows ports that are used during service procedures.

Figure 20. SAN Volume Controller 2145-SV1 service ports
1 VGA port
2 Rear USB port 1
3 Rear USB port 2
4 Technician port (Ethernet)

Any of these ports other than the Technician port can be used during normal operation. Connect a device to the Technician port only when you are directed to do so by a service procedure or by your IBM service representative.

SAN Volume Controller 2145-SV1 unused ports:

The SAN Volume Controller 2145-SV1 includes one Ethernet port and one serial port that are not used.

The following figure shows the Ethernet port that is not used during service procedures or normal operation. This port is disabled in software to make the port inactive.

Figure 21. SAN Volume Controller 2145-SV1 unused Ethernet port
1 Unused Ethernet port

Although not disabled, the serial port is also not used in normal operation.

SAN Volume Controller 2145-SV1 Fibre Channel and Ethernet port numbers:

Fibre Channel port numbers for the SAN Volume Controller 2145-SV1 vary, depending on how many host interface adapters are installed, and in which slots. Port numbers also depend on the configuration of the 10 Gbps Optical Ethernet adapter.

Figure 22 on page 31 shows a typical configuration of the SAN Volume Controller 2145-SV1 with the following adapters installed:

45

Table 10. The PCIe expansion slots in which an adapter can be used

PCIe expansion slot number | Adapter
1 | Not used
2 | 12 Gbps SAS adapter
3 | 16 Gbps Fibre Channel adapter or 10 Gbps Ethernet adapter*
4 | 16 Gbps Fibre Channel adapter or 10 Gbps Ethernet adapter
5 | Compression Accelerator
6 | 16 Gbps Fibre Channel adapter or 10 Gbps Ethernet adapter
7 | 16 Gbps Fibre Channel adapter or 10 Gbps Ethernet adapter
8 | Compression Accelerator

* Slots 3, 4, 6, and 7 can contain a 16 Gbps Fibre Channel or a 10 Gbps Ethernet adapter, but only one 10 Gbps Ethernet adapter is supported.

The following figure shows the physical Fibre Channel port numbers when the 10 Gbps Optical Ethernet adapter is configured for Fibre Channel over Ethernet (FCoE) communications.

Figure 22. Fibre Channel port numbers in a typical configuration

1-16 Fibre Channel ports 1-16

Figure 23 shows the Ethernet port numbers for the SAN Volume Controller 2145-SV1 when the 10 Gbps Optical Ethernet adapter is configured for iSCSI communications.

Figure 23. Ethernet port numbers for iSCSI communication

46

Gbps Ethernet ports
Gbps optical Ethernet ports 4-7

SAN Volume Controller 2145-DH8 connectors

The SAN Volume Controller 2145-DH8 includes multiple external connectors for data, video, and power.

Figure 24 shows the external connectors on the SAN Volume Controller 2145-DH8 back panel assembly.

Figure 24. Connectors on the rear of the SAN Volume Controller 2145-DH8

1 1 Gbps Ethernet port 1
2 1 Gbps Ethernet port 2
3 1 Gbps Ethernet port 3
4 Technician port (Ethernet)
5 Power supply 2
6 Power supply 1
7 USB 6
8 USB 5
9 USB 4
10 USB 3
11 Serial
12 Video
13 Unused Ethernet port

Figure 25 shows the type of connector that is on each power-supply assembly.

Figure 25. Power connector

1 Neutral
2 Ground

47

3 Live

Note: Optional host interface adapters provide extra connectors for 10 Gbps Ethernet, Fibre Channel, or SAS connections.

SAN Volume Controller 2145-DH8 ports used during service procedures:

The SAN Volume Controller 2145-DH8 contains a number of ports that are only used during service procedures. Figure 26 shows ports that are used only during service procedures.

Figure 26. SAN Volume Controller 2145-DH8 service ports

1 Technician port (Ethernet)
2 USB 3
3 USB 4
4 USB 5
5 USB 6

During normal operations, none of these ports are used. Connect a device to any of these ports only when you are directed to do so by a service procedure or by an IBM service representative.

SAN Volume Controller 2145-DH8 unused ports:

The SAN Volume Controller 2145-DH8 includes one port that is not used. Figure 27 shows the one port that is not used during service procedures or normal operation. This port is disabled in software to make the port inactive.

Figure 27. SAN Volume Controller 2145-DH8 unused Ethernet port

1 Unused Ethernet port

48

Fibre Channel LEDs

The Fibre Channel LEDs indicate the status of the Fibre Channel ports on the SAN Volume Controller 2145-DH8 node.

The SAN Volume Controller 2145-DH8 uses two light-emitting diodes (LEDs) per Fibre Channel port, which are arranged one above the other. The LEDs are arranged in the same order as the ports. Figure 28 shows the location of the LEDs.

Figure 28. Fibre Channel LEDs

1 Link speed LEDs
2 Link activity LEDs

The following table lists the link status values for the Fibre Channel LEDs.

Table 11. Link status values for Fibre Channel LEDs

Top LED (link speed) | Bottom LED (link activity; flashing indicates I/O activity) | Link status
Off | Off | Inactive
Off | On / Flashing | Active 2 Gbps
Blinking | On / Flashing | Active 4 Gbps
On | On / Flashing | Active 8 Gbps

Note: To accommodate the different Fibre Channel speed ranges, LEDs are effectively OFF=slow, FLASHING=medium, and ON=fast.

Ethernet activity LED

The Ethernet activity LED indicates that the node is communicating with the Ethernet network that is connected to the Ethernet port.

There is a set of LEDs for each Ethernet connector. The top LED is the Ethernet link LED. When it is lit, it indicates that there is an active connection on the Ethernet port. The bottom LED is the Ethernet activity LED. When it flashes, it indicates that data is being transmitted or received between the server and a network device.
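The rows of Table 11 can be read as a small lookup: the top LED selects the link speed and the bottom LED says whether the link is active. The following is only an illustrative sketch of that reading. The names `Led`, `SPEED_GBPS`, and `link_status` are hypothetical and are not part of any IBM tool, and combinations that Table 11 does not list are reported as such rather than guessed.

```python
from enum import Enum


class Led(Enum):
    """Observed state of one LED (hypothetical names for this sketch)."""
    OFF = "off"
    BLINKING = "blinking"  # Table 11 also calls this "flashing"
    ON = "on"


# Top (link speed) LED state -> link speed in Gbps, as listed in Table 11.
SPEED_GBPS = {Led.OFF: 2, Led.BLINKING: 4, Led.ON: 8}


def link_status(top: Led, bottom: Led) -> str:
    """Return the Table 11 link status for a top/bottom LED pair."""
    if bottom is Led.OFF:
        if top is Led.OFF:
            return "Inactive"
        # Table 11 has no row with the bottom LED off and the top LED lit.
        return "Not listed in Table 11"
    status = f"Active {SPEED_GBPS[top]} Gbps"
    if bottom is Led.BLINKING:
        # A flashing bottom LED indicates I/O activity.
        status += " (I/O activity)"
    return status


if __name__ == "__main__":
    for top in Led:
        for bottom in Led:
            print(f"top={top.value:8} bottom={bottom.value:8} -> "
                  f"{link_status(top, bottom)}")
```

Running the module prints all nine LED combinations, which can be compared row by row against Table 11 when reading the LEDs on a node.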

49

Ethernet link LED

The Ethernet link LED indicates that there is an active connection on the Ethernet port.

There is a set of LEDs for each Ethernet connector. The top LED is the Ethernet link LED. When it is lit, it indicates that there is an active connection on the Ethernet port. The bottom LED is the Ethernet activity LED. When it flashes, it indicates that data is being transmitted or received between the server and a network device.

Power, location, and system-error LEDs

The power, location, and system-error LEDs are housed on the rear of the SAN Volume Controller. These three LEDs are duplicates of the same LEDs that are shown on the front of the node.

The following terms describe the power, location, and system-error LEDs:

Power LED
This is the top of the three LEDs and indicates the following states:

Off
One or more of the following are true:
- No power is present at the power supply input
- The power supply has failed
- The LED has failed

On
The SAN Volume Controller is powered on.

Blinking
The SAN Volume Controller is turned off but is still connected to a power source.

Location LED
This is the middle of the three LEDs and is not used by the SAN Volume Controller.

System-error LED
This is the bottom of the three LEDs and indicates that a system board error has occurred. The light path diagnostics provide more information.

AC and DC LEDs

The AC and DC LEDs indicate whether the node is receiving electrical current.

AC LED
The upper LED indicates that AC current is present on the node.

DC LED
The lower LED indicates that DC current is present on the node.

AC, DC, and power-supply error LEDs:

The AC, DC, and power-supply error LEDs indicate whether the node is receiving electrical current. Figure 29 on page 36 shows the location of the SAN Volume Controller 2145-DH8 AC, DC, and power-supply error LEDs.

50

Figure 29. SAN Volume Controller 2145-DH8 AC, DC, and power-error LEDs

Each of the two power supplies has its own set of LEDs:
- Indicates that AC current is present on the node.
- Indicates that DC current is present on the node.
- Indicates a problem with the power supply.

Fibre Channel port numbers and worldwide port names

Fibre Channel (FC) ports are identified by their physical port number and by a worldwide port name (WWPN).

The physical port numbers identify Fibre Channel adapters and cable connections when you run service tasks. Worldwide port names (WWPNs), which uniquely identify the devices on the SAN, are used for tasks such as Fibre Channel switch configuration. The WWPNs are derived from the worldwide node name (WWNN) of the node in which the ports are installed.

Requirements for the SAN Volume Controller environment

Certain specifications for the physical site of the SAN Volume Controller must be met before the IBM representative can set up your SAN Volume Controller environment.

SAN Volume Controller 2145-SV1 environment requirements

Before the SAN Volume Controller 2145-SV1 is installed, the physical environment must meet certain requirements. This includes verifying that adequate space is available and that requirements for power and environmental conditions are met.

Input-voltage requirements

Ensure that your environment meets the voltage requirements that are shown in Table 12.

Table 12. Input-voltage requirements

Voltage | Frequency
/ V ac | 50 Hz or 60 Hz

51

Maximum power requirements for each node

Ensure that your environment meets the power requirements as shown in Table 13.

The maximum power that is required depends on the node type and the optional features that are installed.

Table 13. Power consumption

Components | Power requirements
SAN Volume Controller 2145-SV1 | ~450 W typical, 700 W maximum ( V ac, 50/60 Hz)

Environment requirements without redundant AC power

If you are not using redundant AC power, ensure that your environment falls within the ranges that are shown in Table 14.

Table 14. Physical specifications

Environment | Temperature | Altitude | Relative humidity | Maximum dew point
Operating in lower altitudes | 5°C to 40°C (41°F to 104°F) | 0 m to 950 m (0 ft to 3,117 ft) | 8% to 85% | 24°C (75°F)
Operating in higher altitudes | 5°C to 28°C (41°F to 82°F) | 951 m to 3,050 m (3,118 ft to 10,000 ft) | 8% to 85% | 24°C (75°F)
Turned off (with standby power) | 5°C to 45°C (41°F to 113°F) | 0 m to 3,050 m (0 ft to 10,000 ft) | 8% to 85% | 27°C (80.6°F)
Storing | 1°C to 60°C (33.8°F to 140°F) | 0 m to 3,050 m (0 ft to 10,000 ft) | 5% to 80% | 29°C (84.2°F)
Shipping | -40°C to 60°C (-40°F to 140°F) | 0 m to 10,700 m (0 ft to 34,991 ft) | 5% to 100% | 29°C (84.2°F)

Note: Decrease the maximum system temperature by 1°C for every 175 m increase in altitude.

Preparing your environment

The following tables list the physical characteristics of a SAN Volume Controller 2145-SV1 node.

Dimensions and weight

Use the parameters that are shown in Table 15 on page 38 to ensure that space is available in a rack capable of supporting the node.

52

Table 15. Dimensions and weight

Height | Width | Depth | Maximum weight
87 mm (3.4 in.) | 447 mm (17.6 in) | 746 mm (30.1 in) | 25 kg (55 lb) to 30 kg (65 lb) depending on configuration

Additional space requirements

Ensure that space is available in the rack for the additional space requirements around the node, as shown in Table 16.

Table 16. Additional space requirements

Location | Additional space requirements | Reason
Left side and right side | Minimum: 50 mm (2 in.) | Cooling air flow
Back | Minimum: 100 mm (4 in.). If the cable-management arm is used, allow 177 mm (7 in.) | Cable exit

Maximum heat output of each SAN Volume Controller 2145-SV1 node

The node dissipates the maximum heat output that is given in Table 17.

Table 17. Maximum heat output of each SAN Volume Controller 2145-SV1 node

Model | Heat output per node
SAN Volume Controller 2145-SV1 | Minimum configuration: Btu per hour (AC 123 watts). Maximum configuration: Btu per hour (AC 1020 watts)

SAN Volume Controller 2145-DH8 environment requirements

Before the SAN Volume Controller 2145-DH8 is installed, the physical environment must meet certain requirements. This includes verifying that adequate space is available and that requirements for power and environmental conditions are met.

Input-voltage requirements

Ensure that your environment meets the voltage requirements that are shown in Table 18.

Table 18. Input-voltage requirements

Voltage | Frequency
/ V ac | 50 Hz or 60 Hz

Maximum power requirements for each node

Ensure that your environment meets the power requirements as shown in Table 19.

53

The maximum power that is required depends on the node type and the optional features that are installed.

Table 19. Power consumption

Components | Power requirements
SAN Volume Controller 2145-DH8 | 200 W typical, 750 W maximum ( V ac, 50/60 Hz)

Note: You cannot mix ac and dc power sources; the power sources must match.

Environment requirements without redundant AC power

If you are not using redundant AC power, ensure that your environment falls within the ranges that are shown in Table 20.

Table 20. Physical specifications

Environment | Temperature | Altitude | Relative humidity | Maximum dew point
Operating in lower altitudes | 5°C to 40°C (41°F to 104°F) | 0 m to 950 m (0 ft to 3,117 ft) | 8% to 85% | 24°C (75°F)
Operating in higher altitudes | 5°C to 28°C (41°F to 82°F) | 951 m to 3,050 m (3,118 ft to 10,000 ft) | 8% to 85% | 24°C (75°F)
Turned off (with standby power) | 5°C to 45°C (41°F to 113°F) | 0 m to 3,050 m (0 ft to 10,000 ft) | 8% to 85% | 27°C (80.6°F)
Storing | 1°C to 60°C (33.8°F to 140°F) | 0 m to 3,050 m (0 ft to 10,000 ft) | 5% to 80% | 29°C (84.2°F)
Shipping | -40°C to 60°C (-40°F to 140°F) | 0 m to 10,700 m (0 ft to 34,991 ft) | 5% to 100% | 29°C (84.2°F)

Note: Decrease the maximum system temperature by 1°C for every 175 m increase in altitude.

Preparing your environment

The following tables list the physical characteristics of the 2145-DH8 node.

Dimensions and weight

Use the parameters that are shown in Table 21 to ensure that space is available in a rack capable of supporting the node.

Table 21. Dimensions and weight

Height | Width | Depth | Maximum weight
86 mm (3.4 in.) | 445 mm (17.5 in) | 746 mm (29.4 in) | 25 kg (55 lb) to 30 kg (65 lb) depending on configuration
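The altitude note under Table 20 (the same note appears under Table 14) can be turned into a quick site check. The sketch below assumes that the derating is linear and starts at the top of the lower-altitude band (950 m). That assumption is an interpretation, not something the guide states, although it does reproduce the 28°C limit that the table gives for 3,050 m. The function and constant names are hypothetical.

```python
# Values taken from the operating rows of Table 20; the linear model that
# connects them is an assumption made for this sketch.
LOW_BAND_TOP_M = 950.0            # top of the "lower altitudes" band
MAX_OPERATING_ALTITUDE_M = 3050.0
MAX_TEMP_LOW_BAND_C = 40.0        # maximum operating temperature up to 950 m
METRES_PER_DEGREE = 175.0         # 1 degree C of derating per 175 m


def max_operating_temp_c(altitude_m: float) -> float:
    """Estimate the maximum operating temperature at a given altitude."""
    if not 0.0 <= altitude_m <= MAX_OPERATING_ALTITUDE_M:
        raise ValueError("altitude is outside the operating range in the table")
    # Only the height above the lower-altitude band is derated.
    excess_m = max(0.0, altitude_m - LOW_BAND_TOP_M)
    return MAX_TEMP_LOW_BAND_C - excess_m / METRES_PER_DEGREE


if __name__ == "__main__":
    for altitude in (0, 950, 1825, 3050):
        print(f"{altitude:>5} m -> {max_operating_temp_c(altitude):.1f} C")
```

A machine room at 1,825 m, for example, sits 875 m above the lower band, so under this model the estimate is 5°C below the 40°C limit. Treat the result as a planning aid and confirm against the published table.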

54

Additional space requirements

Ensure that space is available in the rack for the additional space requirements around the node, as shown in Table 22.

Table 22. Additional space requirements

Location | Additional space requirements | Reason
Left side and right side | Minimum: 50 mm (2 in.) | Cooling air flow
Back | Minimum: 100 mm (4 in.) | Cable exit

Maximum heat output of each 2145-DH8 node

The node dissipates the maximum heat output that is given in Table 23.

Table 23. Maximum heat output of each 2145-DH8 node

Model | Heat output per node
2145-DH8 | Minimum configuration: Btu per hour (AC 123 watts). Maximum configuration: Btu per hour (AC 1020 watts)

Parts listing

Part numbers are available for the different parts and field-replaceable units (FRUs) of the nodes, expansion enclosures, the redundant AC-power switch, and the uninterruptible power-supply unit.

The system supports several different types of models. A label on the front of the node indicates the node type, hardware revision (if appropriate), and serial number.

SAN Volume Controller 2145-SV1 parts

The only replaceable SAN Volume Controller 2145-SV1 parts are the field-replaceable units (FRUs), which are replaced by service support representatives (SSRs). There are no customer replaceable parts (CRUs).

For more information about the terms of the warranty and getting service and assistance, see the Warranty and Support Information document.

SAN Volume Controller 2145-SV1 replaceable units

The following tables identify part numbers and provide brief descriptions of the SAN Volume Controller 2145-SV1 parts.

Table 24. FRUs in the SAN Volume Controller 2145-SV1 parts assembly

FRU part number | Quantity | Description
01EJ624 | 2 | Battery
00RY | | volt CMOS battery
01AF423 | 6 | Drive slot filler

55

Table 24. FRUs in the SAN Volume Controller 2145-SV1 parts assembly (continued)

FRU part number | Quantity | Description
01EJ360 | 2 | Intel E5-2667v4 8c 3.2 GHz 135W microprocessor
01EJ361 | 4, 8, 12, or | GB DDR4 DIMM
01EJ | | GB SATA flash drive assembly
01EJ362 | 1 | Battery backplane power cable
01EJ363 | 1 | Battery backplane power sense cable
01EJ364 | 1 | Battery backplane LPC cable
01EJ365 | 1 set | Slide rails
01EJ366 | 1 | Cable management arm (CMA)
01EJ367 | 1 | Chassis metalwork kit (the enclosure without all the other FRUs)
01EJ368 | 1 | SV1 operator information panel
01EJ369 | 1 | Front left ear assembly
01EJ370 | 1 | Front right ear assembly
01EJ372 | 1 | Operator information panel USB cable
01EJ373 | 1 | Operator information panel LED and power button cable
01EJ374 | 1 | SATA drive backplane
01EJ375 | 1 | SATA drive backplane power cable
01EJ376 | 2 | SATA drive backplane SATA cable
01EJ377 | 2 | AC power supply unit
01EJ378 | 6 | Fan module
01EJ379 | 1 | Fan cage assembly
01EJ380 | 1 | Trusted Platform Module (TPM)
01EJ381 | 1 | Main board with tray
01EJ382 | 1 | Microprocessor heat sink
01EJ | | slot PCIe riser assembly
01EJ | | slot PCIe riser assembly
01EJ | | port Ethernet edge board
01EJ387 | 1 | Top cover, front
01EJ389 | 1 | Top cover, back
01LJ163 | 1 | Battery backplane

56

Table 24. FRUs in the SAN Volume Controller 2145-SV1 parts assembly (continued)

FRU part number | Quantity | Description
00WY | | port 16 Gbps Fibre Channel adapter
00AR319 | 0 or 1 | 4 port 10 Gbps optical Ethernet adapter
01AC573 | 0 or 1 | 12 Gbps SAS adapter
00RY | | Gbps long-wave SFP
31P | | Gbps short-wave SFP
00RY | | Gbps short-wave SFP
01EJ | | Compression accelerator
39M | | m-fiber cable
39M | | m-fiber cable
41V | | m OM3 fiber cable
39M | or 2 | Power cord, Argentina
39M | or 2 | Power cord, Chicago
39M | or 2 | Power cord, US/group 1
39M | or 2 | Power cord, Australia/NZ
39M | or 2 | Power cord, Europe/Africa
39M | or 2 | Power cord, Denmark
39M | or 2 | Power cord, South Africa
39M | or 2 | Power cord, EMEA
39M | or 2 | Power cord, Switzerland
39M | or 2 | Power cord, Chile/Italy
39M | or 2 | Power cord, Israel
39M | or 2 | Power cord, Japan
39M | or 2 | Power cord, China
39M | or 2 | Power cord, Korea
39M | or 2 | Power cord, India
39M | or 2 | Power cord, Brazil
39M | or 2 | Power cord, Taiwan
39M | or 2 | Power cord, PDU connection
41Y | | Thermal grease
59P | | Alcohol wipes

SAN Volume Controller 2145-DH8 parts

The only replaceable SAN Volume Controller 2145-DH8 parts are the field-replaceable units (FRUs), which are replaced by IBM Service Support Representatives (SSRs). No customer replaceable parts (CRUs) are available.

For information about the terms of the warranty and getting service and assistance, see the Warranty and Support Information document.

57

Figure 30. SAN Volume Controller 2145-DH8 replaceable parts in an exploded view diagram

SAN Volume Controller 2145-DH8 replaceable units

The following tables identify part numbers and provide brief descriptions of the SAN Volume Controller 2145-DH8 parts. Use the assembly index number to locate and identify the parts that are shown in Figure 30.

- Table 25 on page 44 calls out the FRUs that are referred to in service procedures.

58

- Table 26 on page 46 calls out the FRUs that are not referred to by any SAN Volume Controller 2145-DH8 service procedure, but that might be replaced in some circumstances.
- Table 27 on page 47 calls out the FRU parts that are required by the long-wave small form-factor pluggable (SFP) transceiver feature.

Table 25. FRUs in the SAN Volume Controller 2145-DH8 parts assembly

Figure index, FRU part number, quantity, description

1 94Y Top cover assembly.

2 94Y PCI Express riser card assembly. Each expansion slot might contain one of the optional adapters. There must be at least one Fibre Channel (FC) or one 10 gigabits-per-second (Gbps) Ethernet adapter in riser card assembly

P Gbps SAS adapter (optional). This adapter connects the SAN Volume Controller 2145-DH8 to the SAN Volume Controller F expansion enclosure. It is installed into PCI express expansion slot

P A 4-port 8 Gbps FC adapter (optional). Important: If the system is using alternative SFPs, replace the SFPs on the FRU part with the SFPs from the FC adapter that is being replaced.

31P Gbps shortwave small form-factor pluggable (SFP) transceiver. This SFP transceiver provides an auto-negotiating 2, 4, or 8 Gbps shortwave optical connection on an 8 Gbps FC adapter. Important: It is possible that SFPs other than those that are shipped with the product are in use on the FC host bus adapter. It is a customer responsibility to obtain replacement parts for such SFPs. The FRU part number is shown as "Non-standard - supplied by customer" in the vital product data.

00RY port 16 Gbps FC host bus adapter (optional). Important: If the system is using alternative SFPs, replace the SFPs on the FRU part with the SFPs from the FC adapter that is being replaced.

00WY port 16 Gbps FC adapter (optional). Important:
- If the system is using alternative SFPs, replace the SFPs on the FRU part with the SFPs from the FC adapter that is being replaced.
- Before you add this adapter, ensure that the system is running software version 7.6 or later.

00RY Gbps shortwave small form-factor pluggable (SFP) transceiver. This SFP transceiver provides an auto-negotiating 2, 4, 8 or 16 Gbps shortwave optical connection on a 16 Gbps FC adapter. Important: It is possible that SFPs other than those that are shipped with the product are in use on the FC adapter. It is the customer responsibility to obtain replacement parts for such SFPs. The FRU part number is shown as "Non-standard - supplied by customer" in the vital product data.

00AR Gbps Ethernet adapter (optional). This includes a 10 Gbps Ethernet adapter that provides connectivity for up to four 10 Gbps fiber optic Ethernet cables. These cables are used for Fibre Channel over Ethernet (FCoE) and for iSCSI communications.

31P Gbps shortwave small form-factor pluggable (SFP) transceiver.

59

Table 25. FRUs in the SAN Volume Controller 2145-DH8 parts assembly (continued)

Figure index, FRU part number, quantity, description

00AR Compression accelerator (optional). This option accelerates I/O between nodes and compressed volumes. The second microprocessor and eight memory modules must be installed. The compression accelerator can be installed only in PCI expansion slots 4 and

5 94Y Heat sink. W heat sink for the microprocessor. When you replace this part, you need alcohol wipes and thermal grease.

6 00Y Microprocessor. Intel Xeon E5-2650V2, 2.60 GHz, 8 core, 20 MB cache, 95 W. Important: This part is the microprocessor only. When replaced, you must also have alcohol wipes and thermal grease.

7 94Y Heat sink retention module.

8 00D Memory module. 8 GB, single-rank, 1.5 V, DDR3, 1866 MHz, RDIMM. Four memory modules are installed if there is one microprocessor. Eight memory modules are installed if two microprocessors are available.

9 00AM209 1 System board. Important: This part is also called the planar, and is the system board only. When you replace this part, you must use the microprocessor, DIMMs, and CMOS battery from the system board that you are replacing.

33F CMOS battery. V. This part maintains the system BIOS settings.

Y8114 or 94Y 2 Power supply unit. V AC. Two power units are shown in Figure 30 on page 43.

Y Safety cover.

AM393 1 Operator-information panel. This assembly includes the information panel that contains the power-control button and diagnostic LEDs.

90Y Operator-information panel cable.

KA089 1 DVD bay EMC shield.

AR186 1 Tape bay EMC shield.

T Drive-slot blank EMC filler assembly.

WY584 1 Bezel with node LEDs.

00NV626 1 Bezel overlay. This part fits over the bezel.

17 01EJ624 2 Battery. The batteries provide temporary power to save the write cache and node status to disk if main power is lost. Two batteries are shown in Figure 30 on page 43.

60

Table 25. FRUs in the SAN Volume Controller 2145-DH8 parts assembly (continued)

Figure index, FRU part number, quantity, description

18 90Y Boot disk drive. 300 GB, SAS, 2.5 inches.

RY001 1 Battery backplane. This part manages the batteries and switches the node to battery power if main power is lost.

81Y SAS signal cable. 820 mm, SAS. Connects the disk drive backplane to the system board.

81Y Disk drive backplane configuration cable.

W Disk drive backplane. Hot-swappable, SAS, 2.5 inches.

00FK347 1 Disk and battery backplane power and emergency power off warning (EPOW) cable. The EPOW cable is a Y cable; one end connects to the system board and the other two ends connect to the disk drive backplane and the battery backplane.

00AR497 1 Battery backplane power cable. Supplied with dummy DIMMs.

00RY335 1 Battery backplane voltage sense cable.

00AR499 1 Battery backplane low-pin count (LPC) cable.

00AR496 1 Battery backplane LPC cable converter with clip. This connects the battery backplane LPC cable to the system board.

AM212 1 Fan cage.

Y Fan assembly. This part is used in each of the 4 fan positions. Four assemblies are shown in Figure 30 on page 43.

Y Fan blank. This part is used in place of fan 4 when only one microprocessor is installed.

Y Airflow baffle.

SAN Volume Controller 2145-DH8 cable replaceable units

Table 26. FRUs to which SAN Volume Controller 2145-DH8 service procedures do not refer

Description | FRU part number
Microprocessor installation tool | 94Y9955
Thermal grease | 41Y9292
Alcohol wipes | 59P4739
Support rails | 94Y6719
Cable management arm assembly (2U) | 90Y6464
VGA cable | 81Y6775
USB cable | 81Y

61

Table 26. FRUs to which SAN Volume Controller 2145-DH8 service procedures do not refer (continued)

Description | FRU part number
USB module | 94Y6629
Power paddle card | 69Y5787
Miscellaneous parts kit | 94Y6746
EIA set kit | 49Y5356
Bezel screws | 00D
m FC cable | 39M
m FC cable | 39M5701
Ethernet Cat 5E cable | 46X
m jumper cable | 39M5376

SAN Volume Controller 2145-DH8 SFP replaceable units

Table 27. FRU parts for the long-wave small form-factor pluggable (SFP) transceiver feature

Description | FRU part number | Feature code
8 Gbps long-wave SFP transceiver. Important: It is possible that SFP transceivers other than those shipped with the product are in use on the FC host bus adapter. It is a customer responsibility to obtain replacement parts for the SFP transceiver. The FRU part number is shown as "Non standard - supplied by customer" in the vital product data. | 31P | AH1T
16 Gbps long-wave SFP transceiver (pack of 2). Important: It is possible that SFP transceivers other than those shipped with the product are in use on the FC host bus adapter. It is the customer responsibility to obtain replacement parts for the SFP transceiver. The FRU part number is shown as "Non standard - supplied by customer" in the vital product data. | RY191 | ACHU

SAN Volume Controller F expansion enclosure parts

On the F expansion enclosure, all replaceable parts are field-replaceable units (FRUs). FRUs are replaced by your IBM service support representatives (SSRs). The expansion enclosure does not have any customer replaceable parts (CRUs).

Note: All of the information that is listed in the following tables for the F expansion enclosure is also applicable to the F expansion enclosure.

62

Expansion enclosure drives

Table 28 summarizes the types of SAS drives that are supported by the F expansion enclosure on SAN Volume Controller 2145-DH8 and SAN Volume Controller 2145-SV1 systems.

Table 28. Supported expansion enclosure SAS drives

FRU description | FRU part number | Feature code
600 GB 15 K hard disk drive | 01LJ061 | AH
TB 10 K hard disk drive | 01LJ062 | AH
TB 10 K hard disk drive | 01LJ063 | AH74
6 TB 7.2 K Near-Line SAS hard disk drive | 01LJ064 | AH77
8 TB 7.2 K Near-Line SAS hard disk drive | 01LJ065 | AH78
10 TB 7.2 K Near-Line SAS hard disk drive | 01LJ066 | AH
TB tier 0 flash drive | 01LJ073 | AH7D
3.2 TB tier 0 flash drive | 01LJ074 | AH7E
1.92 TB tier 1 flash drive | 01LJ075 | AH7J
3.84 TB tier 1 flash drive | 01LJ076 | AH7K
7.68 TB tier 1 flash drive | 01LJ077 | AH7L
TB tier 1 flash drive | 01LJ078 | AH7M

Other expansion enclosure parts

Table 29 summarizes the part numbers and feature codes for other parts. The values are the same for all SAN Volume Controller systems that support the F expansion enclosure.

Table 29. Other expansion enclosure parts

FRU description | FRU part number | Feature code | Comments
3 m 12 Gb SAS cable (mSAS HD) | 00AR317 | ACUC |
6 m 12 Gb SAS cable (mSAS HD) | 00AR439 | ACUD |
16A power cord C19 / C20 2 m | 39M5388 | AHP5 |
Enclosure | 01LJ607 (Note: Replaces enclosure FRU P/N 01LJ112) | | Includes the drive board, signal interconnect board, and internal power cables, in an otherwise empty enclosure.
Rail kit | 01LJ114 | |
Front fascia (4U front cover) | 01LJ116 | |
Display panel assembly | 01LJ118 | |
PSU fascia (1U cover) | 01LJ120 | | The fascia must be removed to access the power supply units.
Power supply unit (PSU) | 01LJ122 | | The expansion enclosure contains 2 PSUs. Each PSU requires a C19 / C20 power cord.

63

Table 29. Other expansion enclosure parts (continued)

FRU description | FRU part number | Feature code | Comments
Secondary expansion module | 01LJ124 (for use with enclosure FRU P/N 01LJ112); 01LJ860 (for use with enclosure FRU P/N 01LJ607) | | The expansion enclosure supports 2 secondary expansion modules. CAUTION: Use caution when you are removing or replacing a secondary expansion module from an enclosure with FRU part number 01LJ112. Avoid contact with the connectors on the main board.
Fan module | 01LJ126 | | The expansion enclosure contains 4 fan modules.
Expansion canister | 01LJ128 | |
Cable management arms (CMA) | 01LJ130 | | The FRU contains the upper and lower CMA.
Top cover | 01LJ132 | |
Fan interface board | 01LJ134 | |

SAN Volume Controller F expansion enclosure parts

The only replaceable SAN Volume Controller parts are the field-replaceable units (FRUs), which are replaced by your IBM service support representatives (SSRs). There are no customer replaceable parts (CRUs).

For information about the terms of the warranty and getting service and assistance, refer to your product warranty and support information.

Table 30. Expansion enclosure field replaceable units

Part number | Part name | Notes
01AC555 | Expansion enclosure drive bay with midplane assembly, 12-slot, 3.5-inch | Excludes drives, drive blanks, canisters, bezel covers, PSUs.
01AC579 | Expansion canister | N/A
01AC404 | Expansion enclosure power supply unit | N/A
42R | Drive blank, 3.5-inch form factor | N/A
Y | Expansion enclosure left bezel | No MTM/Serial number label on the FRU.
Y2436 | Enclosure right bezel, 3.5-inch form factor | N/A
00RY309 | Expansion enclosure rail kit | N/A

Table 31. Drive field replaceable units

Part number | Part name | Notes
00AR322 | 4 TB Near Line SAS hard disk drive | N/A

64

Table 31. Drive field replaceable units (continued)

Part number | Part name | Notes
00RX911 | 6 TB NearLine SAS hard disk drive | N/A
00WK782 | 8 TB NearLine SAS hard disk drive | N/A

Table 32. Cable field replaceable SAS units

Part number | Part name | Notes
00AR311 | 1.5 m 12 Gbps SAS cable (mini SAS HD to mini SAS HD) | For connecting expansion enclosures to nodes
00AR317 | 3.0 m 12 Gbps SAS cable (mini SAS HD to mini SAS HD) | For connecting expansion enclosures to nodes
00AR439 | 6.0 m 12 Gbps SAS cable (mini SAS HD to mini SAS HD) | For connecting expansion enclosures to nodes

Table 33. Cable field replaceable power units

Part number | Part name | Notes
39M5068 | Argentina 2.8 m | N/A
39M5199 | Japan 2.8 m | N/A
39M5123 | Europe 2.8 m | N/A
39M5165 | Italy 2.8 m | N/A
39M5102 | Australia/New Zealand 2.8 m | N/A
39M5130 | Denmark 2.8 m | N/A
39M5144 | South Africa 2.8 m | N/A
39M5151 | United Kingdom 2.8 m | N/A
39M5158 | Switzerland 2.8 m | N/A
39M5172 | Israel 2.8 m | N/A
39M5206 | China 2.8 m | N/A
39M5219 | Korea 2.8 m | N/A
39M5226 | India 2.8 m | N/A
39M5240 | Brazil 2.8 m | N/A
39M5247 | Taiwan 2.8 m | N/A
39M5081 | United States/Canada 2.8 m | N/A
39M5377 | Power jumper cord 2.8 m |

SAN Volume Controller F expansion enclosure parts

The only replaceable SAN Volume Controller parts are the field-replaceable units (FRUs), which are replaced by your service support representatives (SSRs). There are no customer replaceable parts (CRUs).

For information about the terms of the warranty and getting service and assistance, refer to your product warranty and support information.

65

Table 34. Expansion enclosure field replaceable units

Part number | Part name | Notes
64P8445 | Expansion enclosure midplane assembly, 24-slot, 2.5-inch | Excludes drives, drive blanks, canisters, bezel covers, and PSUs.
01AC579 | Expansion canister | N/A
01AC381 | Expansion enclosure power supply unit | N/A
45W8680 | Drive blank, 2.5-inch form factor | N/A
06Y2450 | Expansion enclosure left bezel | No MTM/Serial number label on the FRU.
00Y2512 | Enclosure right bezel, 2.5-inch form factor | N/A
00RY309 | Expansion enclosure rail kit | N/A

Table 35. Small-form factor SAS drives field replaceable units

Part number | Part name
31P | GB tier 0 flash drive
31P | GB tier 0 flash drive
31P | GB tier 0 flash drive
00RX | TB tier 0 flash drive
01EJ | TB tier 0 flash drive
00AR | K RPM, 300 GB disk drive
00AR | K RPM, 600 GB disk drive
00AR | K RPM, 900 GB disk drive
00AR | K RPM, 1.2 TB disk drive
00RX | K RPM, 1.8 TB disk drive
00WK | K RPM, 2 TB near line SAS drive
01EJ | TB tier 1 flash drive
01EJ | TB tier 1 flash drive
01EJ | TB tier 1 flash drive
01EJ | TB tier 1 flash drive

Table 36. Cable field replaceable units

Part number | Part name | Notes
SAS
00AR311 | 1.5 m 12 Gbps SAS cable (mini SAS HD to mini SAS HD) | For connecting expansion enclosures to nodes
00AR317 | 3.0 m 12 Gbps SAS cable (mini SAS HD to mini SAS HD) | For connecting expansion enclosures to nodes
00AR439 | 6.0 m 12 Gbps SAS cable (mini SAS HD to mini SAS HD) | For connecting expansion enclosures to nodes
Power
39M5068 | Argentina 2.8 m | N/A
39M5081 | United States / Canada 2.8 m | N/A

66

Table 36. Cable field replaceable units (continued)

Part number | Part name | Notes
39M5102 | Australia / New Zealand 2.8 m | N/A
39M5123 | Europe 2.8 m | N/A
39M5130 | Denmark 2.8 m | N/A
39M5144 | South Africa 2.8 m | N/A
39M5151 | United Kingdom 2.8 m | N/A
39M5158 | Switzerland 2.8 m | N/A
39M5165 | Italy 2.8 m | N/A
39M5172 | Israel 2.8 m | N/A
39M5199 | Japan 2.8 m | N/A
39M5206 | China 2.8 m | N/A
39M5219 | Korea 2.8 m | N/A
39M5226 | India 2.8 m | N/A
39M5240 | Brazil 2.8 m | N/A
39M5247 | Taiwan 2.8 m | N/A
39M5377 | Power jumper cord 2.8 m | N/A

67

Chapter 3. User interfaces for servicing your system

The system provides several user interfaces to troubleshoot, recover, or maintain your system. The interfaces provide various sets of facilities to help resolve situations that you might encounter.

- Use the management GUI to monitor and maintain the configuration of storage that is associated with your clustered systems.
- Use the service assistant to complete service procedures.
- Use the command line interface (CLI) to manage your system.

The front panel on the node provides an alternative service interface.

Note: The front panel display is replaced by a technician port on some models.

Management GUI interface

The management GUI is a browser-based GUI for configuring and managing all aspects of your system. It provides extensive facilities to help troubleshoot and correct problems.

About this task

You use the management GUI to manage and service your system. The Monitoring > Events panel provides access to problems that must be fixed and maintenance procedures that step you through the process of correcting the problem.

The information on the Events panel can be filtered in the following ways:

Recommended action (default)
Shows only the alerts that require attention and have an associated fix procedure. Alerts are listed in priority order and should be fixed sequentially by using the available fix procedures. For each problem that is selected, you can:
- Run a fix procedure.
- View the properties.

Unfixed alerts
Displays only the alerts that are not fixed. For each entry that is selected, you can:
- Run a fix procedure on any alert with an error code.
- Mark an event as fixed.
- Filter the entries to show them by specific minutes, hours, or dates.
- Reset the date filter.
- View the properties.

Unfixed messages and alerts
Displays only the alerts and messages that are not fixed. For each entry that is selected, you can:
- Run a fix procedure on any alert with an error code.
- Mark an event as fixed.
- Filter the entries to show them by specific minutes, hours, or dates.
- Reset the date filter.

68 v View the properties. Show ll Displys ll event types whether they re fixed or unfixed. For ech entry tht is selected, you cn: v Run fix procedure on ny lert with n error code. v Mrk n event s fixed. v Filter the entries to show them by specific minutes, hours, or dtes. v Reset the dte filter. v View the properties. Some events require certin number of occurrences in 25 hours before they re displyed s unfixed. If they do not rech this threshold in 25 hours, they re flgged s expired. Monitoring events re below the colesce threshold nd re usully trnsient. You cn lso sort events by time or error code. When you sort by error code, the most serious events, those with the lowest numbers, re displyed first. You cn select ny event tht is listed nd select Actions > Properties to view detils bout the event. v Recommended Actions. For ech problem tht is selected, you cn: Run fix procedure. View the properties. v Event log. For ech entry tht is selected, you cn: Run fix procedure. Mrk n event s fixed. Filter the entries to show them by specific minutes, hours, or dtes. Reset the dte filter. View the properties. When to use the mngement GUI The mngement GUI is the primry tool tht is used to service your system. Regulrly monitor the sttus of the system using the mngement GUI. If you suspect problem, use the mngement GUI first to dignose nd resolve the problem. Use the views tht re vilble in the mngement GUI to verify the sttus of the system, the hrdwre devices, the physicl storge, nd the vilble volumes. The Monitoring > Events pnel provides ccess to ll problems tht exist on the system. Use the Recommended Actions filter to disply the most importnt events tht need to be resolved. If there is service error code for the lert, you cn run fix procedure tht ssists you in resolving the problem. These fix procedures nlyze the system nd provide more informtion bout the problem. They suggest ctions to tke nd step you through the ctions tht utomticlly mnge the system where necessry. 
Finally, they check that the problem is resolved.

If there is an error that is reported, always use the fix procedures within the management GUI to resolve the problem. Always use the fix procedures for both system configuration problems and hardware failures. The fix procedures analyze the system to ensure that the required changes do not cause volumes to be inaccessible to the hosts. The fix procedures automatically perform configuration changes that are required to return the system to its optimum state.

Accessing the management GUI

To view events, you must access the management GUI.

About this task

You must use a supported web browser. For a list of supported browsers, refer to the "Web browser requirements to access the management GUI" topic.

You can use the management GUI to manage your system as soon as you have created a clustered system.

Procedure
1. Start a supported web browser and point the browser to the management IP address of your system. The management IP address is set when the clustered system is created. Up to four addresses can be configured for your use. There are two addresses for IPv4 access and two addresses for IPv6 access. When the connection is successful, you will see a login panel.
2. Log on by using your user name and password.
3. When you have logged on, select Monitoring > Events.
4. Ensure that the events log is filtered using Recommended actions.
5. Select the recommended action and run the fix procedure.
6. Continue to work through the alerts in the order suggested, if possible.

Results

After all the alerts are fixed, check the status of your system to ensure that it is operating as intended.

Deleting a node from a clustered system by using the management GUI

Remove a node from a system if the node fails and is being replaced with a new node, or if a repair causes that node to be unrecognizable by the system.

Before you begin

The cache on the selected node is flushed before the node is taken offline. In some circumstances, such as when the system is already degraded (for example, when both nodes in the I/O group are online and the volumes within the I/O group are degraded), the system ensures that data loss does not occur as a result of deleting the only node with the cache data. If a failure occurs on the other node in the I/O group, the cache is flushed before the node is removed to prevent data loss.
Before you delete a node from the system, record the node serial number, worldwide node name (WWNN), all worldwide port names (WWPNs), and the I/O group that the node is part of. If the node is added to the system later, recording this node information now can avoid data corruption.
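One way to keep the recorded node details together is a small structured record that is saved to a file. The following Python sketch is only an illustration: the NodeRecord class, its field names, and the file path are hypothetical and are not part of the product.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class NodeRecord:
    """Details to record before a node is deleted (hypothetical structure)."""
    serial_number: str
    wwnn: str
    wwpns: List[str]
    io_group: str


def save_record(record: NodeRecord, path: str) -> None:
    """Write the record to a JSON file so that it can be consulted later."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(asdict(record), handle, indent=2)


def load_record(path: str) -> NodeRecord:
    """Read a previously saved record back from its JSON file."""
    with open(path, encoding="utf-8") as handle:
        return NodeRecord(**json.load(handle))
```

If the node is added back later, the saved I/O group and WWNN can be compared with the values that are about to be used.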

Attention:
- If you are removing a single node and the remaining node in the I/O group is online, the data on the remaining node goes into write-through mode. This data can be exposed to a single point of failure if the remaining node fails.
- If the volumes are already degraded before you remove a node, redundancy to the volumes is degraded. Removing a node might result in a loss of access to data and data loss.
- Removing the last node in the system destroys the system. Before you remove the last node in the system, ensure that you want to destroy the system.
- When you remove a node, you remove all redundancy from the I/O group. As a result, new or existing failures can cause I/O errors on the hosts. The following failures can occur:
  - Host configuration errors
  - Zoning errors
  - Multipathing-software configuration errors
- If you are deleting the last node in an I/O group and there are volumes that are assigned to the I/O group, you cannot remove the node from the system if the node is online. You must back up or migrate all data that you want to save before you remove the node. If the node is offline, you can remove the node.
- When you remove the configuration node, the configuration function moves to a different node within the system. This process can take a short time, typically less than a minute. The management GUI reattaches to the new configuration node transparently.
- If you turn on the power to the node that is removed and it is still connected to the same fabric or zone, it attempts to rejoin the system. The system tells the node to remove itself from the system and the node becomes a candidate for addition to this system or another system.
- If you are adding this node into the system, ensure that you add it to the same I/O group that it was previously a member of. Failure to do so can result in data corruption.

This task assumes that you access the management GUI.

About this task

Complete the following steps to remove a node from a system:

Procedure
1. Select Monitoring > System.
2. Right-click the node that you want to remove and select Remove. If the node that you want to remove is shown as Offline, then the node is not participating in the system. If the node that you want to remove is shown as Online, deleting the node can cause the dependent volumes to also go offline. Verify whether the node has any dependent volumes.
3. To check for dependent volumes before you attempt to remove the node, right-click the node and select Show Dependent Volumes. If any volumes are listed, determine why and if access to the volumes is required while the node is removed from the system. If the volumes are assigned from storage pools that contain flash drives that are located in the node, check why the volume mirror, if it is configured, is not synchronized. There can also be dependent volumes because the partner node in the I/O group is offline. Fabric issues can also prevent the volume from communicating with the storage systems. Resolve these problems before you continue with the node removal.
4. Click Remove.
5. Click Yes to remove the node. Before a node is removed, the system checks to determine whether there are any volumes that depend on that node. If the node that you selected contains volumes within the following situations, the volumes go offline and become unavailable if the node is removed:
   - The node contains flash drives and also contains the only synchronized copy of a mirrored volume.
   - The other node in the I/O group is offline.
   If you select a node to remove that has these dependencies, another panel displays confirming the removal.

Adding a node to a system

You can add a node to the system by using the CLI or management GUI. A node can be added to the system if the node previously failed and is being replaced with a new node or if a repair action causes the node to be unrecognizable by the system. When you add nodes, ensure that they are added in pairs to create a full I/O group. Adding a node to the system typically increases the capacity of the entire system. Adding spare nodes to a system does not increase the capacity of the system.

You can use either the management GUI or the command-line interface to add a node to the system. Some models might require you to use the front panel to verify that the new node was added correctly.

Before you add a node to a system, you must make sure that the switch zoning is configured such that the node that is being added is in the same zone as all other nodes in the system. If you are replacing a node and the switch is zoned by worldwide port name (WWPN) rather than by switch port, make sure that the switch is configured such that the node that is being added is in the same VSAN or zone.

Note: It is recommended that you use a consistent method (either only the management GUI, or only the CLI) when you add, remove, and re-add nodes. If a node is added by using the CLI and later re-added by using the GUI, it might get a different node name than it originally had.
Rules and restrictions for adding a node to a system

If you are using hot-spare nodes, the following considerations might not all be applicable. For more information, see the topic on adding a hot-spare node and the swapnode command.

If you are adding a node that was used previously, either within a different I/O group within this system or within a different system, and you add the node without changing its worldwide node name (WWNN), hosts might detect the node and use it as if it were in its old location. This action might cause the hosts to access the wrong volumes.

- You must ensure that the model type of the new node is supported by the software level that is installed on the system. If the model type is not supported by the software level, update the system to a software level that supports the model type of the new node.
- Each node in an I/O group must be connected to a different uninterruptible power supply.
- If you are adding a node back to the same I/O group after a service action required it to be deleted from the system, and if the physical node did not change, then no special procedures are required to add it back to the system.
- If you are replacing a node in a system either because of a node failure or an update, you must change the WWNN of the new node to match that of the original node before you connect the node to the Fibre Channel network and add the node to the system.
- If you are adding a node to the network again, to avoid data corruption, ensure that you are adding the node to the same I/O group from which it was removed. You must use the information that was recorded when the node was originally added to the system. If you do not have access to this information, contact the support center for assistance with adding the node back into the system so that data is not corrupted.
- For each external storage system, the LUNs that are presented to the ports on the new node must be the same as the LUNs that are presented to the nodes that currently exist in the system. You must ensure that the LUNs are the same before you add the new node to the system.
- If you create an I/O group in the system and add a node, no special procedures are needed because this node was never added to a system.
- If you create an I/O group in the system and add a node that was added to a system before, the host system might still be configured to the node WWPNs and the node might still be zoned in the fabric. Because you cannot change the WWNN for the node, you must ensure that other components in your fabric are configured correctly. Verify that any host that was previously configured to use the node was correctly updated.
- If the node that you are adding was previously replaced, either for a node repair or update, you might use the WWNN of that node for the replacement node. Ensure that the WWNN of this node was updated so that you do not have two nodes with the same WWNN attached to your fabric. Also, ensure that the WWNN of the node that you are adding is not 00000. If it is 00000, contact your support representative.
- The new node must be running a software level that supports encryption.
- If you are adding the new node to a system with either a HyperSwap or stretched system topology, you must assign the node to a specific site.

Rules and restrictions for using multipathing device drivers

- Applications on the host systems direct I/O operations to file systems or logical volumes that are mapped by the operating system to virtual paths (vpaths), which are pseudo disk objects that are supported by the multipathing device drivers. Multipathing device drivers maintain an association between a vpath and a volume. This association uses an identifier (UID) which is unique to the volume and is never reused. The UID allows multipathing device drivers to directly associate vpaths with volumes.
- Multipathing device drivers operate within a protocol stack that contains disk and Fibre Channel device drivers that are used to communicate with the system by using the SCSI protocol over Fibre Channel as defined by the ANSI FCS standard. The addressing scheme that is provided by these SCSI and Fibre Channel device drivers uses a combination of a SCSI logical unit number (LUN) and the worldwide node name (WWNN) for the Fibre Channel node and ports.
- If an error occurs, the error recovery procedures (ERPs) operate at various tiers in the protocol stack. Some of these ERPs cause I/O to be redriven by using the same WWNN and LUN numbers that were previously used.
- Multipathing device drivers do not check the association of the volume with the vpath on every I/O operation that they perform.

You can use either the addnode command or the Add Node wizard in the management GUI. To access the Add Node wizard, select Monitoring > System. On the image, click the new node to start the wizard. Complete the wizard and verify the new node. If the new node is not displayed in the image, it indicates a potential cabling issue. Check the installation information to ensure that your node was cabled correctly.

To add a node to a system by using the command-line interface, complete these steps:
1. Enter this command to verify that the node is detected on the network:
   svcinfo lsnodecandidate
   This example shows the output for this command:
   # svcinfo lsnodecandidate
   id panel_name UPS_serial_number UPS_unique_id hardware serial_number product_mtm machine_signature
   C007B00 KD0N8AM C007B00 DH8 KD0N8AM 2145-DH AB-CDEF
   The id parameter displays the WWNN for the node. If the node is not detected, verify cabling to the node.
2. Enter this command to determine the I/O group where the node must be added:
   lsiogrp
3. Record the name or ID of the first I/O group that has a node count of zero. You need the name or ID for the next step.
   Note: You must do this step for the first node that is added. You do not do this step for the second node of the pair because it uses the same I/O group number.
4. Enter this command to add the node to the system:
   addnode -wwnodename WWNN -iogrp iogrp_name -name new_name_arg -site site_name
   Where WWNN is the WWNN of the node, iogrp_name is the name of the I/O group that you want to add the node to, and new_name_arg is the name that you want to assign to the node. If you do not specify a new node name, a default name is assigned. Typically, you specify a meaningful node name. The site_name specifies the name of the site location of the new node. This parameter is only required if the topology is a HyperSwap or stretched system.
   Note: Adding the node might take a considerable amount of time.
5. Record this information for future reference:
   - Serial number.
   - Worldwide node name.
   - All of the worldwide port names.
   - The name or ID of the I/O group.
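Steps 3 and 4 can be scripted. The following Python sketch is illustrative only. It assumes that the lsiogrp output was already parsed into dictionaries with name and node_count keys (a hypothetical representation), and the WWNN, I/O group, and node names in the usage note are made-up values. It builds the addnode command line in the form that is shown in step 4 and does not run it.

```python
import shlex
from typing import Dict, Iterable, Optional


def first_empty_iogrp(iogrps: Iterable[Dict[str, str]]) -> Optional[str]:
    """Return the name of the first I/O group whose node count is zero."""
    for group in iogrps:
        if int(group["node_count"]) == 0:
            return group["name"]
    return None


def build_addnode_command(wwnn: str, iogrp: str,
                          name: Optional[str] = None,
                          site: Optional[str] = None) -> str:
    """Assemble the addnode command line; -name and -site are added only if given."""
    parts = ["addnode", "-wwnodename", wwnn, "-iogrp", iogrp]
    if name:
        parts += ["-name", name]
    if site:
        # Only needed for HyperSwap or stretched system topologies.
        parts += ["-site", site]
    return " ".join(shlex.quote(part) for part in parts)
```

For example, with a parsed list in which io_grp0 has two nodes and io_grp1 has none, first_empty_iogrp returns io_grp1, and build_addnode_command("500507680100ABCD", "io_grp1", name="node3") yields a command that can be pasted into a CLI session.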

Service assistant interface

The service assistant interface is a browser-based GUI that is used to service your nodes.

When to use the service assistant

The primary use of the service assistant is when a node is in service state. The node cannot be active as part of a system while it is in service state.

Attention: Complete service actions on nodes only when directed to do so by the fix procedures. If used inappropriately, the service actions that are available through the service assistant can cause loss of access to data or even data loss.

The node might be in service state because it has a hardware issue, has corrupted data, or has lost its configuration data.

Use the service assistant in the following situations:
- When you cannot access the system from the management GUI and you cannot access the system to run the recommended actions.
- When the recommended action directs you to use the service assistant.

The management GUI operates only when there is an online clustered system. Use the service assistant if you are unable to create a clustered system.

The service assistant provides detailed status and error summaries, and the ability to modify the World Wide Node Name (WWNN) for each node.

You can also complete the following service-related actions:
- Collect logs to create and download a package of files to send to support personnel.
- Remove the data for the system from a node.
- Recover a system if it fails.
- Install a software package from the support site or rescue the software from another node.
- Update software on nodes manually versus completing a standard update procedure.
- Change the service IP address that is assigned to Ethernet port 1 for the current node.
- Install a temporary SSH key if a key is not installed and CLI access is required.
- Restart the services used by the system.

Accessing the service assistant

The service assistant is a web application that helps troubleshoot and resolve problems on a node. The service assistant can be accessed through a service IP address. On SAN Volume Controller 2145-DH8, you can connect to the service assistant by using the technician port.

About this task

You must use a supported web browser. For a list of supported browsers, refer to the topic "Web browser requirements to access the management GUI".

Procedure

To start the application, complete the following steps.
1. Start a supported web browser and point your web browser to serviceaddress/service for the node that you want to work on.
2. Log on to the service assistant using the superuser password. If you do not know the current superuser password, try to find out. If you cannot find out what the password is, reset the password.

Results

Complete the service assistant actions on the correct node.

Command-line interface

Use the command-line interface (CLI) to manage a system with task commands and information commands.

For a full description of the commands and how to start an SSH command-line session, see the Command-line interface section of the SAN Volume Controller Information Center.

When to use the CLI

The system command-line interface is intended for use by advanced users who are confident at using a CLI.

Nearly all of the flexibility that is offered by the CLI is available through the management GUI. However, the CLI does not provide the fix procedures that are available in the management GUI. Therefore, use the fix procedures in the management GUI to resolve the problems. Use the CLI when you require a configuration setting that is unavailable in the management GUI.

You might also find it useful to create command scripts that use CLI commands to monitor certain conditions or to automate configuration changes that you make regularly.

Accessing the system CLI

Follow the steps that are described in the Command-line interface section to initialize and use a CLI session.

Service command-line interface

Use the service command-line interface (CLI) to manage a node using the task commands and information commands.

Note: The service command-line interface can also be accessed by using the technician port.

For a full description of the commands and how to start an SSH command-line session, see Command-line interface.

When to use the service CLI

The service CLI is intended for use by advanced users who are confident at using a command-line interface.

To access a node directly, it is normally easier to use the service assistant with its graphical interface and extensive help facilities.

Accessing the service CLI

To initialize and use a CLI session, review the Command-line interface topic of this product information.

USB flash drive interface

Use a USB flash drive to help service a node.

When a USB flash drive is inserted into one of the USB ports on a node, the software searches for a control file on the USB flash drive and runs the command that is specified in the file. When the command completes, the command results and node status information are written to the USB flash drive.

When to use the USB flash drive

The USB flash drive can be used for service functions.

Using the USB flash drive is required in the following situations:
- When you cannot connect to a node canister in a control enclosure using the service assistant and you want to see the status of the node.
- When you do not know, or cannot use, the service IP address for the node canister in the control enclosure and must set the address.
- When you have forgotten the superuser password and must reset the password.

Using a USB flash drive

Use any USB flash drive that is formatted with a FAT32 file system on its first partition.

About this task

When a USB flash drive is plugged into a node canister, the node canister code searches for a text file that is named satask.txt in the root directory. If the code finds the file, it attempts to run a command that is specified in the file. When the command completes, a file that is called satask_result.html is written to the root directory of the USB flash drive. If this file does not exist, it is created. If it exists, the data is inserted at the start of the file. The file contains the details and results of the command that was run and the status and the configuration information from the node canister. The status and configuration information matches the detail that is shown on the service assistant home page panels.

The fault light-emitting diode (LED) on the node canister flashes when the USB service action is being completed. When the fault LED stops flashing, it is safe to remove the USB flash drive.

Results

The USB flash drive can then be plugged into a workstation, and the satask_result.html file can be viewed in a web browser.

To protect from accidentally running the same command again, the satask.txt file is deleted after it is read.

If no satask.txt file is found on the USB flash drive, the result file is still created, if necessary, and the status and configuration data is written to it.

satask.txt commands

If you are creating the satask.txt command file by using a text editor, the file must contain a single command on a single line in the file.

The commands that you use are the same as the service CLI commands except where noted. Not all service CLI commands can be run from the USB flash drive. The satask.txt commands always run on the node that the USB flash drive is plugged into.

Reset service IP address and superuser password command

Use this command to obtain service assistant access to a node canister even if the current state of the node canister is unknown. The physical access to the node canister is required and is used to authenticate the action.

Syntax
satask chserviceip -serviceip ipv4 -gw ipv4 -mask ipv4 -resetpassword
satask chserviceip -serviceip_6 ipv6 -gw_6 ipv6 -prefix_6 int -resetpassword
satask chserviceip -default -resetpassword

Parameters
-serviceip ipv4
    The IPv4 address for the service assistant.
-gw ipv4
    The IPv4 gateway for the service assistant.
-mask ipv4
    The IPv4 subnet for the service assistant.
-serviceip_6 ipv6
    The IPv6 address for the service assistant.
-gw_6 ipv6
    The IPv6 gateway for the service assistant.
-prefix_6 int
    The IPv6 prefix for the service assistant.
-resetpassword
    Sets the service assistant password to the default value.
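The single-command, single-line rule for satask.txt can be enforced when the file is generated. The following Python sketch is an illustration under stated assumptions: the write_satask function is hypothetical, and usb_root stands for wherever the workstation mounts the USB flash drive. It writes one command, such as the chserviceip form shown in the syntax above, to satask.txt in the root directory of the drive.

```python
from pathlib import Path


def write_satask(usb_root: str, command: str) -> Path:
    """Write exactly one command on one line to satask.txt in the drive root."""
    command = command.strip()
    if not command:
        raise ValueError("the command must not be empty")
    if "\n" in command or "\r" in command:
        raise ValueError("satask.txt must contain a single command on a single line")
    target = Path(usb_root) / "satask.txt"
    # Overwrite any earlier file so that only this command is present.
    target.write_text(command + "\n", encoding="ascii")
    return target
```

For example, write_satask("/media/usb", "satask chserviceip -default -resetpassword") creates the control file, where /media/usb is an assumed mount point.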

Description

This command resets the service assistant IP address to the default value. If the node canister is active in a system, the superuser password for the system is reset; otherwise, the superuser password is reset on the node canister.

If the node canister becomes active in a system, the superuser password is reset to that of the system. You can configure the system to disable resetting the superuser password. If you disable that function, this action fails.

This action calls the satask chserviceip command and the satask resetpassword command.

Reset service assistant password command

Use this command when you are unable to log on to the system because you forgot the superuser password, and you wish to reset it.

Syntax
satask resetpassword

Parameters
None.

Description

This command resets the service assistant password to the default value passw0rd. If the node canister is active in a system, the superuser password for the system is reset; otherwise, the superuser password is reset on the node canister.

If the node canister becomes active in a system, the superuser password is reset to that of the system. You can configure the system to disable resetting the superuser password. If you disable that function, this action fails.

This command calls the satask resetpassword command.

satask snap

Use the satask snap command to collect diagnostic information from the node and to write the output to a USB flash drive, or to upload specified support information.

Syntax
satask snap -dump -upload -pmr pmr_number -noimm panel_name

Parameters
-dump
    (Optional) Indicates the most recent dump file in the output.
-upload
    (Optional) Specifies that the snap file be uploaded after it is generated.
-pmr pmr_number
    (Optional) Specifies the PMR number to use to upload the snap file. The format for the PMR must be a 13-character alphanumeric string. If the specified PMR is invalid or unknown, it is uploaded to a generic location on the server with the prefix: unknown_pmr_pmr_number_
    If this option is not supplied, the snap file is uploaded using the machine type and serial number attributes.
-noimm
    (Optional) Indicates the /dumps/imm.ffdc file must not be included in the output.
panel_name
    (Optional) Indicates the node on which to execute the snap command.

Description

This command moves a snap file to a USB flash drive and uploads support information.

If collected, the IMM FFDC file is present in the snap archive in /dumps/imm.ffdc.<node.dumpname>.<date>.<time>.tgz. The system waits for up to 5 minutes for the IMM to generate its FFDC. The status of the IMM FFDC is located in the snap archive in /dumps/imm.ffdc.log. These two files are not left on the node.

Specify the lsdumps command to view the file that you create.

An invocation example
satask snap
The resulting output:
No feedback

Important: The name of the output file (placed on the specified node) is snap.single.nodeid.date.time.tgz.

An invocation example
satask snap -noimm
The resulting output:
No feedback

An invocation example
satask snap -dump
The resulting output:
No feedback

Install software command

Use this command to install a specific update package on the node canister.

Syntax
satask installsoftware -file filename -ignore -pacedccu

Parameters
-file filename
    (Required) The filename designates the name of the update package.
-ignore -pacedccu
    (Optional) Overrides prerequisite checking and forces installation of the update package.

Description

This command copies the file from the USB flash drive to the update directory on the node canister, and then installs the update package.

This command calls the satask installsoftware command.

Create system command

Use this command to create a storage system.

Syntax
satask mkcluster -clusterip ipv4 -gw ipv4 -mask ipv4 -name cluster_name
satask mkcluster -clusterip_6 ipv6 -gw_6 ipv6 -prefix_6 int -name cluster_name

Parameters
-clusterip ipv4
    (Optional) The IPv4 address for Ethernet port 1 on the system.
-gw ipv4
    (Optional) The IPv4 gateway for Ethernet port 1 on the system.
-mask ipv4
    (Optional) The IPv4 subnet for Ethernet port 1 on the system.
-clusterip_6 ipv6
    (Optional) The IPv6 address for Ethernet port 1 on the system.
-gw_6 ipv6
    (Optional) The IPv6 gateway for Ethernet port 1 on the system.
-prefix_6 int
    (Optional) The IPv6 prefix for Ethernet port 1 on the system.
-name cluster_name
    (Optional) The name of the new system.
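The two mkcluster syntax forms differ only in whether the IPv4 or the IPv6 parameter names are used. The following Python sketch shows one way to pick the right form from the address itself. It is a hedged illustration: build_mkcluster is a hypothetical helper, the addresses in the usage note are documentation-range examples, and the function only assembles the command text that is described above.

```python
import ipaddress
from typing import Optional, Union


def build_mkcluster(cluster_ip: str, gateway: str,
                    mask_or_prefix: Union[str, int],
                    name: Optional[str] = None) -> str:
    """Build a satask mkcluster line, choosing IPv4 or IPv6 parameter names."""
    ip = ipaddress.ip_address(cluster_ip)   # raises ValueError if malformed
    gw = ipaddress.ip_address(gateway)
    if ip.version != gw.version:
        raise ValueError("cluster IP and gateway must use the same IP version")
    if ip.version == 4:
        mask = ipaddress.IPv4Address(str(mask_or_prefix))  # dotted subnet mask
        parts = ["satask", "mkcluster", "-clusterip", str(ip),
                 "-gw", str(gw), "-mask", str(mask)]
    else:
        prefix = int(mask_or_prefix)
        if not 0 <= prefix <= 128:
            raise ValueError("IPv6 prefix must be between 0 and 128")
        parts = ["satask", "mkcluster", "-clusterip_6", str(ip),
                 "-gw_6", str(gw), "-prefix_6", str(prefix)]
    if name:
        parts += ["-name", name]
    return " ".join(parts)
```

For example, build_mkcluster("192.0.2.10", "192.0.2.1", "255.255.255.0", "cluster1") produces the IPv4 form, and passing IPv6 addresses with an integer prefix produces the IPv6 form.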

Description

This command creates a storage system.

This command calls the satask mkcluster command.

Change system IP address

Use this command to change the system IP address of the storage system. It is best to use the initialization tool to create this command in satask.txt together with the associated clitask.txt file that changes the file modules management IP addresses.

Syntax
satask setsystemip -systemip ipv4 -gw ipv4 -mask ipv4 -consoleip ipv4

Parameters
-systemip
    The IPv4 address for Ethernet port 1 on the system.
-gw
    The IPv4 gateway for Ethernet port 1 on the system.
-mask
    The IPv4 subnet for Ethernet port 1 on the system.
-consoleip
    The management IPv4 address of the SAN Volume Controller system.

Description

This command is only supported in the satask.txt file on a USB flash drive. It calls the svctask chsystemip command if the USB flash drive is inserted in the configuration node canister. Otherwise, it flashes the amber identify LED of the node canister that is the configuration node.

If the amber identify LED for a different node canister starts to flash, move the USB flash drive over to that node canister because it is the configuration node. When the amber LED turns off, you can move the USB flash drive to one of the file modules so that it uses the clitask.txt file to change the file module management IP addresses.

Leave the USB flash drive in the file module for at least 2 minutes before you remove it. Use a workstation to check the clitask_results.txt and satask.txt results files on the USB flash drive. If the IP address change was successful, then you must run the startmgtsrv -r command to restart the management service so that it does not continue to issue commands to the old system IP address of the volume storage system.

For example, on a Linux workstation with network access to the new management IP address:

satask setsystemip -systemip -gw -mask -consoleip

You can now access the management GUI, which you can use to change any other IP address that needs to be changed.

The following text is an example of what might be in the clitask.txt file:
chnwmgt --serviceip serviceip mgtip gateway netmask force
chstoragesystem --ip

The following text is an example of what might be in the satask.txt file:
satask setsystemip -systemip -gw -mask -consoleip

Query status command

Use this command to determine the current service state of the node canister.

Syntax
sainfo getstatus

Parameters
None.

Description

This command writes the output from each node canister to the USB flash drive.

This command calls the sainfo lsservicenodes command, the sainfo lsservicestatus command, and the sainfo lsservicerecommendation command.

Technician port

The technician port is an Ethernet port on the back panel of 2145-SV1 and 2145-DH8 nodes that you use to configure the node. You can use the technician port to do most of the system configuration operations that are provided by the front panel of earlier system models, which includes the following tasks:
- Defining a management IP address.
- Initializing a new system.
- Servicing the system.

To use the technician port, plug one end of an Ethernet cable into the technician port. Then, plug the other end into the Ethernet port of a personal computer that has Dynamic Host Configuration Protocol (DHCP) configured and a web browser installed. Run the system configuration tool by going to its address with your browser. If you do not have DHCP, open a supported browser and go to the default static IP address for the node.

Note: When your personal computer is configured with DHCP, the technician port uses DHCP to reconfigure network services on your personal computer. Software on your personal computer that was using these services might experience network problems while it is connected to the technician port. For example, selecting a link in a web page that was loaded before you connect to the technician port might result in an error message.

2145-SV1 node

On the back of a 2145-SV1 node, the technician port is on the bottom right side of the node. Figure 31 shows the location of the technician port and other ports that are used to service the node.

Figure 31. 2145-SV1 technician port
1 VGA port
2 Rear USB port 1
3 Rear USB port 2
4 Technician port (Ethernet)

2145-DH8 node

Starting from the left at the rear of the SAN Volume Controller 2145-DH8 node, the technician port is the fourth Ethernet port to the right. Figure 32 shows the rear of the SAN Volume Controller node, where 1 is the technician port.

Figure 32. 2145-DH8 technician port


Chapter 4. Performing recovery actions using the SAN Volume Controller CLI

The SAN Volume Controller command-line interface (CLI) is a collection of commands that you can use to manage SAN Volume Controller clusters. See the Command-line interface documentation for the specific details about the commands provided here.

Validating and repairing mirrored volume copies by using the CLI

You can use the repairvdiskcopy command from the command-line interface (CLI) to validate and repair mirrored volume copies.

Attention: Run the repairvdiskcopy command only if all volume copies are synchronized.

When you issue the repairvdiskcopy command, you must use exactly one of the -validate, -medium, or -resync parameters. You must also specify the name or ID of the volume to be validated and repaired as the last entry on the command line. After you issue the command, no output is displayed.

-validate
Use this parameter only if you want to verify that the mirrored volume copies are identical. If any difference is found, the command stops and logs an error that includes the logical block address (LBA) and the length of the first difference. You can use this parameter, starting at a different LBA each time, to count the number of differences on a volume.

-medium
Use this parameter to convert sectors on all volume copies that contain different contents into virtual medium errors. Upon completion, the command logs an event, which indicates the number of differences that were found, the number that were converted into medium errors, and the number that were not converted. Use this option if you are unsure what the correct data is, and you do not want an incorrect version of the data to be used.

-resync
Use this parameter to overwrite contents from the specified primary volume copy to the other volume copy. The command corrects any differing sectors by copying the sectors from the primary copy to the copies that are being compared. Upon completion, the command process logs an event, which indicates the number of differences that were corrected. Use this action if you are sure that either the primary volume copy data is correct or that your host applications can handle incorrect data.

-startlba lba
Optionally, use this parameter to specify the starting logical block address (LBA) from which to start the validation and repair. If you previously used the -validate parameter, an error was logged with the LBA where the first difference, if any, was found. Reissue repairvdiskcopy with that LBA to avoid reprocessing the initial sectors that compared identically. Continue to reissue repairvdiskcopy by using this parameter to list all the differences.
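The parameter rules above (exactly one of the three modes, an optional starting LBA, and the volume as the last entry) can be sketched as a small helper that assembles the command line. This is only an illustration: the function name build_repair_command and its arguments are hypothetical and are not part of the SAN Volume Controller CLI. The helper builds a string; it does not run anything on a system.

```python
from typing import Optional

# The three mutually exclusive modes named in the text.
MODES = ("validate", "medium", "resync")


def build_repair_command(mode: str, volume: str, startlba: Optional[int] = None) -> str:
    """Assemble a repairvdiskcopy command line (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"mode must be exactly one of {MODES}, got {mode!r}")
    if not volume:
        raise ValueError("a volume name or ID is required")
    parts = ["repairvdiskcopy", f"-{mode}"]
    if startlba is not None:
        if startlba < 0:
            raise ValueError("startlba must be non-negative")
        parts += ["-startlba", str(startlba)]
    parts.append(str(volume))  # the volume is always the last entry
    return " ".join(parts)


print(build_repair_command("resync", "vdisk8", startlba=20))
```

Taking the mode as a single argument, rather than three separate flags, makes it impossible to request two modes at once, which mirrors the rule in the text.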

Issue the following command to validate and, if necessary, automatically repair mirrored copies of the specified volume:

repairvdiskcopy -resync -startlba 20 vdisk8

Notes:
1. Only one repairvdiskcopy command can run on a volume at a time.
2. After you start the repairvdiskcopy command, you cannot use the command to stop processing.
3. The primary copy of a mirrored volume cannot be changed while the repairvdiskcopy -resync command is running.
4. If there is only one mirrored copy, the command returns immediately with an error.
5. If a copy that is being compared goes offline, the command is halted with an error. The command is not automatically resumed when the copy is brought back online.
6. In the case where one copy is readable but the other copy has a medium error, the command process automatically attempts to fix the medium error by writing the read data from the other copy.
7. If no differing sectors are found during repairvdiskcopy processing, an informational error is logged at the end of the process.

Checking the progress of validation and repair of volume copies by using the CLI

Use the lsrepairvdiskcopyprogress command to display the progress of mirrored volume validation and repairs.

You can specify a volume copy by using the -copy id parameter. To display the volumes that have two or more copies with an active task, specify the command with no parameters; it is not possible to have only one volume copy with an active task.

To check the progress of validation and repair of mirrored volumes, issue the following command:

lsrepairvdiskcopyprogress -delim :

The following example shows how the command output is displayed:

vdisk_id:vdisk_name:copy id:task:progress:estimated_completion_time
0:vdisk0:0:medium:50:
0:vdisk0:1:medium:50:

Repairing a thin-provisioned volume using the CLI

You can use the repairsevdiskcopy command from the command-line interface to repair the metadata on a thin-provisioned volume.

The repairsevdiskcopy command automatically detects and repairs corrupted metadata. The command holds the volume offline during the repair, but does not prevent the disk from being moved between I/O groups.

If a repair operation completes successfully and the volume was previously offline because of corrupted metadata, the command brings the volume back online. The only limit on the number of concurrent repair operations is the number of volume copies in the configuration.
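Progress output that is requested with -delim :, such as the lsrepairvdiskcopyprogress example earlier in this chapter, is a header row followed by one row per volume copy. A minimal sketch of turning that text into records follows. The function name parse_delimited is hypothetical, and the sample leaves the estimated completion time empty because no value is reproduced here.

```python
from typing import Dict, List


def parse_delimited(output: str, delim: str = ":") -> List[Dict[str, str]]:
    """Parse header-plus-rows CLI output into a list of dictionaries."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split(delim)
    rows = []
    for line in lines[1:]:
        # Limit the split so extra delimiters stay inside the last field.
        values = line.split(delim, len(header) - 1)
        values += [""] * (len(header) - len(values))  # pad short rows
        rows.append(dict(zip(header, values)))
    return rows


SAMPLE = """vdisk_id:vdisk_name:copy id:task:progress:estimated_completion_time
0:vdisk0:0:medium:50:
0:vdisk0:1:medium:50:
"""

rows = parse_delimited(SAMPLE)
for row in rows:
    print(row["vdisk_name"], row["copy id"], row["task"], row["progress"])
```

Keying each row by the header names avoids depending on column positions, which is convenient if a different command with other columns is parsed the same way.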

When you issue the repairsevdiskcopy command, you must specify the name or ID of the volume to be repaired as the last entry on the command line. Once started, a repair operation cannot be paused or canceled; the repair can be terminated only by deleting the copy.

Attention: Use this command only to repair a thin-provisioned volume that has reported corrupt metadata.

Issue the following command to repair the metadata on a thin-provisioned volume:

repairsevdiskcopy vdisk8

After you issue the command, no output is displayed.

Notes:
1. Because the volume is offline to the host, any I/O that is submitted to the volume while it is being repaired fails.
2. When the repair operation completes successfully, the corrupted metadata error is marked as fixed.
3. If the repair operation fails, the volume is held offline and an error is logged.

Checking the progress of the repair of a thin-provisioned volume by using the CLI

Issue the lsrepairsevdiskcopyprogress command to list the repair progress for thin-provisioned volume copies of the specified volume. If you do not specify a volume, the command lists the repair progress for all thin-provisioned copies in the system.

Note: Run this command only after you run the repairsevdiskcopy command, which you must run only as required by the fix procedures that are recommended by your support team.

Recovering offline volumes using the CLI

If a node or an I/O group fails, you can use the command-line interface (CLI) to recover offline volumes.

About this task

If you lose both nodes in an I/O group, you lose access to all volumes that are associated with the I/O group. To regain access to the volumes, you must perform one of the following procedures. Depending on the failure type, you might have lost data that was cached for these volumes, and the volumes are now offline.

Data loss scenario 1

One node in an I/O group failed and failover started on the second node. During the failover process, the second node in the I/O group fails before the data in the write cache is flushed to the backend. The first node is successfully repaired, but its hardened data is not the most recent version that is committed to the data store; therefore, it cannot be used. The second node is repaired or replaced and lost its hardened data, and the node has no way of recognizing that it is part of the system.

Complete the following steps to recover an offline volume when one node has down-level hardened data and the other node loses hardened data.

Procedure
1. Recover the node and add it back into the system.
2. Delete all IBM FlashCopy mappings and Metro Mirror or Global Mirror relationships that use the offline volumes.
3. Run the recovervdisk, recovervdiskbyiogrp, or recovervdiskbysystem command.
4. Re-create all FlashCopy mappings and Metro Mirror or Global Mirror relationships that use the volumes.

Example

Data loss scenario 2

Both nodes in the I/O group failed and have been repaired. Therefore, the nodes lost their hardened data and have no way of recognizing that they are part of the system.

Complete the following steps to recover an offline volume when both nodes have lost their hardened data and cannot be recognized by the system.
1. Delete all FlashCopy mappings and Metro Mirror or Global Mirror relationships that use the offline volumes.
2. Run the recovervdisk, recovervdiskbyiogrp, or recovervdiskbysystem command.
3. Re-create all FlashCopy mappings and Metro Mirror or Global Mirror relationships that use the volumes.

Chapter 5. Viewing the vital product data

Vital product data (VPD) is information that uniquely records each element in the SAN Volume Controller. The data is updated automatically by the system when the configuration is changed.

The VPD lists the following types of information:
- System-related values such as the software version, space in storage pools, and space allocated to volumes.
- Node-related values that include the specific hardware that is installed in each node. Examples include the FRU part number for the system board and the level of BIOS firmware that is installed. The node VPD is held by the system, which makes it possible to get most of the VPD for the nodes that are powered off.

Using different sets of commands, you can view the system VPD and the node VPD. You can also view the VPD through the management GUI.

Downloading the vital product data using the management GUI

You can download the vital product data for a node from the management GUI.

Procedure
1. In the management GUI, select Monitoring > System.
2. From the dynamic graphic of the system, select the node and click the icon to the right of the Actions menu to download VPD information.

Displaying the vital product data using the CLI

You can use the command-line interface (CLI) to display the system or node vital product data (VPD).

Issue the following CLI commands to display the VPD:

sainfo lsservicestatus
lsnodehw
lsnodevpd nodename
lssystem system_name
lssystemip
lsdrive

Displaying node properties by using the CLI

You can use the command-line interface (CLI) to display node properties.

About this task

To display the node properties:

Procedure
1. Use the lsnode CLI command to display a concise list of nodes in the clustered system. Issue this CLI command to list the system nodes:

lsnode -delim :

2. Issue the lsnode CLI command and specify the node ID or name of the node for which you want to receive detailed output. The following example is a CLI command that you can use to list detailed output for a node in the system:

lsnode -delim : group1node1

where group1node1 is the name of the node for which you want to view detailed output.

Displaying clustered system properties by using the CLI

You can use the command-line interface (CLI) to display the properties for a clustered system (system).

About this task

These actions help you display your system property information.

Procedure

Issue the lssystem command to display the properties for a system. The following command is an example of the lssystem command you can issue:

lssystem -delim : build1

where build1 is the name of the system.
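With -delim :, the detailed lssystem output is one property per line in the form name:value. The following sketch, assuming that shape, reads such text into a dictionary. The function name parse_properties is hypothetical. The sample uses a few property names that appear in the results for build1; the console_ip value is an invented placeholder, included only to show a value that itself contains the delimiter.

```python
from typing import Dict


def parse_properties(output: str, delim: str = ":") -> Dict[str, str]:
    """Parse name:value lines into a dictionary (hypothetical helper)."""
    props: Dict[str, str] = {}
    for line in output.splitlines():
        if delim not in line:
            continue  # skip blank or malformed lines
        # Split only once: a value such as an address with a port
        # can contain the delimiter itself.
        key, value = line.split(delim, 1)
        props[key.strip()] = value.strip()
    return props


SAMPLE = """name:build1
location:local
statistics_status:on
statistics_frequency:15
console_ip:192.0.2.10:443
"""  # console_ip is a placeholder value, not real output

props = parse_properties(SAMPLE)
print(props["name"], props["statistics_frequency"], props["console_ip"])
```

A plain dictionary keeps only the last value for a repeated name. Properties that repeat in the output, such as the per-tier capacity lines, would need a list of values per name instead.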

Results

id: a00a0fe
name:build1
location:local
partnership:
bandwidth:
total_mdisk_capacity:90.7gb
space_in_mdisk_grps:90.7gb
space_allocated_to_vdisks:14.99gb
total_free_space:75.7gb
statistics_status:on
statistics_frequency:15
required_memory:0
cluster_locale:en_us
time_zone:522 UTC
code_level: (build )
FC_port_speed:2Gb
console_ip: :443
id_alias: a00a0fe
gm_link_tolerance:300
gm_inter_cluster_delay_simulation:0
gm_intra_cluster_delay_simulation:0
email_reply:
email_contact:
email_contact_primary:
email_contact_alternate:
email_contact_location:
email_state:stopped
inventory_mail_interval:0
total_vdiskcopy_capacity:15.71gb
total_used_capacity:13.78gb
total_overallocation:17
total_vdisk_capacity:11.72gb
cluster_ntp_ip_address:
cluster_isns_ip_address:
iscsi_auth_method:none
iscsi_chap_secret:
auth_service_configured:no
auth_service_enabled:no
auth_service_url:
auth_service_user_name:
auth_service_pwd_set:no
auth_service_cert_set:no
relationship_bandwidth_limit:25
gm_max_host_delay:5
tier:generic_ssd
tier_capacity:0.00mb
tier_free_capacity:0.00mb
tier:generic_hdd
tier_capacity:90.67gb
tier_free_capacity:75.34gb
email_contact2:
email_contact2_primary:
email_contact2_alternate:
total_allocated_extent_capacity:16.12gb

Fields for the node VPD

The node vital product data (VPD) provides information for items such as the system board, batteries, processor, fans, memory module, adapter, devices, software, front panel assembly, serial-attached SCSI (SAS) flash drive, and SAS host bus adapter (HBA).

Table 37 shows the fields that you see for the system board.

Table 37. Fields for the system board

Item: System board
Field names:
  Part number
  System serial number
  Number of processors
  Number of memory slots
  Number of fans
  Number of Fibre Channel adapters
  Number of SCSI, IDE, SATA, or SAS devices
  Number of compression accelerator adapters
  Number of power supplies
  Number of high-speed SAS adapters
  BIOS manufacturer
  BIOS version
  BIOS release date
  System manufacturer
  System product
  Planar manufacturer
  Power supply part number
  CMOS battery part number
  Power cable assembly part number
  Service processor firmware
  SAS controller part number

Table 38 shows the fields that you see for the batteries.

Table 38. Fields for the batteries

Item: Batteries
Field names:
  Battery_FRU_part
  Battery_part_identity
  Battery_fault_led
  Battery_charging_status
  Battery_cycle_count
  Battery_power_on_hours
  Battery_last_recondition
  Battery_midplane_FRU_part
  Battery_midplane_part_identity
  Battery_midplane_FW_version
  Battery_power_cable_FRU_part
  Battery_power_sense_cable_FRU_part
  Battery_comms_cable_FRU_part
  Battery_EPOW_cable_FRU_part

Table 39 shows the fields that you see for each processor that is installed.

Table 39. Fields for the processors

Item: Processor
Field names:
  Part number
  Processor location
  Manufacturer
  Version
  Speed
  Status
  Processor serial number

Table 40 shows the fields that you see for each fan that is installed.

Table 40. Fields for the fans

Item: Fan
Field names:
  Part number
  Location

Table 41 shows the fields that are repeated for each installed memory module.

Table 41. Fields that are repeated for each installed memory module

Item: Memory module
Field names:
  Part number
  Device location
  Bank location
  Size (MB)
  Manufacturer (if available)
  Serial number (if available)

Table 42 shows the fields that are repeated for each installed adapter.

Table 42. Fields that are repeated for each adapter that is installed

Item: Adapter
Field names:
  Adapter type
  Part number
  Port numbers
  Location
  Device serial number
  Manufacturer
  Device
  Adapter revision
  Chip revision

Table 43 shows the fields that are repeated for each device that is installed.

Table 43. Fields that are repeated for each SCSI, IDE, SATA, and SAS device that is installed

Item: Device
Field names:
  Part number
  Bus
  Device
  Model
  Revision
  Serial number
  Approximate capacity
  Hardware revision
  Manufacturer

Table 44 shows the fields that are specific to the node software.

Table 44. Fields that are specific to the node software

Item: Software
Field names:
  Code level
  Node name
  Worldwide node name
  ID
  Unique string that is used in dump file names for this node

Table 45 shows the fields that are provided for the front panel assembly.

Table 45. Fields that are provided for the front panel assembly

Item: Front panel
Field names:
  Part number
  Front panel ID
  Front panel locale

Table 46 shows the fields that are provided for the Ethernet port.

Table 46. Fields that are provided for the Ethernet port

Item: Ethernet port
Field names:
  Port number
  Ethernet port status
  MAC address
  Supported speeds

Table 47 shows the fields that are provided for the power supplies in the node.

Table 47. Fields that are provided for the power supplies in the node

Item: Power supplies
Field names:
  Part number
  Location

Table 48 shows the fields that are provided for the SAS host bus adapter (HBA).

Table 48. Fields that are provided for the SAS host bus adapter (HBA)

Item: SAS HBA
Field names:
  Part number
  Port numbers
  Device serial number
  Manufacturer
  Device
  Adapter revision
  Chip revision

Table 49 shows the fields that are provided for the SAS flash drive.

Table 49. Fields that are provided for the SAS flash drive

Item: SAS SSD
Field names:
  Part number
  Manufacturer
  Device serial number
  Model
  Type
  UID
  Firmware
  Slot
  FPGA firmware
  Speed
  Capacity
  Expansion tray
  Connection type

Table 50 shows the fields that are provided for the small form factor pluggable (SFP) transceiver.

Table 50. Fields that are provided for the small form factor pluggable (SFP) transceiver

Item: Small form factor pluggable (SFP) transceiver
Field names:
  Part number
  Manufacturer
  Device
  Serial number
  Supported speeds
  Connector type
  Transmitter type
  Wavelength
  Maximum distance by cable type
  Hardware revision
  Port number
  Worldwide port name

Fields for the system VPD

The system vital product data (VPD) provides various information about the system, including its ID, name, location, IP address, email contact, code level, and total free space.

Table 51 shows the fields that are provided for the system properties as shown by the management GUI.

Table 51. Fields that are provided for the system properties

Item: General
Field names:
  ID (Note: This value is the unique identifier for the system.)
  Name
  Location
  Time Zone
  Required Memory
  Licensed Code Version
  Channel Port Speed

Item: IP Addresses (see note 1)
Field names:
  Ethernet Port 1 (attributes for both IPv4 and IPv6)
  - IP Address
  - Service IP Address
  - Subnet Mask
  - Prefix
  - Default Gateway
  Ethernet Port 2 (attributes for both IPv4 and IPv6)
  - IP Address
  - Service IP Address
  - Subnet Mask
  - Prefix
  - Default Gateway

Table 51. Fields that are provided for the system properties (continued)

Item: Remote Authentication
Field names:
  Remote Authentication
  Web Address
  User Name
  Password
  SSL Certificate

Item: Space
Field names:
  Total MDisk Capacity
  Space in Storage Pools
  Space Allocated to Volumes
  Total Free Space
  Total Used Capacity
  Total Allocation
  Total Volume Copy Capacity
  Total Volume Capacity

Item: Statistics
Field names:
  Statistics Status
  Statistics Frequency

Item: Metro and Global Mirror
Field names:
  Link Tolerance
  Intersystem Delay Simulation
  Intrasystem Delay Simulation
  Partnership
  Bandwidth

Item: Email
Field names:
  SMTP Email Server
  Email Server Port
  Reply Email Address
  Contact Person Name
  Primary Contact Phone Number
  Alternate Contact Phone Number
  Physical Location of the System Reporting Error
  Email Status
  Inventory Email Interval

Item: iSCSI
Field names:
  iSNS Server Address
  Supported Authentication Methods
  CHAP Secret

Note 1: You can also use the lssystemip CLI command to view this data.


Chapter 6. Diagnosing problems

You can diagnose problems with the controls and indicators, the command-line interface (CLI), the management GUI, or the Service Assistant GUI. The diagnostic LEDs on the SAN Volume Controller nodes and uninterruptible power supply units also help you diagnose hardware problems.

Event logs

By understanding the event log, you can do the following tasks:
- Manage the event log
- View the event log
- Describe the fields in the event log

Error codes

The following topics provide information to help you understand and process the error codes:
- Event reporting
- Understanding the events
- Understanding the error codes
- Determining a hardware boot failure

If the node is showing a boot message, failure message, or node error message, and you determined that the problem was caused by a software or firmware failure, you can restart the node to see whether that might resolve the problem. Perform the following steps to properly shut down and restart the node:
1. Follow the instructions in MAP 5350: Powering off a node.
2. Restart only one node at a time.
3. Do not shut down the second node in an I/O group for at least 30 minutes after you shut down and restart the first node.

Starting statistics collection

The system collects statistics over an interval and creates files that can be viewed.

Introduction

For each collection interval, the management GUI creates four statistics files: one for managed disks (MDisks), named Nm_stat; one for volumes and volume copies, named Nv_stat; one for nodes, named Nn_stat; and one for SAS drives, named Nd_stat. The files are written to the /dumps/iostats directory on the node. To retrieve the statistics files from the non-configuration nodes onto the configuration node, the svctask cpdumps command must be used.

A maximum of 16 files of each type can be created for the node. When the 17th file is created, the oldest file for the node is overwritten.
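Because only 16 files of each type are kept, the collection interval determines how far back the retained statistics reach. The sketch below works through that arithmetic and models the overwrite of the oldest file with a bounded queue. The names retention_minutes and sample_N are hypothetical, and the 1-60 minute range is the interval setting that is described under Fields.

```python
from collections import deque

MAX_FILES_PER_TYPE = 16  # the 17th file overwrites the oldest one


def retention_minutes(interval_minutes: int) -> int:
    """Minutes of history covered by the retained files of one type."""
    if not 1 <= interval_minutes <= 60:
        raise ValueError("interval must be 1-60 minutes")
    return MAX_FILES_PER_TYPE * interval_minutes


# Model the per-node file set: appending a 17th entry drops the oldest.
files = deque(maxlen=MAX_FILES_PER_TYPE)
for n in range(1, 18):
    files.append(f"sample_{n}")  # hypothetical labels, not real file names

print(files[0], len(files))   # sample_2 16
print(retention_minutes(15))  # 240
```

With a 15-minute interval, for example, the node holds about four hours of samples, so files must be copied off more often than that to keep a continuous history.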

Fields

The following fields are available for user definition:

Interval
Specify the interval in minutes between the collection of statistics. You can specify 1-60 minutes in increments of 1 minute.

Tables

The following tables describe the information that is reported for individual nodes and volumes.

Table 52 describes the statistics collection for MDisks, for individual nodes.

Table 52. Statistics collection for individual nodes

Statistic name    Description
id    Indicates the name of the MDisk for which the statistics apply.
idx    Indicates the identifier of the MDisk for which the statistics apply.
rb    Indicates the cumulative number of blocks of data that is read (since the node started running).
re    Indicates the cumulative read external response time in milliseconds for each MDisk. The cumulative response time for disk reads is calculated by starting a timer when a SCSI read command is issued and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
ro    Indicates the cumulative number of MDisk read operations that are processed (since the node started running).
rq    Indicates the cumulative read queued response time in milliseconds for each MDisk. This response is measured from above the queue of commands to be sent to an MDisk because the queue depth is already full. This calculation includes the elapsed time that is taken for read commands to complete from the time they join the queue.
wb    Indicates the cumulative number of blocks of data written (since the node started running).
we    Indicates the cumulative write external response time in milliseconds for each MDisk. The cumulative response time for disk writes is calculated by starting a timer when a SCSI write command is issued and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
wo    Indicates the cumulative number of MDisk write operations that are processed (since the node started running).
wq    Indicates the cumulative write queued response time in milliseconds for each MDisk. This time is measured from above the queue of commands to be sent to an MDisk because the queue depth is already full. This calculation includes the elapsed time that is taken for write commands to complete from the time they join the queue.

Table 53 describes the VDisk (volume) information that is reported for individual nodes.

Note: MDisk statistics files for nodes are written to the /dumps/iostats directory on the individual node.
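The counters in Table 52 are cumulative, so a single sample says little on its own. One common way to read them, sketched here as an assumption rather than a documented procedure, is to take two successive samples and divide the change in response time by the change in operation count, which gives an average time per operation for that interval. The function name average_latency_ms and the sample numbers are hypothetical.

```python
from typing import Dict, Optional


def average_latency_ms(prev: Dict[str, int], curr: Dict[str, int],
                       time_key: str, ops_key: str) -> Optional[float]:
    """Average response time per operation between two samples."""
    d_ops = curr[ops_key] - prev[ops_key]
    d_time = curr[time_key] - prev[time_key]
    if d_ops <= 0 or d_time < 0:
        # No operations in the interval, or the counters restarted
        # (for example, after the node was reset).
        return None
    return d_time / d_ops


# Hypothetical counter values from two consecutive samples of one MDisk.
prev = {"ro": 1000, "re": 5000, "wo": 400, "we": 3600}
curr = {"ro": 1500, "re": 8000, "wo": 400, "we": 3600}

print(average_latency_ms(prev, curr, "re", "ro"))  # 6.0
print(average_latency_ms(prev, curr, "we", "wo"))  # None
```

Returning None instead of zero when no operations occurred keeps an idle interval from being mistaken for a very fast one.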

Table 53. Statistic collection for volumes for individual nodes

Statistic name    Description
id    Indicates the volume name for which the statistics apply.
idx    Indicates the volume for which the statistics apply.
rb    Indicates the cumulative number of blocks of data read (since the node started running).
rl    Indicates the cumulative read response time in milliseconds for each volume. The cumulative response time for volume reads is calculated by starting a timer when a SCSI read command is received and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
rlw    Indicates the worst read response time in microseconds for each volume since the last time statistics were collected. This value is reset to zero after each statistics collection sample.
ro    Indicates the cumulative number of volume read operations that are processed (since the node started running).
ub    Indicates the cumulative number of blocks of data unmapped (since the node started running).
ul    Indicates the cumulative unmap response time in milliseconds for each volume. The cumulative response time for volume unmaps is calculated by starting a timer when a SCSI unmap command is received and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
ulw    Indicates the worst unmap response time in milliseconds for each volume. The worst response time for volume unmaps is calculated by starting a timer when a SCSI unmap command is received and stopping it when the command completes successfully.
uo    Indicates the cumulative number of volume unmap operations processed (since the node started running).
uou    Indicates the cumulative number of volume unmap operations that are not aligned on an 8K boundary (as per the alignment/granularity setting in the Block Limits VPD page (0xb0)).
wb    Indicates the cumulative number of blocks of data written (since the node started running).
wl    Indicates the cumulative write response time in milliseconds for each volume. The cumulative response time for volume writes is calculated by starting a timer when a SCSI write command is received and stopping it when the command completes successfully. The elapsed time is added to the cumulative counter.
wlw    Indicates the worst write response time in microseconds for each volume since the last time statistics were collected. This value is reset to zero after each statistics collection sample.
wo    Indicates the cumulative number of volume write operations that are processed (since the node started running).
wou    Indicates the cumulative number of volume write operations that are not aligned on a 4K boundary.
xl    Indicates the cumulative read and write data transfer response time in milliseconds for each volume since the last time the node was reset. When this statistic is viewed for multiple volumes and with other statistics, it can indicate whether the latency is caused by the host, fabric, or the SAN Volume Controller.

Note: For unmap statistics, an unmap operation is a SCSI unmap or a Write Same with unmap command.

Table 54 describes the VDisk information that is related to Metro Mirror or Global Mirror relationships and that is reported for individual nodes.

Table 54. Statistic collection for volumes that are used in Metro Mirror and Global Mirror relationships for individual nodes

Statistic name    Description
gwl    Indicates cumulative secondary write latency in milliseconds. This statistic accumulates the cumulative secondary write latency for each volume. You can calculate the amount of time to recover from a failure based on this statistic and the gws statistic.
gwo    Indicates the total number of overlapping volume writes. An overlapping write is when the logical block address (LBA) range of a write request collides with another outstanding request to the same LBA range and the write request is still outstanding to the secondary site.
gwot    Indicates the total number of fixed or unfixed overlapping writes. When all nodes in all clusters are at system version 4.3.1, this records the total number of write I/O requests received by the Global Mirror feature on the primary that overlapped. When any nodes in either cluster are running a system version earlier than 4.3.1, this value does not increment.
gws    Indicates the total number of write requests that are issued to the secondary site.

Table 55 describes the port information that is reported for individual nodes.

Table 55. Statistic collection for node ports

Statistic name    Description
bbcz    Indicates the total time in microseconds for which the buffer credit counter was at zero. Note that this statistic is only reported by 8 Gbps Fibre Channel ports. For other port types, this is 0.
cbr    Indicates the bytes received from controllers.
cbt    Indicates the bytes transmitted to disk controllers.
cer    Indicates the commands that are received from disk controllers.
cet    Indicates the commands that are initiated to disk controllers.
dtdc    Indicates the number of transfers that experienced excessive data transmission delay.
dtdm    Indicates the number of transfers that had their data transmission delay measured.
dtdt    Indicates the total time in microseconds for which data transmission was excessively delayed.
hbr    Indicates the bytes received from hosts.
hbt    Indicates the bytes transmitted to hosts.
her    Indicates the commands that are received from hosts.
het    Indicates the commands that are initiated to hosts.
icrc    Indicates the number of CRCs that are not valid.
id    Indicates the port identifier for the node.

Table 55. Statistic collection for node ports (continued)

Statistic name    Description
itw    Indicates the number of transmission word counts that are not valid.
lf    Indicates the link failure count.
lnbr    Indicates the bytes received from other nodes in the same cluster.
lnbt    Indicates the bytes transmitted to other nodes in the same cluster.
lner    Indicates the commands that are received from other nodes in the same cluster.
lnet    Indicates the commands that are initiated to other nodes in the same cluster.
lsi    Indicates the loss-of-signal count.
lsy    Indicates the loss-of-synchronization count.
pspe    Indicates the primitive sequence-protocol error count.
rmbr    Indicates the bytes received from other nodes in the other clusters.
rmbt    Indicates the bytes transmitted to other nodes in the other clusters.
rmer    Indicates the commands that are received from other nodes in the other clusters.
rmet    Indicates the commands that are initiated to other nodes in the other clusters.
wwpn    Indicates the worldwide port name for the node.

Table 56 describes the node information that is reported for each node.

Table 56. Statistic collection for nodes

Statistic name    Description
cluster_id    Indicates the name of the cluster.
cluster    Indicates the name of the cluster.
cpu    busy - Indicates the total CPU average core busy milliseconds since the node was reset. This statistic reports the amount of the time the processor spends polling, waiting for work versus actually doing work. This statistic accumulates from zero.
    comp - Indicates the total CPU average core busy milliseconds for compression process cores since the node was reset.
    system - Indicates the total CPU average core busy milliseconds since the node was reset. This statistic reports the amount of the time the processor spends polling, waiting for work versus actually doing work. This statistic accumulates from zero. This is the same information as the information provided with the cpu busy statistic and eventually replaces the cpu busy statistic.
cpu_core    id - Indicates the CPU core ID.
    comp - Indicates the per-core CPU average core busy milliseconds for compression process cores since the node was reset.
    system - Indicates the per-core CPU average core busy milliseconds for system process cores since the node was reset.
id    Indicates the name of the node.
node_id    Indicates the unique identifier for the node.
rb    Indicates the number of bytes received.

Table 56. Statistic collection for nodes (continued)

Statistic name    Description
re    Indicates the accumulated receive latency, excluding inbound queue time. This statistic is the latency that is experienced by the node communication layer from the time that an I/O is queued to cache until the time that the cache gives completion for it.
ro    Indicates the number of messages or bulk data received.
rq    Indicates the accumulated receive latency, including inbound queue time. This statistic is the latency from the time that a command arrives at the node communication layer to the time that the cache completes the command.
wb    Indicates the bytes sent.
we    Indicates the accumulated send latency, excluding outbound queue time. This statistic is the time from when the node communication layer issues a message out onto the Fibre Channel until the node communication layer receives notification that the message arrived.
wo    Indicates the number of messages or bulk data sent.
wq    Indicates the accumulated send latency, including outbound queue time. This statistic includes the entire time that data is sent. This time includes the time from when the node communication layer receives a message and waits for resources, the time to send the message to the remote node, and the time that is taken for the remote node to respond.

Table 57 describes the statistics collection for volumes.

Table 57. Cache statistics collection for volumes and volume copies

Columns: Statistic | Acronym | Cache statistics for volumes | Cache statistics for volume copies | Cache partition statistics for volumes | Cache partition statistics for volume copies | Overall node cache statistics | Cache statistics for mdisks | Units and state | Cache statistics for data reduction pools

read ios | ri | Yes Yes | ios, cumulative
write ios | wi | Yes Yes | ios, cumulative
read misses | r | Yes Yes | sectors, cumulative
read hits | rh | Yes Yes | sectors, cumulative
flush_through writes | ft | Yes Yes | sectors, cumulative
fast_write writes | fw | Yes Yes | sectors, cumulative
write_through writes | wt | Yes Yes | sectors, cumulative
write hits | wh | Yes Yes | sectors, cumulative
prefetches | p | Yes | sectors, cumulative
prefetch hits (prefetch data that is read) | ph | Yes | sectors, cumulative

Table 57. Cache statistics collection for volumes and volume copies (continued)

Statistic | Acronym | Columns marked Yes | Units and state
prefetch misses (prefetch pages that are discarded without any sectors read) | pm | Yes | pages, cumulative
modified data | m | Yes Yes | sectors, snapshot, noncumulative
read and write cache data | v | Yes Yes | sectors, snapshot, noncumulative
destages | d | Yes Yes | sectors, cumulative
fullness Average | fav | Yes Yes | %, noncumulative
fullness Max | fmx | Yes Yes | %, noncumulative
fullness Min | fmn | Yes Yes | %, noncumulative
Destage Target Average | dtav | Yes Yes | IOs capped 9999, noncumulative
Destage Target Max | dtmx | Yes | IOs, noncumulative
Destage Target Min | dtmn | Yes | IOs, noncumulative
Destage In Flight Average | dfav | Yes Yes | IOs capped 9999, noncumulative
Destage In Flight Max | dfmx | Yes | IOs, noncumulative
Destage In Flight Min | dfmn | Yes | IOs, noncumulative
destage latency average | dav | Yes Yes Yes Yes Yes Yes | µs capped , noncumulative
destage latency max | dmx | Yes Yes Yes | µs capped , noncumulative

Cache statistics for data reduction pools: Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Table 57. Cache statistics collection for volumes and volume copies (continued)

Statistic | Acronym | Columns marked Yes | Units and state
destage latency min | dmn | Yes Yes Yes | µs capped , noncumulative
destage count | dcn | Yes Yes Yes Yes Yes | ios, noncumulative
stage latency average | sav | Yes Yes Yes | µs capped , noncumulative
stage latency max | smx | Yes | µs capped , noncumulative
stage latency min | smn | Yes | µs capped , noncumulative
stage count | scn | Yes Yes Yes | ios, noncumulative
prestage latency average | pav | Yes Yes | µs capped , noncumulative
prestage latency max | pmx | Yes | µs capped , noncumulative
prestage latency min | pmn | Yes | µs capped , noncumulative
prestage count | pcn | Yes Yes | ios, noncumulative
Write Cache Fullness Average | wfav | Yes | %, noncumulative
Write Cache Fullness Max | wfmx | Yes | %, noncumulative
Write Cache Fullness Min | wfmn | Yes | %, noncumulative
Read Cache Fullness Average | rfav | Yes | %, noncumulative
Read Cache Fullness Max | rfmx | Yes | %, noncumulative

Cache statistics for data reduction pools: Yes Yes

Table 57. Cache statistics collection for volumes and volume copies (continued)

Statistic | Acronym | Columns marked Yes | Units and state
Read Cache Fullness Min | rfmn | Yes | %, noncumulative
Pinned Percent | pp | Yes Yes Yes Yes Yes | % of total cache snapshot, noncumulative
data transfer latency average | tav | Yes Yes | µs capped , noncumulative
Track Lock Latency (Exclusive) Average | teav | Yes Yes | µs capped , noncumulative
Track Lock Latency (Shared) Average | tsav | Yes Yes | µs capped , noncumulative
Cache I/O Control Block Queue Time | hpt | Yes | Average µs, noncumulative
Cache Track Control Block Queue Time | ppt | Yes | Average µs, noncumulative
Owner Remote Credit Queue Time | opt | Yes | Average µs, noncumulative
Non-Owner Remote Credit Queue Time | npt | Yes | Average µs, noncumulative
Admin Remote Credit Queue Time | apt | Yes | Average µs, noncumulative
Cdcb Queue Time | cpt | Yes | Average µs, noncumulative
Buffer Queue Time | bpt | Yes | Average µs, noncumulative
Hardening Rights Queue Time | hrpt | Yes | Average µs, noncumulative

Cache statistics for data reduction pools: Yes

Note: Any statistic with av, mx, mn, or cn in its name is not cumulative. These statistics reset every statistics interval. A statistic that does not have av, mx, mn, or cn in its name and that is an IOs value or a count is a field that contains a running total.
- The term pages means units of 4096 bytes per page.
- The term sectors means units of 512 bytes per sector.

- The term µs means microseconds.
- Non-cumulative means totals since the previous statistics collection interval.
- Snapshot means the value at the end of the statistics interval (rather than an average across the interval or a peak within the interval).

There are three types of data reduction properties per data reduction pool.
- dca: these statistics are related to the data stored within the data reduction pool.
- rca: these statistics are related to I/O to manage the background garbage collection processes of the data reduction pool.
- jca: these statistics are related to journaling operations for the metadata that manages the data reduction pool.

Table 58 describes the statistic collection for volume cache per individual nodes.

Table 58. Statistic collection for volume cache per individual nodes. This table describes the volume cache information that is reported for individual nodes.

Statistic name | Description
cm | Indicates the number of sectors of modified or dirty data that are held in the cache.
ctd | Indicates the total number of cache destages that were initiated writes, submitted to other components as a result of a volume cache flush or destage operation.
ctds | Indicates the total number of sectors that are written for cache-initiated track writes.
ctp | Indicates the number of track stages that are initiated by the cache that are prestage reads.
ctps | Indicates the total number of staged sectors that are initiated by the cache.
ctrh | Indicates the number of total track read-cache hits on prestage or non-prestage data. For example, a single read that spans two tracks where only one of the tracks obtained a total cache hit, is counted as one track read-cache hit.
ctrhp | Indicates the number of track reads received from other components, which are treated as cache hits on any prestaged data. For example, if a single read spans two tracks where only one of the tracks obtained a total cache hit on prestaged data, it is counted as one track that is read for the prestaged data. A cache hit that obtains a partial hit on prestage and non-prestage data still contributes to this value.
ctrhps | Indicates the total number of sectors that are read for reads received from other components that obtained cache hits on any prestaged data.
ctrhs | Indicates the total number of sectors that are read for reads received from other components that obtained total cache hits on prestage or non-prestage data.
ctr | Indicates the total number of track reads received. For example, if a single read spans two tracks, it is counted as two total track reads.
ctrs | Indicates the total number of sectors that are read for reads received.
ctwft | Indicates the number of track writes received from other components and processed in flush through write mode.
ctwfts | Indicates the total number of sectors that are written for writes that are received from other components and processed in flush through write mode.

Table 58. Statistic collection for volume cache per individual nodes (continued). This table describes the volume cache information that is reported for individual nodes.

Statistic name | Description
ctwfw | Indicates the number of track writes received from other components and processed in fast-write mode.
ctwfwsh | Indicates the track writes in fast-write mode that were written in write-through mode because of the lack of memory.
ctwfwshs | Indicates the track writes in fast-write mode that were written in write through due to the lack of memory.
ctwfws | Indicates the total number of sectors that are written for writes that are received from other components and processed in fast-write mode.
ctwh | Indicates the number of track writes received from other components where every sector in the track obtained a write hit on already dirty data in the cache. For a write to count as a total cache hit, the entire track write data must already be marked in the write cache as dirty.
ctwhs | Indicates the total number of sectors that are received from other components where every sector in the track obtained a write hit on already dirty data in the cache.
ctw | Indicates the total number of track writes received. For example, if a single write spans two tracks, it is counted as two total track writes.
ctws | Indicates the total number of sectors that are written for writes that are received from components.
ctwwt | Indicates the number of track writes received from other components and processed in write through write mode.
ctwwts | Indicates the total number of sectors that are written for writes that are received from other components and processed in write through write mode.
cv | Indicates the number of sectors of read and write cache data that is held in the cache.

Table 59 describes the XML statistics specific to an IP Partnership port.

Table 59. XML statistics for an IP Partnership port

Statistic name | Description
ipbz | Indicates the average size (in bytes) of data that is being submitted to the IP partnership driver since the last statistics collection period.
iprc | Indicates the total bytes that are received before any decompression takes place.
ipre | Indicates the bytes retransmitted to other nodes in other clusters by the IP partnership driver.
iprt | Indicates the average round-trip time in microseconds for the IP partnership link since the last statistics collection period.
iprx | Indicates the bytes received from other nodes in other clusters by the IP partnership driver.
ipsz | Indicates the average size (in bytes) of data that is being transmitted by the IP partnership driver since the last statistics collection period.
iptc | Indicates the total bytes that are transmitted after any compression (if active) takes place.

Table 59. XML statistics for an IP Partnership port (continued)

Statistic name | Description
iptx | Indicates the bytes transmitted to other nodes in other clusters by the IP partnership driver.

Table 60 describes the offload data transfer (ODX) VDisk and node level I/O statistics.

Table 60. ODX VDisk and node level statistics

Statistic name | Acronym | Description
Read cumulative ODX I/O latency | orl | Cumulative total read latency of ODX I/O per VDisk. The unit type is microseconds (US).
Write cumulative ODX I/O latency | owl | Cumulative total write latency of ODX I/O per VDisk. The unit type is microseconds (US).
Total transferred ODX I/O read blocks | oro | Cumulative total number of blocks that are read and successfully reported to the host, by ODX WUT command per VDisk. It is represented in blocks unit type.
Total transferred ODX I/O write blocks | owo | Cumulative total number of blocks that are written and successfully reported to the host, by ODX WUT command per VDisk. It is represented in blocks unit type.
Wasted ODX I/Os | oiowp | Cumulative total number of wasted blocks that are written by ODX WUT command per node. It is represented in blocks unit type.
WUT failure count | otrec | Cumulative total number of failed ODX WUT commands per node. It includes WUT failures due to a token revocation and expiration.

Table 61 describes the statistics collection for cloud per cloud account id.

Table 61. Statistics collection for cloud per cloud account id

Statistic name | Acronym | Description
id | id | Cloud account id
Total Successful Puts | puts | Total number of successful PUT operations
Total Successful Gets | gets | Total number of successful GET operations
Bytes Up | bup | Total number of bytes successfully transferred to the cloud
Bytes Down | bdown | Total number of bytes successfully downloaded/read from the cloud
Up Latency | uplt | Total time that is taken to transfer the data to the cloud
Down Latency | dwlt | Total time that is taken to download the data from the cloud
Down Error Latency | dwerlt | Time that is taken for the GET errors
Part Error Latency | pterlt | Total time that is taken for part errors. In SAN Volume Controller it might always be zero because no MPU scenario gets triggered.
Persisted Bytes Down | prbdw | Total number of bytes successfully downloaded from the cloud and persisted on the local storage that were part of a successful GET operation
Persisted Bytes Up | prbup | Total number of bytes successfully transferred to the cloud and persisted on the cloud that were part of a successful PUT operation. The difference from Bytes Up is as follows: for a 100-byte file, 80 bytes might be sent to the cloud successfully through a PUT operation before the last data transfer cycle, carrying 20 bytes, errors out and the entire request fails. In that case, the statistics indicate BYTES_UP = 80 and PERSISTED_BYTES_UP = 0.
Persisted Down Latency | prdwlt | Total time that is taken to download the data from the cloud that were part of a successful GET operation
Persisted Up Latency | pruplt | Total time that is taken to transfer the data to the cloud that were part of a successful PUT operation
Failed Gets | flgt | Total number of failed GET operations
Failed Puts | flpt | Total number of failed PUT operations
Get Errors | gter | Total number of times a read from the cloud failed (including the last retry that failed the GET request)
Get Retries | gtrt | Total number of GET retries
Part Errors | pter | Total number of part errors. It is the count if a multi-part upload (MPU) occurs. In SAN Volume Controller, it always remains zero because the MPU size is 32 MiB and the SAN Volume Controller blob size ranges from a few KB to 1 MiB.
Parts Put | ptpt | Total number of parts that are successfully transferred to the cloud
Persisted parts | prpt | Total number of parts successfully persisted on the cloud that were part of a successful PUT operation
Put retries | ptrt | Total number of PUT retries
Throttle upload latency | tuplt | Average delay introduced due to setting an upload bandwidth limit
Throttle download latency | tdwlt | Average delay introduced due to setting a download bandwidth limit
Throttle upload bandwidth utilization percentage | tupbwpc | Bandwidth utilization as a percentage of the configured upload bandwidth limit
Throttle download bandwidth utilization percentage | tdwbwpc | Bandwidth utilization as a percentage of the configured download bandwidth limit

Table 62 describes the statistics collection for cloud per VDisk.

Table 62. Statistics collection for cloud per VDisk

SNo | Statistic name | Acronym | Description
1 | blocks up | bup | Number of blocks that are uploaded to the cloud
2 | blocks down | bdn | Number of blocks that are downloaded from the cloud

Note: A block is 512 bytes.

Actions

The following actions are available to the user:

OK
    Click this button to change statistic collection.
Cancel
    Click this button to exit the panel without changing statistic collection.

XML formatting information

The XML is more complicated now, as seen in this raw XML from the volume (Nv_statistics) statistics. Notice how the names are similar but, because they are in a different section of the XML, they refer to a different part of the VDisk.

<vdsk idx="0" ctrs=" " ctps="0" ctrhs=" " ctrhps="0" ctds=" " ctwfts="9635" ctwwts="0" ctwfws=" " ctwhs="9117" ctws=" " ctr=" " ctw=" " ctp="0" ctrh="123056" ctrhp="0" ctd=" " ctwft="200" ctwwt="0" ctwfw=" " ctwfwsh="0" ctwfwshs="0" ctwh="538" cm=" " cv=" " gwot="0" gwo="0" gws="0" gwl="0" id="master_iogrp0_1" ro="0" wo="0" rb="0" wb="0" rl="0" wl="0" rlw="0" wlw="0" xl="0">

Vdisk/Volume statistics

<ca r="0" rh="0" d="0" ft="0" wt="0" fw="0" wh="0" ri="0" wi="0" dav="0" dcn="0" pav="0" pcn="0" teav="0" tsav="0" tav="0" pp="0"/>
<cpy idx="0">

volume copy statistics

<ca r="0" p="0" rh="0" ph="0" d="0" ft="0" wt="0" fw="0" wh="0" pm="0" ri="0" wi="0" dav="0" dcn="0" sav="0" scn="0" pav="0" pcn="0" teav="0" tsav="0" tav="0" pp="0"/>
</cpy>
</vdsk>

The <cpy idx="0"> means that the statistics are in the volume copy section of the VDisk, whereas the statistics shown under Vdisk/Volume statistics are outside of the cpy idx section and therefore refer to a VDisk/volume.

Similarly for the volume cache statistics for node and partitions:

<uca><ca dav="18726" dcn=" " dmx="749846" dmn="89" sav="20868" scn=" " smx="980941" smn="3" pav="0" pcn="0" pmx="0" pmn="0" wfav="0" wfmx="2" wfmn="0" rfav="0" rfmx="1" rfmn="0" pp="0" hpt="0" ppt="0" opt="0" npt="0" apt="0" cpt="0" bpt="0" hrpt="0" /><partition id="0"><ca dav="18726" dcn=" " dmx="749846" dmn="89" fav="0" fmx="2" fmn="0" dfav="0" dfmx="0" dfmn="0" dtav="0" dtmx="0" dtmn="0" pp="0"/></partition>

This output describes the volume cache node statistics; under <partition id="0">, the statistics are described for partition 0.

The cache statistics for data reduction pools, and the volume copy cache statistics for nodes and partitions, are:

<lca><ca dav="18726" dcn=" " dmx="749846" dmn="89" sav="20868" scn=" " smx="980941" smn="3" pav="0" pcn="0" pmx="0" pmn="0" wfav="0" wfmx="2" wfmn="0" rfav="0" rfmx="1" rfmn="0" pp="0" hpt="0" ppt="0" opt="0" npt="0" apt="0" cpt="0" bpt="0" hrpt="0" />
<dca p=" " rh="305754" ph="178873" d="0" ft="0" wt="0" fw="0" wh="0" v=" " m=" " pm="1120" ri="10720" wi="0" r=" " dav="0" dcn="0" sav="59926" scn="6045" pav="48350" pcn="2723" teav="0" tsav="0" tav="0" pp="0"/>
<rca p=" " rh="305754" ph="178873" d="0" ft="0" wt="0" fw="0" wh="0" v=" " m=" " pm="1120" ri="10720" wi="0" r=" " dav="0" dcn="0" sav="59926" scn="6045" pav="48350" pcn="2723" teav="0" tsav="0" tav="0" pp="0"/>
<jca p=" " rh="305754" ph="178873" d="0" ft="0" wt="0" fw="0" wh="0" v=" " m=" " pm="1120" ri="10720" wi="0" r=" " dav="0" dcn="0" sav="59926" scn="6045" pav="48350" pcn="2723" teav="0" tsav="0" tav="0" pp="0"/>
</partition>

Event reporting

Events that are detected are saved in an event log. As soon as an entry is made in this event log, the condition is analyzed. If any service activity is required, a notification is sent, if you set up notifications.

Event reporting process

The following methods are used to identify a new event:
- If you enabled Simple Network Management Protocol (SNMP), an SNMP trap is sent to an SNMP manager that is configured by the customer.
- If enabled, log messages can be forwarded on an IP network by using the syslog protocol.
- If enabled, event notifications can be forwarded by email by using Simple Mail Transfer Protocol (SMTP).
- Call Home can be enabled so that critical faults generate a problem management record (PMR) that is sent in an email to the appropriate support center.

Power-on self-test

When you turn on the system, the system board performs self-tests. During the initial tests, the hardware boot symbol is displayed.

All models perform a series of tests to check the operation of components and some of the options that are installed when the units are first turned on. This series of tests is called the power-on self-test (POST). The node status LED is off until booting finishes and the system software is loaded.

If a critical failure is detected during the POST, the software is not loaded and the system error LED on the operator information panel is illuminated. If this failure occurs, use MAP 5000: Start on page 265 to help isolate the cause of the failure.
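Returning to the raw statistics XML shown under "XML formatting information": the point that identically named attributes mean different things depending on the enclosing element can be illustrated with a short sketch. It assumes a trimmed sample with made-up values and no XML namespace, and the function name is hypothetical; a real statistics file is larger and may need namespace handling.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample modelled on the <vdsk> layout in the guide;
# the attribute values are invented for illustration.
SAMPLE = """
<vdsk idx="0" id="master_iogrp0_1" ctrh="123056" ctwh="538">
  <ca r="0" rh="7" ri="0" wi="0"/>
  <cpy idx="0">
    <ca r="0" rh="3" p="0" ph="0"/>
  </cpy>
</vdsk>
"""

def split_cache_stats(vdsk_xml):
    """Return (volume_stats, {copy_idx: copy_stats}) for one <vdsk> element."""
    vdsk = ET.fromstring(vdsk_xml)
    # A <ca> that is a direct child of <vdsk> describes the volume itself.
    volume_ca = vdsk.find("ca")
    volume_stats = dict(volume_ca.attrib) if volume_ca is not None else {}
    # A <ca> nested inside <cpy idx="n"> describes volume copy n.
    copy_stats = {}
    for cpy in vdsk.findall("cpy"):
        ca = cpy.find("ca")
        copy_stats[cpy.get("idx")] = dict(ca.attrib) if ca is not None else {}
    return volume_stats, copy_stats

volume, copies = split_cache_stats(SAMPLE)
print("volume rh:", volume["rh"], "copy 0 rh:", copies["0"]["rh"])
```

Keeping the two sets of attributes in separate dictionaries avoids mixing, for example, the volume-level rh value with the copy-level rh value.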

When the software is loaded, extra testing takes place, which ensures that all of the required hardware and software components are installed and functioning correctly.

Understanding events

When a significant change in status is detected, an event is logged in the event log.

Error data

Events are classified as either alerts or messages:
- An alert is logged when the event requires some action. Some alerts have an associated error code that defines the service action that is required. The service actions are automated through the fix procedures. If the alert does not have an error code, the alert represents an unexpected change in state. This situation must be investigated to see whether it is expected or represents a failure. Investigate an alert and resolve it as soon as it is reported.
- A message is logged when a change that is expected is reported, for instance, when an IBM FlashCopy operation completes.

Managing the event log

The event log has a limited size. After it is full, newer entries replace entries that are no longer required.

To avoid having a repeated event that fills the event log, some records in the event log refer to multiple occurrences of the same event. When event log entries are coalesced in this way, the time stamps of the first occurrence and the last occurrence of the problem are saved in the log entry. A count of the number of times that the error condition has occurred is also saved in the log entry. Other data refers to the last occurrence of the event.

Viewing the event log

You can view the event log by using the management GUI or the command-line interface (CLI).

About this task

You can view the event log by using the Monitoring > Events options in the management GUI. The event log contains many entries. You can, however, select only the type of information that you need.

You can also view the event log by using the command-line interface (lseventlog). See the Command-line interface topic for the command details.

Describing the fields in the event log

The event log includes fields with information that you can use to diagnose problems.

Table 63 on page 102 describes some of the fields that are available to assist you in diagnosing problems.

Table 63. Description of data fields for the event log

Data field | Description
Event ID | This number precisely identifies why the event was logged.
Description | A short description of the event.
Status | Indicates whether the event requires some attention. Alert: if a red icon with a cross is shown, follow the fix procedure or service action to resolve the event and turn the status green. Monitoring: the event is not yet of concern. Expired: the event no longer represents a concern. Message: provides useful information about system activity.
Error code | Indicates that the event represents an error in the system that can be fixed by following the fix procedure or service action that is identified by the error code. Not all events have an error code. Different events have the same error code if the same service action is required for each.
Sequence number | Identifies the event within the system.
Event count | The number of events that are coalesced into this event log record.
Object type | The object type to which the event relates.
Object ID | Uniquely identifies the object within the system to which the event relates.
Object name | The name of the object in the system to which the event relates.
Copy ID | If the object is a volume and the event refers to a specific copy of the volume, this field is the number of the copy to which the event relates.
Reporting node ID | Typically identifies the node responsible for the object to which the event relates. For events that relate to nodes, it identifies the node that logged the event, which can be different from the node that is identified by the object ID.
Reporting node name | Typically identifies the node that contains the object to which the event relates. For events that relate to nodes, it identifies the node that logged the event, which can be different from the node that is identified by the object name.
Fixed | Where an alert is shown for an error or warning condition, it indicates that the user marked the event as fixed, completed the fix procedure, or that the condition was resolved automatically. For a message event, this field can be used to acknowledge the message.
First time stamp | The time when this error event was reported. If events of a similar type are being coalesced together, so that one event log record represents more than one event, this field is the time the first error event was logged.
Last time stamp | The time when the last instance of this error event was recorded into this event log record.
Root sequence number | If set, it is the sequence number of an event that represents an error that probably caused this event to be reported. Resolve the root event first.
Sense data | Extra data that gives the details of the condition that caused the event to be logged.
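The coalescing behavior described under "Managing the event log", together with the Event count, First time stamp, and Last time stamp fields in Table 63, can be illustrated with a short sketch. The rule used here to decide that two events are the same (matching event ID and object ID) is an assumption made for illustration, as are the class and function names; the guide does not state the rule that the system applies.

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    event_id: str
    object_id: int
    first_timestamp: int   # time of the first occurrence
    last_timestamp: int    # time of the most recent occurrence
    event_count: int = 1   # number of occurrences coalesced into this record
    sense_data: str = ""   # other data refers to the last occurrence

def log_event(log, event_id, object_id, timestamp, sense_data=""):
    """Add an event to `log`, coalescing repeats of the same event.

    Assumption: two events are "the same" when event ID and object ID match.
    """
    key = (event_id, object_id)
    record = log.get(key)
    if record is None:
        record = LogRecord(event_id, object_id, timestamp, timestamp, 1, sense_data)
        log[key] = record
    else:
        record.last_timestamp = timestamp
        record.event_count += 1
        record.sense_data = sense_data
    return record

event_log = {}
log_event(event_log, "E1", 5, 100, "first")
log_event(event_log, "E1", 5, 200, "second")
rec = log_event(event_log, "E1", 5, 300, "third")
print(rec.event_count, rec.first_timestamp, rec.last_timestamp, rec.sense_data)
```

A repeated event therefore occupies one record instead of filling the log, while the first and last time stamps still bound the period over which it occurred.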

Event notifications

The system can use Simple Network Management Protocol (SNMP) traps, syslog messages, and call home emails to notify you and the support center when significant events are detected. Any combination of these notification methods can be used simultaneously. Notifications are normally sent immediately after an event is raised. However, there are some events that might occur because of active service actions. If a recommended service action is active, these events are notified only if they are still unfixed when the service action completes.

Each event that the system detects is assigned a notification type of Error, Warning, or Information. When you configure notifications, you specify where the notifications should be sent and which notification types are sent to that recipient.

Table 64 describes the levels of event notifications.

Table 64. Notification levels

Notification level | Description
Error | Error notification is sent to indicate a problem that must be corrected as soon as possible. This notification indicates a serious problem with the system. For example, the event that is being reported could indicate a loss of redundancy in the system, and it is possible that another failure could result in loss of access to data. The most typical reason that this type of notification is sent is because of a hardware failure, but some configuration errors or fabric errors also are included in this notification level. Error notifications can be configured to be sent as a call home message to your support center.
Warning | A warning notification is sent to indicate a problem or unexpected condition with the system. Always immediately investigate this type of notification to determine the effect that it might have on your operation, and make any necessary corrections. A warning notification does not require any replacement parts and therefore should not require involvement from your support center. The allocation of notification type Warning does not imply that the event is less serious than one that has notification level Error.
Information | An informational notification is sent to indicate that an expected event has occurred: for example, a FlashCopy operation has completed. No remedial action is required when these notifications are sent.

Events with notification type Error or Warning are shown as alerts in the event log. Events with notification type Information are shown as messages.

SNMP traps

Simple Network Management Protocol (SNMP) is a standard protocol for managing networks and exchanging messages. The system can send SNMP messages that notify personnel about an event. You can use an SNMP manager to view the SNMP messages that the system sends. You can use the management GUI or the command-line interface to configure and modify your SNMP settings. You can specify up to a maximum of six SNMP servers.

You can use the Management Information Base (MIB) file for SNMP to configure a network management program to receive SNMP messages that are sent by the system. This file can be used with SNMP messages from all versions of the software. More information about the MIB file for SNMP is available at this website:

Search for the name of your storage system, and then search for MIB file for SNMP. Go to the downloads results to find IBM Management Information Base (MIB) file for SNMP. Click this link to find download options.

Syslog messages

The syslog protocol is a standard protocol for forwarding log messages from a sender to a receiver on an IP network. The IP network can be either IPv4 or IPv6. The system can send syslog messages that notify personnel about an event. The system can transmit syslog messages in either expanded or concise format. You can use a syslog manager to view the syslog messages that the system sends. The system uses the User Datagram Protocol (UDP) to transmit the syslog message. You can specify up to a maximum of six syslog servers. You can use the management GUI or the command-line interface to configure and modify your syslog settings.

Table 65 shows how system notification codes map to syslog security-level codes.

Table 65. System notification types and corresponding syslog level codes

System notification type | Syslog level code | Description
ERROR | LOG_ALERT | Fault that might require hardware replacement that needs immediate attention.
WARNING | LOG_ERROR | Fault that needs immediate attention. Hardware replacement is not expected.
INFORMATIONAL | LOG_INFO | Information message used, for example, when a configuration change takes place or an operation completes.
TEST | LOG_DEBUG | Test message

Table 66 shows how system values of user-defined message origin identifiers map to syslog facility codes.

Table 66. System values of user-defined message origin identifiers and syslog facility codes

System value | Syslog value | Syslog facility code | Message format
0 | 16 | LOG_LOCAL0 | Full
1 | 17 | LOG_LOCAL1 | Full
2 | 18 | LOG_LOCAL2 | Full
3 | 19 | LOG_LOCAL3 | Full
4 | 20 | LOG_LOCAL4 | Concise
5 | 21 | LOG_LOCAL5 | Concise
6 | 22 | LOG_LOCAL6 | Concise
7 | 23 | LOG_LOCAL7 | Concise

Call home email

The call home function sends enhanced reports that include operational and event-related data and specific configuration information to the support center. When configured, this function alerts the support center about hardware failures and potentially serious configuration or environmental issues. The support center can use configuration information to automatically generate best practices or recommendations that are based on your actual configuration.

To send email, you must configure at least one Simple Mail Transfer Protocol (SMTP) server. You can specify as many as 5 additional SMTP servers for backup purposes. The SMTP server must accept the relaying of email from the management IP address. Set the reply address to a valid email address. Send a test email to check that all connections and infrastructure are set up correctly.

If you want only error and inventory information sent to the support center, you can choose to hide sensitive entries, such as object names, cloud accounts, network information, certificates, and host and user information, from the reports.

Data that is sent with notifications

Notifications can be sent by using email, SNMP, or syslog. The data that is sent for each type of notification is the same. It includes:
- Record type
- Machine type
- Machine serial number
- Error ID
- Error code
- Software version
- FRU part number
- Cluster (system) name
- Node ID
- Error sequence number
- Time stamp
- Object type
- Object ID
- Problem data

Emails contain the following additional information that allows the Support Center to contact you:
- Contact names for first and second contacts
- Contact phone numbers for first and second contacts
- Alternate contact numbers for first and second contacts
- Offshift phone number
- Contact email address
- Machine location
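Tables 65 and 66 can be combined into the priority value that the syslog protocol carries at the start of each message. The following sketch assumes the standard syslog encoding (priority = facility × 8 + severity) and the standard numeric severities, and it assumes that the LOG_ERROR level in Table 65 corresponds to the standard error severity (3). The function names are illustrative and are not part of the system's interface.

```python
# Standard syslog severity numbers (assumption: LOG_ERROR is the standard
# "error" severity, which many libraries call LOG_ERR).
SEVERITY = {"LOG_ALERT": 1, "LOG_ERROR": 3, "LOG_INFO": 6, "LOG_DEBUG": 7}

# Table 65: system notification type -> syslog level code.
LEVEL_FOR_TYPE = {
    "ERROR": "LOG_ALERT",
    "WARNING": "LOG_ERROR",
    "INFORMATIONAL": "LOG_INFO",
    "TEST": "LOG_DEBUG",
}

def syslog_facility(system_value):
    """Table 66: system values 0-7 map to syslog values 16-23 (LOG_LOCAL0-7)."""
    if not 0 <= system_value <= 7:
        raise ValueError("system value must be between 0 and 7")
    return 16 + system_value

def message_format(system_value):
    """Table 66: values 0-3 use the full format, values 4-7 the concise format."""
    syslog_facility(system_value)  # reuse the range check
    return "Full" if system_value <= 3 else "Concise"

def syslog_pri(notification_type, system_value):
    """Standard syslog priority: facility * 8 + severity."""
    severity = SEVERITY[LEVEL_FOR_TYPE[notification_type]]
    return syslog_facility(system_value) * 8 + severity

print(syslog_pri("ERROR", 0), message_format(0))
print(syslog_pri("INFORMATIONAL", 4), message_format(4))
```

A syslog receiver can apply the same arithmetic in reverse (integer division and remainder by 8) to filter messages by facility or severity.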

To send data and notifications to service personnel, use one of the following email addresses:
- For systems that are located in North America, Latin America, South America or the Caribbean Islands, use callhome0@de.ibm.com
- For systems that are located anywhere else in the world, use callhome1@de.ibm.com

Inventory information email

An inventory information email summarizes the hardware components and configuration of a system. Service personnel can use this information to contact you when relevant software updates are available or when an issue that can affect your configuration is discovered. It is a good practice to enable inventory reporting.

Because inventory information is sent using the call home email function, you must meet the call home function requirements and enable the call home email function before you can attempt to send inventory information email. You can adjust the contact information, adjust the frequency of inventory email, or manually send an inventory email using the management GUI or the command-line interface.

The call home function sends enhanced reports that include specific configuration information to the support center. The support center can use this information to automatically generate recommendations that are based on your actual configuration.

The inventory email includes the following information about the clustered system on which the call home function is enabled. Sensitive information such as IP addresses is not included.
- Licensing information
- Details about the following objects and functions:
  - Drives
  - External storage systems
  - Hosts
  - MDisks
  - Volumes
  - Array types and RAID levels
  - Easy Tier
  - FlashCopy
  - Metro Mirror and Global Mirror
  - HyperSwap

Example email

Figure 33 on page 107 shows an example of an email. For details about the specific information that is included in the call home inventory for your system, configure the system to send an inventory email to yourself. The title of the email includes the text Spectrum Virtualize Error Notification (cluster), where cluster is the name of your system.

# Timestamp = Fri Jul 10 11:47:
# Timezone = +0100, WEST
# Organization = IBM
# Machine Address = Hursley Park
# Machine City = Winchester
# Machine State = XX
# Machine Zip = SO212JN
# Machine Country = GB
# Contact Name = Mike Programmer
# Alternate Contact Name = N/A
# Contact Phone Number =
# Alternate Contact Phone Number = N/A
# Offshift Phone Number = N/A
# Alternate Offshift Phone Number = N/A
# Contact Email = mike.programmer@system.com
# Machine Location = D0517
# Machine Type = 0001S01
# Serial Number = 06HTPWW
# Machine Part Number =
# System Version = (build )
# Record Type = 6
# Frequency = 7
# Cluster Alias = 0x1c
# IBM Customer Number =
# IBM Component ID = SANVCNSW1
# IBM Country Code = 866
# Spectrum Virtualize Unique ID = 1c
# Cluster_VPD:
id: c
name:milkshake
location:local
partnership:
bandwidth:
total_mdisk_capacity:0
space_in_mdisk_grps:0
space_allocated_to_vdisks:0.00mb
total_free_space:0
statistics_status:on
statistics_frequency:15
required_memory:65536
cluster_locale:en_us
time_zone:13 Africa/Casablanca
code_level: (build )
FC_port_speed:2Gb
id_alias: c
gm_link_tolerance:300
gm_inter_cluster_delay_simulation:0
gm_intra_cluster_delay_simulation:0
email_reply:mike.programmer@system.com
email_contact:mike Programmer
email_contact_primary:
email_contact_alternate:
email_contact_location:d0517
email_state:running
inventory_mail_interval:7
total_vdiskcopy_capacity:0.00mb
total_used_capacity:0.00mb
total_overallocation:100
total_vdisk_capacity:0.00mb
iscsi_auth_method:none
auth_service_configured:no
auth_service_enabled:no
auth_service_pwd_set:no
auth_service_cert_set:no
relationship_bandwidth_limit:25
gm_max_host_delay:5
tier:ssd
tier_capacity:0.00mb
tier_free_capacity:0.00mb
tier:enterprise
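The header of the example email above is a series of "# Name = value" entries. The following sketch, which assumes one entry per line and uses a hypothetical function name, collects those entries into a dictionary so that individual fields such as Record Type can be looked up. The sample text is a shortened copy of the example, not output captured from a system.

```python
def parse_inventory_header(text):
    """Collect '# Name = value' header lines into a dictionary.

    Assumption: each header field is on its own line. Lines that do not
    start with '#' or that contain no '=' are ignored.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("#"):
            continue
        # Split at the first '=' only, so values may themselves contain '='.
        name, sep, value = line[1:].partition("=")
        if not sep:
            continue
        fields[name.strip()] = value.strip()
    return fields

SAMPLE_HEADER = """\
# Timezone = +0100, WEST
# Organization = IBM
# Machine City = Winchester
# Record Type = 6
# Frequency = 7
# Cluster_VPD:
name:milkshake
"""

header = parse_inventory_header(SAMPLE_HEADER)
print(header["Organization"], header["Record Type"], header["Frequency"])
```

Values are kept as strings; a caller that needs the Frequency value as a number would convert it explicitly.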

Example email

Figure 34 shows an example of the header information that is included in an email. For details about the specific information that is included in the call home inventory for your system, configure your system to send an inventory email to yourself.

# Timestamp = Mon Jul 10 19:35:
# Timezone = +0000, GMT
# Organization = ibm
# Machine Address = hursley
# Machine City = winchester
# Machine State = XX
# Machine Zip = MEH
# Machine Country = GB
# Contact Name = LONG ATL CLUSTER
# Alternate Contact Name = N/A
# Contact Phone Number =
# Alternate Contact Phone Number = N/A
# Offshift Phone Number = N/A
# Alternate Offshift Phone Number = N/A
# Contact Email = LONG-ATL-CLUSTER@meh.meh.meh
# Machine Location = THIS IS A TEST SYSTEM
# Machine Type = C
# Serial Number =
# Machine Part Number =
# System Version = (build )
# Record Type = 6
# Frequency = 7
..(Additional system information follows).

Figure 34. Example of inventory information email

Understanding the error codes

Error codes are generated by the event-log analysis and system configuration code. Error codes help you to identify the cause of a problem, a failing component, and the service actions that might be needed to solve the problem.

Note: If more than one error occurs during an operation, the highest priority error code is displayed on the front panel. The lower the number for the error code, the higher the priority. For example, error code 1020 has a higher priority than any error code with a larger number.

Using the error code tables

The error code tables list the various error codes and describe the actions that you can take.

About this task

Complete the following steps to use the error code tables:

Procedure

1. Locate the error code in one of the tables. If you cannot find a particular code in any table, call IBM Support Center for assistance.

2. Read about the action you must complete to correct the problem. Do not exchange field replaceable units (FRUs) unless you are instructed to do so.
3. Normally, exchange only one FRU at a time, starting from the top of the FRU list for that error code.

Event IDs

The system software generates events, such as informational events and error events. An event ID or number is associated with the event and indicates the reason for the event.

Informational events provide information about the status of an operation. Informational events are recorded in the event log, and, depending on the configuration, informational event notifications can be sent through email, SNMP, or syslog.

Error events are generated when a service action is required. An error event maps to an alert with an associated error code. Depending on the configuration, error event notifications can be sent through email, SNMP, or syslog.

Informational events

The informational events provide information about the status of an operation. Informational events are recorded in the event log and, based on notification type, can generate notifications through email, SNMP, or syslog.

Informational events are distinguished from error events, which are associated with error codes and might require service procedures. For a list of error events, see Error event IDs and error codes on page 117.

Informational events can be either notification type I (information) or notification type W (warning). An informational event report of type (W) might require user attention. Table 67 provides a list of informational events, the notification type, and the reason for the event.

Table 67. Informational events
Event ID  Notification type  Description
I  Type conversion completed and the original copy has been deleted
I  Battery protection unavailable
I  Battery protection temporarily unavailable; one battery is expected to be available soon
I  Battery protection temporarily unavailable; both batteries are expected to be available soon
I  Battery capacity is reduced because of cell imbalance
I  The error log is cleared
I  The SSH key was discarded for the service login user
I  User name has changed
I  Degraded or offline managed disk is now online
I  A degraded or offline storage pool is now online
I  Offline volume is now online.

Table 67. Informational events (continued)
Event ID  Notification type  Description
W  Volume is offline because of degraded or offline storage pool
I  All nodes can see the port
I  A node has been successfully added to the cluster (system)
I  The node is now a functional member of the cluster (system)
I  A noncritical hardware error occurred
I  Attempt to automatically recover offline node starting
I  Both nodes in the I/O group are available
I  One node in the I/O group is unavailable
W  Both nodes in the I/O group are unavailable
I  Cluster (system) recovery completed
W  Failed to obtain directory listing from remote node
W  Failed to transfer file from remote node
I  The migration is complete
I  The secure delete is complete
W  The virtualization amount is close to the limit that is licensed
W  The FlashCopy feature is close to the limit that is licensed
W  The Metro Mirror or Global Mirror feature is close to the limit that is licensed
I  Fibre Channel discovery occurred; configuration changes are pending
I  Fibre Channel discovery occurred; configuration changes are complete
I  Fibre Channel discovery occurred; no configuration changes were detected
W  The managed disk is not on the preferred path
W  The initialization for the managed disk failed
W  The LUN discovery has failed. The cluster (system) has a connection to a device through this node but this node cannot discover the unmanaged or managed disk that is associated with this LUN
W  The LUN capacity equals or exceeds the maximum. Only part of the disk can be accessed
W  The managed disk error count warning threshold has been met
I  Managed disk offline imminent, offline prevention started
I  Drive firmware download completed successfully
I  Drive FPGA download completed successfully

Table 67. Informational events (continued)
Event ID  Notification type  Description
I  Drive firmware download started
I  Drive FPGA download started
I  Drive firmware download cancelled by user
I  SAS discovery occurred; no configuration changes were detected
I  SAS discovery occurred; configuration changes are pending
I  SAS discovery occurred; configuration changes are complete
W  The LUN capacity equals or exceeds the maximum capacity. Only the first 1 PB of disk will be accessed
I  The drive format has started
I  The drive recovery was started
I  iSCSI discovery occurred, configuration changes pending
I  iSCSI discovery occurred, configuration changes complete
I  iSCSI discovery occurred, no configuration changes were detected
W  Insufficient virtual extents
W  The migration suspended because of insufficient virtual extents or too many media errors on the source managed disk
W  Migration has stopped
I  Migration is complete
W  Copied disk I/O medium error
I  The FlashCopy operation is prepared
I  The FlashCopy operation is complete
W  The FlashCopy operation has stopped
W  First customer data being pinned in a volume working set
I  All customer data in a volume working set is now unpinned
W  The volume working set cache mode is in the process of changing to synchronous destage because the volume working set has too much pinned data
I  Volume working set cache mode updated to allow asynchronous destage because enough customer data has been unpinned for the volume working set
I  The debug from an IERR was extracted to disk
I  An attempt was made to power on the slots
I  All the expanders on the strand were reset
I  The component firmware update paused to allow the battery charging to finish.

Table 67. Informational events (continued)
Event ID  Notification type  Description
I  The update for the component firmware paused because the system was put into maintenance mode
I  A component firmware update is needed but is prevented from running
I  The Metro Mirror or Global Mirror background copy is complete
I  The Metro Mirror or Global Mirror is ready to restart
W  Unable to find path to disk in the remote cluster (system) within the timeout period
W  The thin-provisioned volume copy data in a node is pinned
I  All thin-provisioned volume copy data in a node is unpinned
I  The thin-provisioned volume copy import has failed and the new volume is offline; either update the system software to the required version or delete the volume
I  The thin-provisioned volume copy import is successful
W  A thin-provisioned volume copy space warning has occurred
I  A thin-provisioned volume copy repair has started
I  A thin-provisioned volume copy repair is successful
I  A thin-provisioned volume copy validation is started
I  A thin-provisioned volume copy validation is successful
I  The import of the compressed-virtual volume copy was successful
W  A compressed-virtual volume copy space warning has occurred
I  A compressed-virtual volume copy repair has started
I  A compressed-virtual volume copy repair is successful
I  A compressed-virtual volume copy has too many bad blocks
I  A medium error has been repaired for the mirrored copy
W  A mirror copy repair, using the validate option, cannot complete
I  A mirror disk repair is complete and no differences are found
I  A mirror disk repair is complete and the differences are resolved
W  A mirror disk repair is complete and the differences are marked as medium errors
I  The mirror disk repair has been started.

Table 67. Informational events (continued)
Event ID  Notification type  Description
W  A mirror copy repair, using the set medium error option, cannot complete
W  A mirror copy repair, using the resync option, cannot complete
W  Node coldstarted
W  A node power-off has been requested from the power switch
I  Additional Fibre Channel ports were connected
I  Additional Ethernet ports connected
I  Additional Fibre Channel I/O ports connected
W  The connection to a configured remote cluster (system) has been lost
W  The node unexpectedly lost power but has now been restored to the cluster (system)
I  The rebuild for an array MDisk was started. Performance may be affected, wait for rebuild to complete
I  The rebuild for an array MDisk has finished
I  Array validation started
I  Array validation complete
W  An overnight maintenance procedure has failed to complete. Resolve any hardware and configuration problems that you are experiencing on the cluster (system). If the problem persists, contact your support representative for assistance
W  An array MDisk is offline because it has too many missing members
I  A RAID array has started exchanging an array member
I  A RAID array has completed exchanging an array member
I  A RAID array needs resynchronization
I  A failed drive has been re-seated or replaced. The system has automatically configured the device
I  Distributed array MDisk rebuild started
I  Distributed array MDisk rebuild completed
I  Distributed array MDisk copyback started
I  Distributed array MDisk copyback completed
I  Distributed array MDisk initialization started
I  Distributed array MDisk initialization completed
I  Distributed array MDisk needs resynchronization
W  A storage pool space warning has occurred.
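Because every informational event in Table 67 carries a notification type of I (information) or W (warning), a simple filter can surface the entries that might require user attention. The sketch below is a minimal Python illustration. It assumes that the events were already exported as (notification type, description) pairs; the function name and the pair layout are hypothetical and do not reflect a product export format.

```python
# Notification types for informational events, as described for Table 67.
NOTIFICATION_TYPES = {"I": "information", "W": "warning"}


def events_needing_attention(events):
    """Return the descriptions of type W events (hypothetical helper)."""
    flagged = []
    for notification_type, description in events:
        if notification_type not in NOTIFICATION_TYPES:
            raise ValueError(f"unknown notification type: {notification_type!r}")
        if notification_type == "W":
            flagged.append(description)
    return flagged


# Descriptions copied from Table 67; event IDs are omitted on purpose.
sample_events = [
    ("I", "The error log is cleared"),
    ("W", "Migration has stopped"),
    ("I", "Migration is complete"),
    ("W", "A storage pool space warning has occurred"),
]

for description in events_needing_attention(sample_events):
    print("Review:", description)
```

Rejecting unknown type letters, rather than ignoring them, keeps error events (types E and W with an error code) from being mixed silently into an informational report.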

SCSI event reporting

Nodes can notify their hosts of events for SCSI commands that are issued.

SCSI status

Some events are part of the SCSI architecture and are handled by the host application or device drivers without reporting an event. Some events, such as read and write I/O events and events that are associated with the loss of nodes or loss of access to backend devices, cause application I/O to fail. To help troubleshoot these events, SCSI commands are returned with the Check Condition status and a 32-bit event identifier is included with the sense information. The identifier relates to a specific event in the event log.

If the host application or device driver captures and stores this information, you can relate the application failure to the event log.

Table 68 describes the SCSI status and codes that are returned by the nodes.

Table 68. SCSI status
Status  Code  Description
Good  00h  The command was successful.
Check condition  02h  The command failed and sense data is available.
Condition met  04h  N/A
Busy  08h  An Auto-Contingent Allegiance condition exists and the command specified NACA=0.
Intermediate  10h  N/A
Intermediate - condition met  14h  N/A
Reservation conflict  18h  Returned as specified in SPC2 and SAM-2 where a reserve or persistent reserve condition exists.
Task set full  28h  The initiator has at least one task queued for that LUN on this port.
ACA active  30h  This code is reported as specified in SAM-2.
Task aborted  40h  This code is returned if TAS is set in the control mode page 0Ch. The node has a default setting of TAS=0, which cannot be changed; therefore, the node does not report this status.

SCSI sense

Nodes notify the hosts of events on SCSI commands. Table 69 defines the SCSI sense keys, codes, and qualifiers that are returned by the nodes.

Table 69. SCSI sense keys, codes, and qualifiers
Key  Code  Qualifier  Definition  Description
2h  04h  01h  Not Ready. The logical unit is in the process of becoming ready.  The node lost sight of the system and cannot perform I/O operations. The additional sense does not have additional information.

Table 69. SCSI sense keys, codes, and qualifiers (continued)
Key  Code  Qualifier  Definition  Description
2h  04h  0Ch  Not Ready. The target port is in the state of unavailable.  The following conditions are possible:
- The node lost sight of the system and cannot perform I/O operations. The additional sense does not have additional information.
- The node is in contact with the system but cannot perform I/O operations to the specified logical unit because of either a loss of connectivity to the backend controller or some algorithmic problem. This sense is returned for offline volumes.
3h  00h  00h  Medium event  This is only returned for read or write I/Os. The I/O suffered an event at a specific LBA within its scope. The location of the event is reported within the sense data. The additional sense also includes a reason code that relates the event to the corresponding event log entry. For example, a RAID controller event or a migrated medium event.
4h  08h  00h  Hardware event. A command to logical unit communication failure has occurred.  The I/O suffered an event that is associated with an I/O event that is returned by a RAID controller. The additional sense includes a reason code that points to the sense data that is returned by the controller. This is only returned for I/O type commands. This event is also returned from FlashCopy target volumes in the prepared and preparing state.
5h  25h  00h  Illegal request. The logical unit is not supported.  The logical unit does not exist or is not mapped to the sender of the command.

Reason codes

The reason code appears in bytes of the sense data. The reason code provides the node with a specific log entry. The field is a 32-bit unsigned number that is presented with the most significant byte first. Table 70 lists the reason codes and their definitions.

If the reason code is not listed in Table 70, the code refers to a specific event in the event log that corresponds to the sequence number of the relevant event log entry.

Table 70. Reason codes
Reason code (decimal)  Description
40  The resource is part of a stopped FlashCopy mapping.
50  The resource is part of a Metro Mirror or Global Mirror relationship and the secondary LUN is offline.
51  The resource is part of a Metro Mirror or Global Mirror and the secondary LUN is read only.

Table 70. Reason codes (continued)
Reason code (decimal)  Description
60  The node is offline.
71  The resource is not bound to any domain.
72  The resource is bound to a domain that was recreated.
73  Running on a node that is contracted out for some reason that is not attributable to any path that is going offline.
80  Wait for the repair to complete, or delete the volume.
81  Wait for the validation to complete, or delete the volume.
82  An offline thin-provisioned volume that caused data to be pinned in the directory cache. Adequate performance cannot be achieved for other thin-provisioned volumes, so they are taken offline.
85  The volume that is taken offline because checkpointing to the quorum disk failed.
86  The repairvdiskcopy -medium command that created a virtual medium error where the copies differed.
93  An offline RAID-5 or RAID-6 array that caused in-flight-write data to be pinned. Good performance cannot be achieved for other arrays and so they are taken offline.
94  An array MDisk that is part of the volume that is taken offline because checkpointing to the quorum disk failed.
95  This reason code is used in MDisk bad block dump files to indicate that the data loss was caused by having to resync parity with rebuilding strips or some other RAID algorithm reason due to multiple failures.
96  A RAID-6 array MDisk that is part of the volume that is taken offline because an internal metadata table is full.

Object types

You can use the object code to determine the type of the object the event is logged against. Table 71 lists the object codes and corresponding object types.

Table 71. Object types
Object code  Object type
1  mdisk
2  mdiskgrp
3  volume
4  node
5  host
7  iogroup
8  fcgrp
9  rcgrp
10  fcmap
11  rcmap
12  wwpn
13  cluster (system)
16  device

Table 71. Object types (continued)
Object code  Object type
17  SCSI lun
18  quorum
34  Fibre Channel adapter
38  volume copy
39  Syslog server
40  SNMP server
41  Email server
42  User group
44  Cluster (management) IP
46  SAS adapter

Error event IDs and error codes

Error codes describe a service procedure that must be followed. Each event ID that requires service has an associated error code.

Note: Service procedures that involve field-replaceable units (FRUs) do not apply to software-based products, such as IBM Spectrum Virtualize. For information about possible user actions that relate to FRU replacements, refer to your hardware manufacturer's documentation.

Error codes can be either notification type E (error) or notification type W (warning). Table 72 lists the event IDs that have corresponding error codes, and shows the error code, the notification type, and the condition for each event. For a list of informational events, which do not have associated error codes, see Informational events on page 109.

The 07nnnn event ID range refers to node errors that were logged by the system. The last 3 digits represent the error that was reported by the node. You can find these codes in the list of error codes at the end of this topic.

Table 72. Error event IDs and error codes
Event ID  Notification type  Condition  Error code
E  A system recovery has run. All configuration commands are blocked.
E  The error event log is full
W  The following causes are possible:
- The node is missing.
- The node is no longer a functional member of the system.
E  A node has been missing for 30 minutes
W  Node has been shut down
W  The software install process has failed
W  Software install package cannot be delivered to all nodes

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
Software install process stalled due to lack of redundancy
Software downgrade process stalled due to lack of redundancy
W  Unable to connect to the SMTP (email) server
W  Unable to send mail through the SMTP (email) server.
W  Remote Copy feature capacity is not set
W  The FlashCopy feature capacity is not set
W  The Virtualization feature has exceeded the amount that is licensed
W  The FlashCopy feature has exceeded the amount that is licensed
W  Remote Copy feature license limit exceeded
W  Thin-provisioned volume usage not licensed
W  The value set for the virtualization feature capacity is not valid
E  A physical disk FlashCopy feature license is required
E  A physical disk Metro Mirror and Global Mirror feature license is required
E  A virtualization feature license is required
E  Automatic recovery of offline node failed
W  Unable to send email to any of the configured email servers
W  The external virtualization feature license limit was exceeded
W  Unable to connect to LDAP server
W  The LDAP configuration is not valid
E  The limit for the compression feature license was exceeded
E  The limit for the compression feature license was exceeded
E  Unable to connect to LDAP server that has been automatically configured
E  Invalid LDAP configuration for automatically configured server
W  A licensable feature's trial-timer has reached 0. The feature has now been deactivated
W  A trial of a licensable feature will expire in 5 days
W  A trial of a licensable feature will expire in 10 days

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
W  A trial of a licensable feature will expire in 15 days
W  A trial of a licensable feature will expire in 45 days.
W  Easy Tier feature license limit exceeded
W  FlashCopy feature license limit exceeded
W  External virtualization feature license limit exceeded
W  Remote copy feature license limit exceeded
W  System update completion is required
W  System update completion has stalled
W  Encryption feature license limit exceeded
W  The quorum application is out of date and needs to be redeployed
W  System SSL certificate will expire within the next 30 days
W  System SSL certificate has expired
W  No active quorum device found on this cluster
E  The node ran out of base event sources. As a result, the node has stopped and exited the system
W  The number of device logins has reduced
W  Device excluded due to excessive errors on all Managed Disks
E  Access beyond end of disk, or Managed Disk missing
E  The block size is invalid, the capacity or LUN identity has changed during the managed disk initialization
E  The managed disk is excluded because of excessive errors
E  The remote port is excluded for a managed disk and node
E  The local port is excluded
E  The login is excluded
E  Timeout due to non-responsive device
E  Timeout due to lost command
E  A timeout has occurred as a result of excessive processing time
E  An error recovery procedure has occurred
E  A managed disk is reporting excessive errors

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
E  The managed disk error count threshold has exceeded
W  There are too many devices presented to the system
W  There are too many managed disks presented to the system
W  There are too many LUNs presented to a node
W  There are too many drives presented to a system.
W  A disk I/O medium error has occurred
W  A suitable MDisk or drive for use as a quorum disk was not found
W  The quorum disk is not available
W  A controller configuration is not supported
E  A login transport fault has occurred
E  A managed disk error recovery procedure (ERP) has occurred. The node or controller reported the following:
- Sense
- Key
- Code
- Qualifier
E  One or more MDisks on a controller are degraded
W  The controller configuration limits failover
E  The controller configuration uses the RDAC mode; this is not supported
W  Persistent unsupported controller configuration
W  Controller has quorum disabled, but quorum disk is configured
E  The controller system device is only connected to the node through a single initiator port
E  The controller system device is only connected to the node through a single target port
E  The controller system device is only connected to the nodes through a single target port
E  The controller system device is only connected to the nodes through half of the expected target ports
E  The controller system device has disconnected all target ports to the nodes

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
W  Number of Device paths from the controller site allowed accessible nodes has reduced
A Solid state drive is missing from the configuration
W  An unrecognized SAS device
E  SAS error counts exceeded the warning thresholds
E  SAS errors exceeded critical thresholds
W  Controller indicates that it does not support descriptor sense for LUNs that are greater than 2 TBs
W  Too many enclosures were presented to a system
W  Too many controller target ports were presented to the system
W  Too many target ports were presented to the system from a single controller
W  There are too many drives presented to a system
W  Incorrect connection detected to a port
E  Too many long IOs to a drive
E  A drive is reported as continuously slow with contributory factors
E  Too many long IOs to a drive (Mercury drives)
E  A drive is reported as continuously slow with contributory factors (Mercury drives)
W  Storage system connected to unsupported port
E  Drive reporting too many t10dif errors
W  Encrypting MDisk is no longer encrypted
W  Drive firmware download canceled because of system changes
W  Drive firmware download canceled because of a drive download problem
W  A disk controller is not accessible from a node allowed to access the device by site policy
W  Too many drives attached to the system
W  Drive data integrity error
W  A member drive has been forced to turn off protection information support
E  Drive exchange required
W  Performance of external MDisk has changed
W  iSCSI session excluded

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
W  A Flash drive is expected to fail within six months due to limited write endurance
W  A Flash drive with high write endurance usage rate
E  There are too many medium errors on the MDisk.
E  A storage pool is offline
W  There are insufficient virtual extents
E  Storage optimization services disabled
E  The MDisk has bad blocks
W  The system failed to create a bad block because MDisk already has the maximum number of allowed bad blocks
W  The system failed to create a bad block because the system already has the maximum number of allowed bad blocks
W  FlashCopy prepare failed due to cache flush failure
W  FlashCopy has been stopped due to the error indicated in the data
W  Unrecovered FlashCopy mappings
W  SAS cable is not working at full capacity
E  An attempt to automatically configure a reseated or replaced drive has failed
W  Drives are single ported due to a spare node
E  Enclosure secondary expander module has failed  1267
E  Enclosure secondary expander module FRU identity is not valid
E  Enclosure secondary expander module temperature sensor cannot be read
E  Enclosure secondary expander module temperature has passed warning threshold
E  Enclosure secondary expander module temperature has passed critical threshold
E  Enclosure display panel is not installed  1268
E  Enclosure display panel temperature sensor cannot be read  1268
E  Enclosure display panel temperature has passed warning threshold
E  Enclosure display panel temperature has passed critical threshold
E  Enclosure secondary expander module connector excluded due to too many change events

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
E  Enclosure display panel VPD cannot be read  1268
E  Enclosure secondary expander module is missing
E  Enclosure secondary expander module connector excluded due to dropped frames
E  Enclosure secondary expander module connector is excluded and cannot be unexcluded
E  Enclosure secondary expander module connectors excluded as the cause of single ported drives
E  Enclosure secondary expander module leaf expander connector excluded as the cause of single ported drives
W  The Metro Mirror or Global Mirror relationship cannot be recovered
W  A Metro Mirror or Global Mirror relationship or consistency group exists within a system, but its partnership has been deleted
W  A Global Mirror relationship has stopped because of a persistent I/O error
W  A remote copy has stopped because of a persistent I/O error
W  Remote Copy relationship or consistency groups lost synchronization
W  There are too many system partnerships. The number of partnerships has been reduced
W  There are too many system partnerships. The system has been excluded
W  Background copy process for the remote copy was blocked
W  Partner cluster IP address unreachable
W  Cannot authenticate with partner cluster
W  Unexpected cluster ID for partner cluster
E  The Global Mirror secondary volume is offline. The relationship has pinned hardened write data for this volume
E  The Global Mirror secondary volume is offline due to missing I/O group partner node. The relationship has pinned hardened write data for this volume but the node containing the required data is currently offline

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
E  Global Mirror performance is likely to be impacted. A large amount of pinned data for the offline volumes has reduced the resource available to the global mirror secondary disks
W  HyperSwap volume has lost synchronization between sites
W  HyperSwap consistency group has lost synchronization between sites.
E  Compression has stopped unexpectedly
W  A thin-provisioned volume copy is offline because of insufficient space
E  A thin-provisioned volume copy is offline because of corrupt metadata
E  A thin-provisioned volume copy is offline because of failed repair
W  A compressed volume copy is offline because of insufficient space
E  A compressed volume copy is offline because of corrupt metadata
E  A compressed volume copy is offline because of failed repair
E  A compressed volume copy has bad blocks
W  System is unable to mirror medium error
E  Mirrored volume is offline because it cannot synchronize data
W  Repair of mirrored volume stopped because of difference
W  A host port has more than four logins to a node
E  Unrecognized node error
E  Detected memory size does not match the expected memory size
E  DIMMs are incorrectly installed
E  The WWNN that is stored on the service controller and the WWNN that is stored on the drive do not match
E  Unable to detect any Fibre Channel adapter
E  The system board processor has failed
E  The internal disk file system of the node is damaged
E  Unable to update BIOS settings
E  Unable to update the service processor firmware for the system board
E  The ambient temperature is too high while the system is starting

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
E  System board fault
E  A system board device breached critical temperature threshold
E  A PCI Riser breached critical temperature threshold.
E  Multiple hardware failures
E  A processor has failed
E  No usable persistent data could be found on the boot drives
E  The boot drives do not belong in this node
E  Boot drive and system board mismatch
E  Pluggable TPM is missing or broken
W  Cannot form system due to lack of resources
W  Cannot form cluster due to lack of cluster resources, overridequorum possible
E  Duplicate WWNN detected on the SAN
E  A node is unable to communicate with other nodes
E  Battery cabling fault
E  Battery backplane or cabling fault
E  The node hardware does not meet minimum requirements
E  Too many software failures
E  The internal drive of the node is failing
E  CPU temperature breached critical threshold
E  Battery protection temporarily unavailable; both batteries are expected to be available soon
E  Node software inconsistent
E  The node software is damaged
E  The system data cannot be read
E  The system data was not saved when power was lost
E  Battery subsystem has insufficient charge to save system data
E  Unable to read the service controller ID
E  UPS battery fault
E  UPS battery fault
E  UPS electronics fault
E  UPS output load high
E  UPS electronics fault  1171

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
E  UPS AC input power fault
E  Incorrect type of uninterruptible power supply detected.
E  UPS configuration error
E  UPS ambient temperature threshold exceeded
E  UPS fault
W  Insufficient uninterruptible power supply charge to allow node to start
W  Node held in service state
W  Fibre Channel adapter missing
E  Fibre Channel adapter failed
E  Fibre Channel adapter PCI error
E  Fibre Channel adapter degraded
W  Fewer Fibre Channel ports operational
W  Fewer Fibre Channel I/O ports operational
W  Fibre Channel clustered system path failure
W  A high speed SAS adapter is missing
E  SAS adapter failed
E  SAS adapter PCI error
E  SAS adapter degraded
W  Fewer SAS ports operational
W  SAS ports degraded
W  SASA port has unsupported SAS device
W  Ethernet adapter missing
E  Ethernet adapter failed
E  Ethernet adapter PCI error
E  Ethernet adapter degraded
W  Fewer Ethernet ports
Bus adapter missing
Bus adapter failed
Bus adapter PCI error
Bus adapter degraded
Fewer bus ports operational
E  A system board device breached warning temperature threshold
E  A power supply breached temperature threshold
E  A PCI Riser breached warning temperature threshold
E  Boot drive missing or out of sync or failed

Table 72. Error event IDs and error codes (continued)
Event ID  Notification type  Condition  Error code
W  A boot drive is in the wrong location
W  Boot drive in unsupported slot
W  Technician port connection is not valid
W  Technician connected
E  Voltage fault
E  Voltage high
E  Voltage low
E  Fan error
E  CMOS battery has failed
W  Ambient temperature warning
W  CPU temperature warning
W  Shutdown temperature reached
E  Power supply has a problem
W  Power supply mains cable is unplugged
E  Power supply is missing
W  Battery is missing
E  Battery has failed
W  Battery is below the minimum operating temperature
W  Battery is above the maximum operating temperature.
E  Battery has a communications error
W  Battery is nearing end of life
E  Battery VPD has a checksum error
E  Battery is at a hardware revision level not supported by the current code level
W  Encryption key required
W  Encryption key invalid
W  Encryption key not found
W  USB device (such as a hub) unsupported
W  Encryption key required
W  Detected hardware is not a valid configuration
W  Detected hardware needs activation
W  Fibre Channel I/O port mapping failed
W  Fibre-channel network fabric is too big
W  Incorrect enclosure
E  Incorrect slot
E  No enclosure id and cannot get status from partner
E  Incorrect enclosure type  1192

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  No enclosure id & partner matches
E  No enclosure id and partner has cluster data that does not match
E  No enclosure id and no cluster state on partner
E  No enclosure id and no cluster state
W  Cluster id different between enclosure and node
E  Cannot read enclosure identity
E  The detected memory size does not match the expected memory size
E  The system board processor has failed
E  Internal disk file system is damaged
E  Unable to update BIOS settings
E  Unable to update system board service processor firmware
W  Ambient temperature too high while system starting
E  Canister internal PCIe switch failed
E  Multiple hardware failures
E  Pluggable TPM is missing or broken
W  Cannot form cluster due to lack of cluster resources
W  Cannot form cluster due to lack of cluster resources, overridequorum possible
W  Duplicate WWNN detected on SAN
E  The node's hardware configuration does not meet minimum requirements
W  Too many software failures
E  The node's internal drive is failing
E  CPU over temp
E  Node software inconsistent
E  Node software is damaged
E  Cluster state and configuration data cannot be read
E  State data was not saved on power loss
W  The available battery charge is not enough to allow the node canister to start. Two batteries are charging
W  The available battery charge is not enough to allow the node canister to start. One battery is charging

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  The available battery charge is not enough to allow the node canister to start. No batteries are charging
W  Node held in service state
W  Fibre Channel adapter missing
E  Fibre Channel adapter failed
E  Fibre Channel adapter PCI error
E  Fibre Channel adapter degraded
W  Fewer Fibre Channel ports operational
W  Fewer Fibre Channel I/O ports operational
W  Fibre Channel clustered system path failure
W  SAS adapter missing
E  SAS adapter failed
E  SAS adapter PCI error
E  SAS adapter degraded
W  Fewer SAS ports operational
W  SAS ports degraded
W  SAS port has unsupported SAS device
W  Ethernet adapter missing
E  Ethernet adapter failed
E  Ethernet adapter PCI error
E  Ethernet adapter degraded
W  Fewer Ethernet ports
W  Bus adapter missing
E  Bus adapter failed
E  Bus adapter PCI error
E  Bus adapter degraded
W  Fewer bus ports operational
W  Technician connected
E  CMOS error
W  Ambient temperature warning
W  CPU temperature warning
W  Battery cold
W  Battery hot
E  Battery VPD checksum
W  Encryption key required
W  Encryption key invalid
W  Encryption key not found
W  USB device (such as a hub) unsupported
W  Encryption key required  1328

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
W  Canister battery is nearing end of life
E  CMOS battery has a failure
E  CMOS battery has a failure
E  CMOS battery has a failure
   System board has more or fewer processors detected
   System board has more or fewer processors detected
   System board has more or fewer processors detected
W  Incorrect enclosure
E  Incorrect slot
E  No enclosure id and cannot get status from partner
E  Incorrect enclosure type
E  No enclosure id & partner matches
E  No enclosure id and partner has cluster data that does not match
E  No enclosure id and no cluster state on partner
E  No enclosure id and no cluster state
W  Cluster id different between enclosure and node
E  Cannot read enclosure identity
E  The detected memory size does not match the expected memory size
E  The system board processor has failed
E  Internal disk file system is damaged
E  Unable to update system board service processor firmware
E  Canister internal PCIe switch failed
E  Multiple hardware failures
W  Cannot form cluster due to lack of cluster resources
W  Cannot form cluster due to lack of cluster resources, overridequorum possible
E  Duplicate WWNN detected on SAN
E  The node's hardware configuration does not meet minimum requirements
E  Too many software failures
E  The node's internal drive is failing
E  CPU over temp
E  Node software inconsistent

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  Node software is damaged
E  Cluster state and configuration data cannot be read
E  State data was not saved on power loss
W  The canister battery is not supported
W  The canister battery is missing
E  The canister battery has failed
E  Canister battery communications error
W  Canister battery has insufficient charge to support firehose dump
W  Node held in service state
W  Fibre Channel adapter missing
E  Fibre Channel adapter failed
E  Fibre Channel adapter PCI error
E  Fibre Channel adapter degraded
W  Fewer Fibre Channel ports operational
W  Fewer Fibre Channel I/O ports operational
W  Fibre Channel clustered system path failure
W  SAS adapter missing
E  SAS adapter failed
E  SAS adapter PCI error
E  SAS adapter degraded
W  Fewer SAS ports operational
W  SAS ports degraded
W  SAS port has unsupported SAS device
W  Ethernet adapter missing
E  Ethernet adapter failed
E  Ethernet adapter PCI error
E  Ethernet adapter degraded
W  Fewer Ethernet ports
W  Bus adapter missing
E  Bus adapter failed
E  Bus adapter PCI error
E  Bus adapter degraded
W  Fewer bus ports operational
E  CMOS error
W  A hardware change was made that is not supported by software. User action is required to repair the hardware or update the software

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
W  A supported hardware change was made to this node. User action is required to activate the new hardware
W  Canister battery is nearing end of life
W  Fibre-channel network fabric is too big
W  The Fibre Channel ports are not operational
E  Fibre Channel adapter detected a PCI bus error
E  System path has a failure
W  The SAN is not correctly zoned. As a result, more than 512 ports on the SAN have logged into one system port
E  More or less Fibre Channel adapters are detected
E  Fibre Channel adapter is faulty
E  Fibre Channel adapter has detected a PCI bus error
E  More or less Fibre Channel adapters are detected
E  Fibre Channel adapter is faulty
E  Fibre Channel adapter has detected a PCI bus error
E  More or less Fibre Channel adapters are detected
E  Fibre Channel adapter is faulty
E  Fibre Channel adapter has detected a PCI bus error
W  Fibre Channel speed has changed
E  Duplicate Fibre Channel frame is detected
E  The Fibre Channel adapter has a failure
E  Fibre Channel adapter has detected a PCI bus error
W  Incorrect enclosure
E  Enclosure VPD is inconsistent
E  The system board service processor has failed
E  Unable to update BIOS settings
E  Ambient temperature is too high during system startup
E  Multiple hardware failures
W  Cannot form cluster due to lack of cluster resources, overridequorum possible
W  Too many software failures
E  CPU over temp

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  Cluster state and configuration data cannot be read
W  The canister battery is not supported
W  Node held in service state
W  Fewer SAS ports operational
W  SAS ports degraded
W  SAS port has unsupported SAS device
E  CMOS error
W  The node canister has detected that it has a hardware type that is not compatible with the control enclosure MTM
W  Encryption key required
W  Encryption key invalid
W  Encryption key not found
W  USB device (such as a hub) unsupported
W  Encryption key required
W  Canister battery is nearing end of life
W  System is unable to determine VPD for a FRU
E  The node warm started after a software error
W  A connection to a configured remote system has been lost because of a connectivity problem
W  A connection to a configured remote system has been lost because of too many minor errors
W  Incorrect enclosure
E  Incorrect slot
E  No enclosure id and cannot get status from partner
E  Incorrect enclosure type
E  No enclosure id & partner matches
E  No enclosure id and partner has cluster data that does not match
E  No enclosure id and no cluster state on partner
E  No enclosure id and no cluster state
W  Cluster id different between enclosure and node
E  Cannot read enclosure identity
E  The detected memory size does not match the expected memory size  1039

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  Enclosure VPD is inconsistent
E  Unable to detect any fibre-channel adapter
E  The system board processor has failed
E  Internal disk file system is damaged
E  Unable to update BIOS settings
E  Unable to update system board service processor firmware
W  Ambient temperature too high while system starting
E  System board fault
E  Canister internal PCIe switch failed
E  A device on the system board is too hot
E  PCI Riser too hot
E  Multiple hardware failures
W  Cannot form cluster due to lack of cluster resources
W  Cannot form cluster due to lack of cluster resources, overridequorum possible
W  Duplicate WWNN detected on SAN
E  The node's hardware configuration does not meet minimum requirements
E  Too many software failures
E  The node's internal drive is failing
E  CPU over temp
E  Node software inconsistent
E  Node software is damaged
E  Cluster state and configuration data cannot be read
E  State data was not saved on power loss
W  The canister battery is not supported
W  The canister battery is missing
E  The canister battery has failed
W  The canister battery is below minimum operating temperature
W  The canister battery is above maximum operating temperature
E  Canister battery communications error
W  Canister battery has insufficient charge to support fire hose dump
E  Not enough battery to support graceful shutdown
W  Node held in service state

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
W  SAS adapter missing
E  SAS adapter failed
E  SAS adapter PCI error
E  SAS adapter degraded
W  Fewer SAS ports operational
W  SAS ports degraded
W  SAS port has unsupported SAS device
W  Ethernet adapter missing
E  Ethernet adapter failed
E  Ethernet adapter PCI error
E  Ethernet adapter degraded
W  Fewer Ethernet ports
W  Bus adapter missing
E  Bus adapter failed
E  Bus adapter PCI error
E  Bus adapter degraded
W  Fewer bus ports operational
W  Ambient temperature warning
W  Encryption key required
W  Encryption key invalid
W  Encryption key not found
W  USB device (such as a hub) unsupported
W  A hardware change was made that is not supported by software. User action is required to repair the hardware or update the software
W  A supported hardware change was made to this node. User action is required to activate the new hardware
E  Flash boot device has a failure
E  Flash boot device has recovered
E  Service controller has a read failure
E  Flash boot device has a failure
E  Flash boot device has recovered
E  Service controller has a read failure
E  Flash boot device has a failure
E  Flash boot device has recovered
E  A service controller read failure occurred
E  The internal disk for a node has failed
E  The hard disk is full and cannot capture any more output

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  One of the two power supply units in the node has failed
E  One of the two power supply units in the node cannot be detected
E  One of the two power supply units in the node is without power
E  A high-speed SAS adapter is missing
E  The PCIe lanes on a high-speed SAS adapter are degraded
E  A PCI bus error occurred on a high-speed SAS adapter
E  A high-speed SAS adapter requires a PCI bus reset
E  The SAS adapter has an internal fault
E  The node service processor indicated a fan failure
E  The node service processor indicated a fan failure
E  The node service processor indicated a fan failure
E  Node ambient temperature threshold has exceeded
E  The node processor indicated a temperature warning
E  The node service processor or ambient critical temperature threshold has exceeded
E  Node ambient temperature threshold has exceeded
E  Node processor temperature has a warning
E  Node processor or ambient critical temperature threshold has exceeded
E  System board voltage is high
E  System board voltage is high
E  System board voltage is high
E  System board voltage is low
E  System board voltage is low
E  System board voltage is low
E  Power management board has a voltage fault
E  Node ambient temperature threshold has exceeded
E  Temperature warning threshold exceeded
E  Temperature critical threshold exceeded

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  Power management board voltage has a fault
E  Power domain error. Both nodes in the I/O group are powered by the same UPS
W  The limit on the number of system secure shell (SSH) sessions has been reached
W  Unable to access the Network Time Protocol (NTP) network time server
W  Unable to connect to NTP server that has been automatically configured
W  Hardware configurations of nodes differ in an I/O group
W  Stretch cluster reconfiguration is required to restore dual site configuration
I  Technician port connection is not active
I  Technician port connection is active
W  Performance not optimised for V9000 variants without managed enclosures
W  Performance not optimised for V9000 variants with managed enclosures
E  Ethernet interface has a failure
E  A server error has occurred
W  Service failure has occurred
E  System failed to communicate with UPS
E  UPS output loading was unexpectedly high
E  Battery has reached end of life
E  UPS battery has a fault
E  UPS electronics has a fault
E  UPS frame has a fault
E  UPS is overcurrent
E  UPS has a fault but no specific FRU is identified
E  The UPS has detected an input power fault
E  UPS has a cabling error
E  UPS ambient temperature threshold has exceeded
E  UPS ambient temperature is high
E  UPS crossed-cable test is bypassed because of an internal UPS software error
E  System failed to communicate with UPS
E  UPS output loading was unexpectedly high
E  Battery has reached end of life
E  UPS has a battery fault

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  UPS has an electronics fault
E  UPS is overcurrent
E  UPS has a fault but no specific FRU is identified
E  The UPS has detected an input power fault
E  UPS has a cabling error
E  UPS ambient temperature threshold has exceeded
E  UPS ambient temperature is high
E  UPS crossed-cable test is bypassed because of an internal UPS software error
W  An array MDisk has deconfigured members and has lost redundancy
W  An array MDisk is expected to fail within six months due to limited write endurance of member drives
E  An array MDisk is corrupt because of lost metadata
W  Array MDisk has taken a spare member that does not match array goals
W  An array has members that are located in a different I/O group
W  An array MDisk is no longer protected by an appropriate number of suitable spares
W  No spare protection exists for one or more array MDisks
W  Distributed array MDisk has fewer rebuild areas available than threshold
W  A background scrub process has found an inconsistency between data and parity on the array
W  Array MDisk has been forced to disable hardware data integrity checking on member drives
E  An array MDisk is offline. The metadata for the inflight writes is on a missing node
E  An array MDisk is offline. Metadata on the missing node contains needed state information
W  Array response time too high
W  Distributed array MDisk member slow write count threshold exceeded
E  Distributed array MDisk offline due to I/O timeout
W  Battery reconditioning required but not possible

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
E  Interface card has a degraded PCI link
W  External FC data link degraded
W  External IB data link degraded
E  Canister is missing an interface card
W  External iSCSI port not operational
W  Too many iSCSI host logins
W  System update halted
W  Check the air filter
E  Array data compromised
W  Too many enclosures visible on fabric
W  Enclosure visible on fabric managed by another system
W  Cabling error. Internal cabling connectivity has changed
W  Enclosure connectivity undetermined. Connectivity to an enclosure can no longer be determined
W  Minimal enclosure connectivity not met
W  Config node cannot communicate with canister
W  Managed enclosure is not visible from config node
W  Canister internal error
I  Successful write to USB Flash Drive  n/a
W  Write failed to USB Flash Drive
E  Encryption key unavailable
W  Encryption key on USB flash drive removed
W  Write to USB Flash Drive failure
I  Write to USB Flash Drive successful  n/a
W  Encryption not committed
E  Key Server reported KMIP error  1785
E  Key Server reported vendor information error  1785
E  Failed to connect to Key Server  1785
W  Key Server reported misconfigured primary  1785
E  Cloud gateway service restarted  2031
E  Cloud gateway service restarted too often  1404

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
W  Cloud account SSL certificate will expire within the next 30 days  3140
W  Cloud account not available, cannot resolve hostname
W  Cloud account not available, cannot contact cloud provider
W  Cloud account not available, cannot communicate with cloud provider
W  Cloud account not available, no matching CA certificate
W  Cloud account not available, no matching CA certificate
W  Cloud account not available, cannot establish secure connection with cloud provider
W  Cloud account not available, cannot authenticate with cloud provider
W  Cloud account not available, cannot obtain permission to use cloud storage
W  Cloud account not available, cannot complete cloud storage operation
W  Cloud account not available, cannot access cloud object storage
W  Cloud account not available, incompatible object data format
W  Cloud account not available, cloud object storage encrypted
W  Cloud account not available, cloud object storage not encrypted
W  Cloud account not available, cloud object storage encrypted with the wrong key
W  No permission to use cloud storage snapshot operation
W  Cloud account out of space during cloud storage snapshot operation
W  Cannot create container object to cloud object storage during cloud snapshot operation
W  A cloud object could not be found during cloud snapshot operation
W  A cloud object was found to be corrupt during cloud snapshot operation
W  A cloud object was found to be corrupt during cloud snapshot decompression operation
W  Etag integrity error during cloud snapshot operation

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
W  Internal read error during cloud snapshot operation  2120
W  Unexpected error occurred, cannot complete cloud snapshot operation
W  No permission to use cloud snapshot restore operation
W  A cloud object could not be found during cloud snapshot restore operation
W  A cloud object was found to be corrupt during cloud snapshot restore operation
W  A cloud object was found to be corrupt during cloud snapshot restore decompression operation
W  Etag integrity error during cloud snapshot restore operation
W  Internal write error during cloud snapshot operation
W  Cannot create bad blocks on managed disk during cloud snapshot restore operation
W  Unexpected error occurred, cannot complete cloud snapshot restore operation
W  No permission to use cloud snapshot delete operation
W  A cloud object could not be found during cloud snapshot delete operation
W  A cloud object was found to be corrupt during cloud snapshot delete operation
W  A cloud object was found to be corrupt during cloud snapshot delete decompression operation
W  Unexpected error occurred, cannot complete cloud snapshot delete operation
W  Cloud account out of space during cloud snapshot restore commit operation
W  Cloud account out of space during cloud snapshot delete operation
W  Transparent Cloud Tiering feature license limit exceeded
W  Too many node restarts have occurred, cloud backup operations paused
W  Internal FlashCopy error on volume enabled for cloud snapshots
E  An IO port cannot be started
E  A fibrechannel target port mode transition was not successful
W  Equivalent fibre channel ports are reporting that they are connected to different fabrics

Table 72. Error event IDs and error codes (continued)
Notification type  Condition  Error code
W  A spare node in this cluster is not providing additional redundancy
W  A spare node could not be automatically removed from the cluster
W  Single PSU failure in bare metal server
W  Node IP missing, only single path connection available between nodes
W  The IP connections between nodes were broken
W  A node rejoined cluster with identity changed

Resolving a problem with the SAN Volume Controller boot drives

Complete the following steps to resolve most problems with SAN Volume Controller boot drives.

Before you begin

The node serial number (also known as the product or machine serial number) is on the MT-M S/N label (Machine Type - Model and Serial Number label) on the front (left side) of the node. The node serial number is written to the system board and to each of the two boot drives during the manufacturing process. When the SAN Volume Controller software starts, it reads the node serial number from the system board (by using the node serial number for the panel name) and compares it with the node serial numbers that are stored on the two boot drives.

Specific node errors are produced under the following conditions:
- Unrecoverable node error 543: This error indicates that none of the node serial numbers that are stored in the three locations match. The node serial number from the system board must match at least one of the two boot drives for the SAN Volume Controller software to assume that the node serial number is good.
- Unrecoverable node error 545: This error indicates that the node serial numbers on each boot drive match each other but are not the same as the node serial number from the system board. In this case, the node serial number on the system board might be wrong or the node serial number on the boot drives might be wrong. For example, the system board was changed or the boot drives came from another node.
- Node error 743: This error indicates that the node serial number cannot be read from one of the two boot drives because that drive failed, is missing, or is out of sync with the other boot drive.
- Node error 744: This error indicates that the node serial number from one of the boot drives identifies the drive as belonging to a different node. If boot drives were swapped between drive slots 1 and 2, node error 744 is produced.
- Node error 745: This error indicates that a boot drive is found in an unsupported slot. This error occurs when at least one of the first two drives is online and at least one invalid slot (3-8) is occupied.
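The serial-number comparison described above can be sketched as a small decision function. This is only an illustrative model, not the product's logic: the function name `classify_boot_serials` and the use of `None` for an unreadable drive are assumptions, node error 745 (slot placement) is not modeled, and the case where both boot drives are unreadable is not described in this guide, so the sketch refuses to guess.

```python
from typing import Optional


def classify_boot_serials(board: str,
                          drive1: Optional[str],
                          drive2: Optional[str]) -> Optional[int]:
    """Return the node error suggested by the three stored serial numbers.

    Hypothetical helper: None for a drive means its serial number could not
    be read. Returns None when all three copies agree.
    """
    if drive1 is None and drive2 is None:
        # Not covered by the conditions listed in the guide.
        raise ValueError("both boot drives unreadable: case not described")
    if drive1 is None or drive2 is None:
        return 743  # serial number cannot be read from one boot drive
    if drive1 == drive2 == board:
        return None  # all three copies match
    if drive1 == drive2:
        return 545  # drives agree with each other but not with the board
    if board in (drive1, drive2):
        return 744  # one drive appears to belong to a different node
    return 543  # none of the three copies match


if __name__ == "__main__":
    # Serial numbers below are made-up placeholders.
    print(classify_boot_serials("78ABC01", "78ABC01", "78XYZ99"))
```

In this model the board serial must agree with at least one drive for the node serial number to be trusted, which mirrors the condition stated for node error 543.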

About this task

An event is displayed in the Monitoring > Events panel of the management GUI if the problem produces node error 743, 744, or 745. Run the fix procedure for that event. Otherwise, connect to the technician port and use the MT-M S/N label on the node to check the boot drive slot information and determine the problem.

Attention: If a drive slot has Yes in the Active column, the operating system depends on that drive. Do not remove that drive without first shutting down the node.
- Do not swap boot drives between slots.
- Each boot drive has a copy of the VPD on the system board.
- Software upgrades are applied to one boot drive at a time to prevent failures during CCU.

Procedure

To resolve a problem with a boot drive, complete the following steps in order:
1. Remove any drive that is in an unsupported slot. Move the drive to the correct slot if you can.
2. If possible, replace any drive that is shown as missing from a slot. Otherwise, reseat the drive or replace it with a drive from FRU stock.
3. Move any drive that is in the wrong node back to the correct node.
   Note: If the node serial number does not match the node serial number on the system board, a drive slot has a status of wrong_node. If the serial number on the MT-M S/N label matches the node serial number on the drive, you can ignore this status.
4. Move any drive that is in the wrong slot back to the correct slot.
5. Reseat the drive in any slot that has a status of failed. If the status remains failed, replace the drive with one from FRU stock.
6. If the drive slot has a status of out of sync, check the can_sync column:
   - If Yes is displayed, use the service assistant GUI to synchronize the boot drives, or use the command-line interface (CLI) command satask chbootdrive -sync.
   - If No is displayed, you must resolve another boot drive problem first.

Replacing the system board:
7. Replace the SAN Volume Controller 2145-DH8 or SAN Volume Controller 2145-SV1 main board.

When neither of the boot drives has usable SAN Volume Controller software:
For example, if you replace both of the boot drives from FRU stock at the same time, neither boot drive has usable SAN Volume Controller software. If the SAN Volume Controller software is not running, the node status, node fault, battery status, and battery fault LEDs remain off.
8. If you cannot replace at least one of the original boot drives with a drive that contains usable SAN Volume Controller software and has a node serial number that matches the MT-M S/N label on the front of the node, contact IBM Remote Technical support. IBM Remote Technical support can help you install the SAN Volume Controller software with a bootable USB flash drive.

   - Field-based USB installation also repairs the node serial number and WWNN stored on each boot drive by finding the values that were stored on the system board during manufacturing.
   - If the WWNN of this node was changed in the past, you must change the WWNN again after you complete the SAN Volume Controller software installation. For example, if the node replaced an earlier SAN Volume Controller node, you must change the WWNN to that of the earlier node. You can repeat the change to the WWNN after the SAN Volume Controller software installation with the service assistant GUI or by command.

When every copy of the node serial number is lost:
For example, if you replace the system board and both of the boot drives with FRU stock at the same time, every copy of the node serial number is lost.
9. If you cannot replace one of the original boot drives or the original system board so that at least one copy of the original node serial number is present, you cannot repair the node in the field. Return the node to IBM for repair.

Results

The status of a drive slot is uninitialized only if the SAN Volume Controller software might not automatically initialize the FRU drive. This status can happen if the node serial number on the other boot drive does not match the node serial number on the system board. If the node serial number on the other boot drive matches the MT-M S/N label on the front (left side) of the node, you can safely rescue the uninitialized boot drive from the other boot drive. Use the service assistant GUI or the satask rescuenode command to rescue the drive.

Resolving a problem with failure to boot

Light path LEDs might indicate a hardware failure on SAN Volume Controller 2145-DH8. SAN Volume Controller 2145-SV1 does not have light path LEDs, but it does have some diagnostic LEDs. Diagnostic LEDs might indicate a hardware failure on SAN Volume Controller 2145-SV1.

Before you begin

If the SAN Volume Controller software is not running, then the node status and battery status LEDs are off. The service interfaces such as the technician port and satask.txt on a USB flash drive do not work.

Note: The SAN Volume Controller 2145-SV1 node fault LED can flash when a warning or critical error shows in the BMC event log (SEL). The warning or critical error prevents the SAN Volume Controller code from booting.

If the SAN Volume Controller software is running, then the node error LED might be on. The node error code and error data can be seen by connecting to the technician port or by using the other service interfaces. Look up the node error code in the IBM SAN Volume Controller Knowledge Center.

About this task

Complete the following steps if the SAN Volume Controller software is not running.

Procedure

1. Connect a monitor to the VGA port and a keyboard to a USB port. Consider any error messages on the monitor. For example, was it unable to find a device from which to boot? (Check that the SAS cables between the boot drives and the main system board are connected correctly.)
2. If no useful messages display on the monitor, complete the following steps.
   a. Power off the system by using the power button.
   b. Disconnect the power cables.
   c. Wait for 1 minute.
   d. Reconnect the power cables. The node attempts to power on.
   e. If the power LED comes on green, then watch the VGA monitor for any useful messages.
3. Attempt to access the UEFI setup utility on the VGA monitor by powering off, and then powering on by using the power button while you hold down the ESC or Delete key for the SAN Volume Controller 2145-SV1, or the F1 key for the SAN Volume Controller 2145-DH8. If the Setup Utility displays, complete the following steps.
   a. If the node fault LED is flashing, access the Bmc self test log from the Server Mgmt tab to look for a cause.
   b. Access the System Event Log from the Server Mgmt tab. Events in this log might help to pinpoint the problem.
4. If you are unable to pinpoint a broken component by using the setup utility, or if the setup utility does not start, complete the following steps. It is best to initially investigate a fault with the DIMMs.
   a. Power off the system by using the power button.
   b. Disconnect the power cables.
   c. Remove the DIMMs but leave in one DIMM per microprocessor (CPU). For example, leave the DIMM in the first DIMM slot of each CPU.
   d. Reconnect the power cables. The node attempts to power on.
   e. If the SAN Volume Controller software now boots and the node fault LED comes on, then one of the DIMMs that you removed might be broken. Repeat the steps with a different DIMM until you find the broken DIMM.
   f. If all the DIMMs work when only one is fitted per CPU, then refit the DIMMs.
5. If the SAN Volume Controller software does not load with all of the tested good DIMMs fitted, complete the following steps. It is best to investigate a fault with the CPUs before you consider replacing the system board.
   a. Power off the system by using the power button.
   b. Disconnect the power cables.
   c. Remove the CPU labeled as CPU 1 on the system board.
   d. Reconnect the power cables. The node attempts to power on.
   e. Replace the CPU with a CPU from FRU stock. If the SAN Volume Controller software now boots and the node fault LED comes on, then the CPU that you removed might be broken.
   f. If the SAN Volume Controller software does not boot, swap the CPUs.
   g. Replace the CPU with a CPU from FRU stock. If the SAN Volume Controller software now boots and the node fault LED comes on, then the CPU that you removed might be broken.

160 6. If you did not find ny evidence of broken DIMM or CPU, contct IBM Remote Technicl Support. They might wnt to know the stte of the SAN Volume Controller 2145-SV1 system bord LEDs. Node error code overview Node error codes describe filures tht relte to specific node. Connect to the technicin port so tht you cn use the service ssistnt GUI to view node errors nd other error dt. Becuse node errors re specific to node, for exmple, memory filures, the errors might be reported only on tht node. However, if the node cn communicte with the configurtion node, then it is reported in the system event log. When the node error code indictes tht criticl error ws detected tht prevents the node from becoming member of clustered system, the Node fult LED is on. The following exmple shows node error: Node Error The dditionl dt is unique for ny error code. It provides the necessry informtion to isolte the problem in n offline environment. Exmples of extr dt re disk seril numbers nd field replceble unit (FRU) loction codes. For more informtion, refer to the help for the specific three-digit node error. Node errors cn be divided into criticl node errors nd noncriticl node errors. Criticl errors A criticl error mens tht the node is not ble to prticipte in clustered system until the issue tht is preventing it from joining clustered system is resolved. This error occurs becuse prt of the hrdwre fils or the system detects tht the softwre is corrupted. If node hs criticl node error, it is in service stte, nd the fult LED on the node is on. The exception is when the node cnnot connect to enough resources to form clustered system. It shows criticl node error but is in the strting stte. Resolve the errors in priority order. The rnge of criticl errors is Noncriticl errors A noncriticl error code is logged when hrdwre or code filure is relted to one specific node. These errors do not stop the node from entering ctive stte nd joining clustered system. 
If the node is part of a clustered system, an alert describes the error condition. The range of errors that are reserved for noncritical errors are

Error code range

This topic shows the number range for each message classification.

Table 73 on page 147 lists the number range for each message classification.

Table 73. Message classification number range

Message classification:
• Booting codes (no longer used)
• Node errors
  - Node rescue errors (no longer used)
  - Log-only node errors (no longer used)
  - Critical node errors
  - Noncritical node errors
• Error codes when creating a clustered system (no longer used)
• Error codes when recovering a clustered system (no longer used)
• Error codes for a clustered system

Boot is running
Explanation: The node has started. It is running diagnostics and loading the runtime code.
Go to the hardware boot MAP to resolve the problem.

120 Disk drive hardware error
Explanation: The internal disk drive of the node has reported an error. The node is unable to start.
Ensure that the boot disk drive and all related cabling is properly connected, then exchange the FRU for a new FRU.

130 Checking the internal disk file system
Explanation: The file system on the internal disk drive of the node is being checked for inconsistencies.
If the progress bar has been stopped for at least five minutes, power off the node and then power on the node. If the boot process stops again at this point, run the node rescue procedure.
Possible Cause-FRUs or other:
• None.

132 Updating BIOS settings of the node
Explanation: The system has found that changes are required to the BIOS settings of the node. These changes are being made. The node will restart once the changes are complete.
If the progress bar has stopped for more than 10 minutes, or if the display has shown codes 100 and 132 three times or more, go to Resolving a problem with failure to boot on page 144 to resolve the problem.

135 Verifying the software
Explanation: The software packages of the node are being checked for integrity.
Allow the verification process to complete.

137 Updating system board service processor firmware
Explanation: The service processor firmware of the node is being updated to a new level. This process can take 90 minutes. Do not restart the node while this is in progress.
Allow the updating process to complete.

150 Loading cluster code
Explanation: The system code is being loaded.
If the progress bar has been stopped for at least 90 seconds, power off the node and then power on the node. If the boot process stops again at this point, run the node rescue procedure.
Possible Cause-FRUs or other:
• None.

155 Loading cluster data

Explanation: The saved cluster state and cache data is being loaded.
If the progress bar has been stopped for at least 5 minutes, power off the node and then power on the node. If the boot process stops again at this point, run the node rescue procedure.
Possible Cause-FRUs or other:
• None.

168 The command cannot be initiated because authentication credentials for the current SSH session have expired.
Explanation: Authentication credentials for the current SSH session have expired, and all authorization for the current session has been revoked. A system administrator may have cleared the authentication cache.
Begin a new SSH session and re-issue the command.

170 A flash module hardware error has occurred.
Explanation: A flash module hardware error has occurred.
Exchange the FRU for a new FRU.

182 Checking uninterruptible power supply
Explanation: The node is checking whether the uninterruptible power supply is operating correctly.
Allow the checking process to complete.

232 Checking uninterruptible power supply connections
Explanation: The node is checking whether the power and signal cable connections to the uninterruptible power supply are correct.
Allow the checking process to complete.

300 The 2145 is running node rescue.
Explanation: The 2145 is running node rescue.
If the progress bar has been stopped for at least two minutes, exchange the FRU for a new FRU.

310 The 2145 is running a format operation.
Explanation: The 2145 is running a format operation.
If the progress bar has been stopped for two minutes, exchange the FRU for a new FRU.

320 A 2145 format operation has failed.
Explanation: A 2145 format operation has failed.
Exchange the FRU for a new FRU.

330 The 2145 is partitioning its disk drive.
Explanation: The 2145 is partitioning its disk drive.
If the progress bar has been stopped for two minutes, exchange the FRU for a new FRU.

340 The 2145 is searching for a donor node.
Explanation: The 2145 is searching for a donor node.
If the progress bar has been stopped for more than two minutes, exchange the FRU for a new FRU.
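The stall limits quoted in the boot and node rescue entries above lend themselves to a small lookup table. The following Python sketch is illustrative only and is not part of the product: the names `StallRule`, `STALL_RULES`, and `recommended_action` are hypothetical, the limits are transcribed from the entries for codes 130, 132, 150, 155, 300, and 340, and the sketch treats "at least" and "more than" alike as a simple threshold. It does not model the additional condition for code 132 (codes 100 and 132 shown three times or more).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StallRule:
    """How long a progress bar may stay stopped, and what the guide says to do next."""
    limit_seconds: int
    action: str


# Limits and actions transcribed from the code entries in this section.
STALL_RULES = {
    130: StallRule(5 * 60, "power off and on; if it stops again, run the node rescue procedure"),
    132: StallRule(10 * 60, "go to 'Resolving a problem with failure to boot'"),
    150: StallRule(90, "power off and on; if it stops again, run the node rescue procedure"),
    155: StallRule(5 * 60, "power off and on; if it stops again, run the node rescue procedure"),
    300: StallRule(2 * 60, "exchange the FRU for a new FRU"),
    340: StallRule(2 * 60, "exchange the FRU for a new FRU"),
}


def recommended_action(code: int, stalled_seconds: int) -> Optional[str]:
    """Return the documented action once the stall limit is reached, else None."""
    rule = STALL_RULES.get(code)
    if rule is None:
        return None  # code not covered by this sketch
    if stalled_seconds >= rule.limit_seconds:
        return rule.action
    return None  # still within the documented waiting time
```

For example, `recommended_action(150, 30)` returns `None` because 30 seconds is inside the 90-second allowance for code 150, so the guidance is to keep waiting.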
Possible Cause-FRUs or other:
• Fibre Channel adapter (100%)

345 The 2145 is searching for a donor node from which to copy the software.
Explanation: The node is searching at 1 Gb/s for a donor node.
If the progress bar has stopped for more than two minutes, exchange the FRU for a new FRU.
Possible Cause-FRUs or other:
• Fibre Channel adapter (100%)

350 The 2145 cannot find a donor node.
Explanation: The 2145 cannot find a donor node.
If the progress bar has stopped for more than two minutes, perform the following steps:
1. Ensure that all of the Fibre Channel cables are connected correctly and securely to the cluster.
2. Ensure that at least one other node is operational, is connected to the same Fibre Channel network, and is a donor node candidate. A node is a donor node candidate if the version of software that is installed on that node supports the model type of the node that is being rescued.
3. Ensure that the Fibre Channel zoning allows a connection between the node that is being rescued and the donor node candidate.
4. Perform the problem determination procedures for the network.
Possible Cause-FRUs or other:
• None

Other:
• Fibre Channel network problem

360 The 2145 is loading software from the donor.
Explanation: The 2145 is loading software from the donor.
If the progress bar has been stopped for at least two minutes, restart the node rescue procedure.
Possible Cause-FRUs or other:
• None

365 Cannot load SW from donor
Explanation: None.
None.

370 Installing software
Explanation: The 2145 is installing software.
1. If this code is displayed and the progress bar has been stopped for at least ten minutes, the software install process has failed with an unexpected software error.
2. Power off the 2145 and wait for 60 seconds.
3. Power on the node. The software update operation continues.
4. Report this problem immediately to your Software Support Center.
Possible Cause-FRUs or other:
• None

500 Incorrect enclosure
Explanation: The node canister has saved cluster information, which indicates that the canister is now located in a different enclosure from where it was previously used. Using the node canister in this state might corrupt the data held on the enclosure drives.
Follow troubleshooting procedures to move the nodes to the correct location.
1. Follow the Procedure: Getting node canister and system information using the service assistant task to review the node canister saved location information and the status of the other node canister in the enclosure (the partner canister). Determine if the enclosure is part of an active system with volumes that contain required data.
2. If you have unintentionally moved the canister into this enclosure, move the canister back to its original location, and put the original canister back in this enclosure. Follow the Replacing a node canister procedure.
3. If you have intentionally moved the node canister into this enclosure, check whether it is safe to continue or whether you will lose data on the enclosure you removed it from. Do not continue if the system the node canister was removed from is offline; instead, return the node canister to that system.
4. If you have determined that you can continue, follow the Procedure: Removing system data from a node canister task to remove cluster data from the node canister.
5.
If the partner node in this enclosure is not online, or is not present, you will have to perform a system recovery. Do not create a new system; you will lose all the volume data.
Possible Cause-FRUs or other cause:
• None

501 Incorrect slot
Explanation: The node canister has saved cluster information, which indicates that the canister is not located in the expected enclosure, but in a different slot from where it was previously used. Using the node canister in this state might mean that hosts are not able to connect correctly.
Follow troubleshooting procedures to relocate the node canister to the correct location.
1. Follow the Procedure: Getting node canister and system information using the service assistant task to review the node canister saved location information and the status of the other node canister in the enclosure (the partner canister). If the node canister has been inadvertently swapped, the other node canister will have the same error.
2. If the canisters have been swapped, use the Replacing a node canister procedure to swap the canisters. The system should start.
3. If the partner canister is in candidate state, use the hardware remove and replace canister procedure to swap the canisters. The system should start.
4. If the partner canister is in active state, it is running the cluster on this enclosure and has replaced the original use of this canister. Follow the Procedure: Removing system data from a node canister task to remove cluster data from this node canister. The node canister will then become active in the cluster in its current slot.
5. If the partner canister is in service state, review its node error to determine the correct action. Generally, you will fix the errors reported on the partner node in priority order, and review the situation again after each change. If you have to

replace the partner canister with a new one, you should move this canister back to the correct location at the same time.
Possible Cause-FRUs or other:
• None

502 No enclosure identity exists and a status from the partner node could not be obtained.
Explanation: The enclosure has been replaced and communication with the other node canister (partner node) in the enclosure is not possible. The partner node could be missing, powered off, unable to boot, or an internode communication failure may exist.
Follow troubleshooting procedures to configure the enclosure:
1. Follow the procedures to resolve a problem to get the partner node started. An error will still exist because the enclosure has no identity. If the error has changed, follow the service procedure for that error.
2. If the partner has started and is showing a location error (probably this one), then the PCI link is probably broken. Since the enclosure midplane was recently replaced, this is likely the problem. Obtain a replacement enclosure midplane, and replace it.
3. If this action does not resolve the issue, contact IBM Support Center. They will work with you to ensure that the system state data is not lost while resolving the problem.
Possible Cause FRUs or other:
• Enclosure midplane (100%)

503 Incorrect enclosure type
Explanation: The node canister has been moved to an expansion enclosure. A node canister will not operate in this environment. This can also be reported when a replacement node canister is installed for the first time.
Follow troubleshooting procedures to relocate the nodes to the correct location.
1. Follow the procedure Getting node canister and system information using a USB flash drive and review the saved location information of the node canister to determine which control enclosure the node canister belongs in.
2. Follow the procedure to move the node canister to the correct location, then follow the procedure to move the expansion canister that is probably in that location to the correct location.
If there is a node canister that is in active state where this node canister must be, do not replace that node canister with this one.

504 No enclosure identity and partner node matches.
Explanation: The enclosure vital product data indicates that the enclosure midplane has been replaced. This node canister and the other node canister in the enclosure were previously operating in the same enclosure midplane.
Follow troubleshooting procedures to configure the enclosure.
1. This is an expected situation during the hardware remove and replace procedure for a control enclosure midplane. Continue following the remove and replace procedure and configure the new enclosure.
Possible Cause FRUs or other:
• None

505 No enclosure identity and partner has system data that does not match.
Explanation: The enclosure vital product data indicates that the enclosure midplane has been replaced. This node canister and the other node canister in the enclosure do not come from the same original enclosure.
Follow troubleshooting procedures to relocate nodes to the correct location.
1. Follow the Procedure: Getting node canister and system information using the service assistant task to review the node canister saved location information and the status of the other node canister in the enclosure (the partner canister). Determine if the enclosure is part of an active system with volumes that contain required data.
2. Decide what to do with the node canister that did not come from the enclosure that is being replaced.
a. If the other node canister from the enclosure being replaced is available, use the hardware remove and replace canister procedures to remove the incorrect canister and replace it with the second node canister from the enclosure being replaced. Restart both canisters. The two node canisters should show node error 504 and the actions for that error should be followed.
b. If the other node canister from the enclosure being replaced is not available, check the enclosure of the node canister that did not come from the replaced enclosure.
Do not use this canister in this enclosure if you require the volume data on the system from which the node canister was removed, and that system is not running with two online nodes. You should return the canister to its original enclosure and use a different canister in this enclosure.
c. When you have checked that it is not required elsewhere, follow the Procedure: Removing

system data from a node canister task to remove cluster data from the node canister that did not come from the enclosure that is being replaced.
d. Restart both nodes. Expect node error 506 to be reported now, then follow the service procedures for that error.
Possible Cause FRUs or other:
• None

506 No enclosure identity and no node state on partner
Explanation: The enclosure vital product data indicates that the enclosure midplane has been replaced. There is no cluster state information on the other node canister in the enclosure (the partner canister), so both node canisters from the original enclosure have not been moved to this one.
Follow troubleshooting procedures to relocate nodes to the correct location:
1. Follow the procedure: Getting node canister and system information and review the saved location information of the node canister and determine why the second node canister from the original enclosure was not moved into this enclosure.
2. If you are sure that this node canister came from the enclosure that is being replaced, and the original partner canister is available, use the Replacing a node canister procedure to install the second node canister in this enclosure. Restart the node canister. The two node canisters should show node error 504, and the actions for that error should be followed.
3. If you are sure this node canister came from the enclosure that is being replaced, and that the original partner canister has failed, continue following the remove and replace procedure for an enclosure midplane and configure the new enclosure.
Possible Cause FRUs or other:
• None

507 No enclosure identity and no node state
Explanation: The node canister has been placed in a replacement enclosure midplane. The node canister is also a replacement or has had all cluster state removed from it.
Follow troubleshooting procedures to relocate the nodes to the correct location.
1. Check the status of the other node in the enclosure. Unless it also shows error 507, check the errors on the other node and follow the corresponding procedures to resolve the errors.
It typically shows node error
2. If the other node in the enclosure is also reporting 507, the enclosure and both node canisters have no state information. Contact IBM support. They will assist you in setting the enclosure vital product data and running cluster recovery.
Possible Cause-FRUs or other:
• None

508 Cluster identifier is different between enclosure and node
Explanation: The node canister location information shows it is in the correct enclosure, however the enclosure has had a new clustered system created on it since the node was last shut down. Therefore, the clustered system state data stored on the node is not valid.
Follow troubleshooting procedures to correctly relocate the nodes.
1. Check whether a new clustered system has been created on this enclosure while this canister was not operating or whether the node canister was recently installed in the enclosure.
2. Follow the Procedure: Getting node canister and system information using the service assistant task, and check the partner node canister to see if it is also reporting node error 508. If it is, check that the saved system information on this and the partner node match. If the system information on both nodes matches, follow the Replacing a control enclosure midplane procedure to change the enclosure midplane.
3. If this node canister is the one to be used in this enclosure, follow the Procedure: Removing system data from a node canister task to remove clustered system data from the node canister. It will then join the clustered system.
4. If this is not the node canister that you intended to use, follow the Replacing a node canister procedure to replace the node canister with the one intended for use.
Possible Cause FRUs or other:
• Service procedure error (90%)
• Enclosure midplane (10%)

509 The enclosure identity cannot be read.
Explanation: The canister was unable to read vital product data (VPD) from the enclosure. The canister requires this data to be able to initialize correctly.
Follow troubleshooting procedures to fix the hardware:
1.
Check errors reported on the other node canister in this enclosure (the partner canister).

2. If it is reporting the same error, follow the hardware remove and replace procedure to replace the enclosure midplane.
3. If the partner canister is not reporting this error, follow the hardware remove and replace procedure to replace this canister.
Note: If a newly installed system has this error on both node canisters, the data that needs to be written to the enclosure will not be available on the canisters; contact IBM support for the WWNNs to use.
Remember: Review the lsservicenodes output for what the node is reporting.
Possible Cause FRUs or other:
• Node canister (50%)
• Enclosure midplane (50%)

510 The detected memory size does not match the expected memory size.
Explanation: The amount of memory detected in the node differs from the amount required for the node to operate as an active member of a system. The error code data shows the detected memory, in MB, followed by the minimum required memory, in MB. The next series of values indicates the amount of memory, in GB, detected in each memory slot.
Data:
• Detected memory in MB
• Minimum required memory in MB
• Memory in slot 1 in GB
• Memory in slot 2 in GB
• ...
• Memory in slot n in GB
Check the memory size of another 2145 that is in the same cluster.
Possible Cause-FRUs or other:
• Memory module (100%)

511 Memory bank 1 of the 2145 is failing. For the 2145-DH8 only, the DIMMs are incorrectly installed.
Explanation: Memory bank 1 of the 2145 is failing. For the 2145-DH8 only, the DIMMs are incorrectly installed. This will degrade performance.
For the 2145-DH8 only, shut down the node and adjust the DIMM placement as per the install directions.
Possible Cause-FRUs or other:
• Memory module (100%)

512 Enclosure VPD is inconsistent
Explanation: The enclosure midplane VPD is not consistent. The machine part number is not compatible with the machine type and model. This indicates that the enclosure VPD is corrupted.
1. Check the support site for a code update.
2. Use the remove and replace procedures to replace the enclosure midplane.
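The layout of the error 510 data described earlier in this section (detected memory in MB, minimum required memory in MB, then one value in GB per memory slot) can be sketched as a small record type. This is a minimal Python illustration, not product code: `MemoryErrorData` and `parse_510_data` are hypothetical names, the sketch assumes that the values were already read into a list of integers (how the data is retrieved from a node is not shown), and treating a value of 0 as an empty or undetected slot is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MemoryErrorData:
    """Fields of the node error 510 data, in the order the guide lists them."""
    detected_mb: int      # detected memory, in MB
    required_mb: int      # minimum required memory, in MB
    slots_gb: List[int]   # memory detected in slot 1..n, in GB

    @property
    def shortfall_mb(self) -> int:
        """How far the detected memory falls short of the minimum (0 if none)."""
        return max(0, self.required_mb - self.detected_mb)

    def empty_slots(self) -> List[int]:
        """1-based slot numbers that report 0 GB (assumed empty or undetected)."""
        return [n for n, gb in enumerate(self.slots_gb, start=1) if gb == 0]


def parse_510_data(values: Sequence[int]) -> MemoryErrorData:
    """Split a flat list of values into the documented fields."""
    if len(values) < 2:
        raise ValueError("expected detected MB and required MB at minimum")
    return MemoryErrorData(values[0], values[1], list(values[2:]))
```

With made-up values such as `[16384, 32768, 8, 8, 0, 0]`, `empty_slots()` points at slots 3 and 4, which is where a comparison with another node in the same cluster would start.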
Possible Cause FRUs or other:
• Enclosure midplane (100%)

521 Unable to detect Fibre Channel adapter
Explanation: The system cannot detect any Fibre Channel adapters.
Ensure that a Fibre Channel adapter has been installed. Ensure that the Fibre Channel adapter is seated correctly in the riser card. Ensure that the riser card is seated correctly on the system board. If the problem persists, exchange FRUs for new FRUs, one at a time.

522 The system board service processor has failed.
Explanation: The service processor on the system board failed.
For the 2145-DH8 only:
1. Shut down the node.
2. Remove the mains power cable.
3. Wait for the lights to stop blinking.
4. Plug in power, and then wait for the node to boot.
5. If that fails, replace the system board.
Exchange the FRU for a new FRU.
DH8
• System board assembly (100%)

523 The internal disk file system is damaged.
Explanation: The node startup procedures have found problems with the file system on the internal disk of the node.
Follow troubleshooting procedures to reload the software.
1. Follow Procedure: Rescuing node canister machine code from another node (node rescue).

2. If the rescue node does not succeed, use the hardware remove and replace procedures.
Possible Cause FRUs or other:
• Node canister (80%)
• Other (20%)

524 Unable to update BIOS settings.
Explanation: Unable to update BIOS settings.
Power off the node, wait 30 seconds, and then power on again. If the error code is still reported, replace the system board.
Possible Cause-FRUs or other:
• System board (100%)

525 Unable to update system board service processor firmware.
Explanation: The node startup procedures have been unable to update the firmware configuration of the node. The update might take 90 minutes.
1. If the progress bar has been stopped for more than 90 minutes, power off and reboot the node. If the boot progress bar stops again on this code, replace the FRU shown.
2. If the power off or restart does not work, try removing the power cords and then restarting.

528 Ambient temperature is too high during system startup.
Explanation: The ambient temperature read during the node startup procedures is too high for the node to continue. The startup procedure will continue when the temperature is within range.
Reduce the temperature around the system.
1. Resolve the issue with the ambient temperature by checking and correcting:
a. Room temperature and air conditioning
b. Ventilation around the rack
c. Airflow within the rack
Possible Cause FRUs or other:
• Environment issue (100%)

530 A problem with one of the node's power supplies has been detected.
Explanation: The 530 error code is followed by two numbers. The first number is either 1 or 2 to indicate which power supply has the problem. The second number is either 1, 2 or 3 to indicate the reason.
1 The power supply is not detected.
2 The power supply failed.
3 No input power is available to the power supply.
If the node is a member of a cluster, the cluster reports error code 1096 or 1097, depending on the error reason. The error will automatically clear when the problem is fixed.
1. Ensure that the power supply is seated correctly and that the power cable is attached correctly to both the node and to a power source.
2.
If the error has not been automatically marked fixed after two minutes, note the status of the three LEDs on the back of the power supply.
3. If the power supply error LED is off and the AC and DC power LEDs are both on, this is the normal condition. If the error has not been automatically fixed after two minutes, replace the system board.
4. Follow the action specified for the LED states noted in the list below.
5. If the error has not been automatically fixed after two minutes, contact support.

Error, AC, DC: Action
• ON, ON or OFF, ON or OFF: The power supply has a fault. Replace the power supply.
• OFF, OFF, OFF: There is no power detected. Ensure that the power cable is connected at the node and to a power source. If the AC LED does not light, check your power source. If you are connected to a 2145 UPS-1U that is showing an error, follow MAP UPS-1U. Otherwise, replace the power cable. If the AC LED still does not light, replace the power supply.
• OFF, OFF, ON: The power supply has a fault. Replace the power supply.
• OFF, ON, OFF: Ensure that the power supply is installed correctly. If the DC LED does not light, replace the power supply.

Reason 1: A power supply is not detected.
• Power supply (19%)
• System board (1%)
• Other: Power supply is not installed correctly (80%)

Reason 2: The power supply has failed.

• Power supply (90%)
• Power cable assembly (5%)
• System board (5%)

Reason 3: There is no input power to the power supply.
• Power cable assembly (25%)
• UPS-1U assembly (4%)
• System board (1%)
• Other: Power supply is not installed correctly (70%)

534 System board fault
Explanation: There is an unrecoverable error condition in a device on the system board.
For a storage enclosure, replace the canister and reuse the interface adapters and fans. For a control enclosure, refer to the additional details supplied with the error to determine the proper parts replacement sequence.
• Pwr rail A: Replace CPU 1. Replace the power supply if the OVER SPEC LED on the light path diagnostics panel is still lit.
• Pwr rail B: Replace CPU 2. Replace the power supply if the OVER SPEC LED on the light path diagnostics panel is still lit.
• Pwr rail C: Replace the following components until "Pwr rail C" is no longer reported:
  - DIMMs 1-6
  - PCI riser-card assembly 1
  - Fan 1
  - Optional adapters that are installed in PCI riser-card assembly 1
  Replace the power supply if the OVER SPEC LED on the light path diagnostics panel is still lit.
• Pwr rail D: Replace the following components until "Pwr rail D" is no longer reported:
  - DIMMs 7-12
  - Fan 2
  - Optional PCI adapter power cable
  Replace the power supply if the OVER SPEC LED on the light path diagnostics panel is still lit.
• Pwr rail E: Replace the following components until "Pwr rail E" is no longer reported:
  - DIMMs
  - Hard disk drives
  Replace the power supply if the OVER SPEC LED on the light path diagnostics panel is still lit.
• Pwr rail F: Replace the following components until "Pwr rail F" is no longer reported:
  - DIMMs
  - Fan 4
  - Optional adapters that are installed in PCI riser-card assembly 2
  - PCI riser-card assembly 2
  Replace the power supply if the OVER SPEC LED on the light path diagnostics panel is still lit.
• Pwr rail G: Replace the following components until "Pwr rail G" is no longer reported:
  - Hard disk drive backplane assembly
  - Hard disk drives
  - Fan 3
  - Optional PCI adapter power cable
• Pwr rail H: Replace the following components until "Pwr rail H" is no longer reported:
  - Optional adapters that are installed in PCI riser-card assembly 2
  - Optional PCI adapter power cable
Possible Cause FRUs or other:
• Hardware (100%)

535 Canister internal PCIe switch failed
Explanation: The PCI Express switch has failed or cannot be detected. In this situation, the only connectivity to the node canister is through the Ethernet ports.
Follow troubleshooting procedures to fix the hardware.

536 The temperature of a device on the system board is greater than or equal to the critical threshold.
Explanation: The temperature of a device on the system board is greater than or equal to the critical threshold.
Check for external and internal air flow blockages or damage.
1. Remove the top of the machine case and check for missing baffles, damaged heat sinks, or internal blockages.
2. If the error persists, replace the system board.
Possible Cause-FRUs or other:
• None

538 The temperature of a PCI riser card is greater than or equal to the critical threshold.
Explanation: The temperature of a PCI riser card is greater than or equal to the critical threshold.
Improve cooling.
1. If the problem persists, replace the PCI riser

• None
• None

541 Multiple, undetermined, hardware errors
Explanation: Multiple hardware failures were reported on the data paths within the node, and the threshold of the number of acceptable errors within a given time frame was reached. It was not possible to isolate the errors to a single component. After this node error is raised, all ports on the node are deactivated. The node is considered unstable, and has the potential to corrupt data.
1. Follow the procedure for collecting information for support, and contact your support organization.
2. A software update may resolve the issue.
3. Replace the node.

542 An installed CPU has failed or been removed.
Explanation: An installed CPU has failed or been removed.
Replace the CPU.
Possible Cause-FRUs or other:
• CPU (100%)

543 None of the node serial numbers that are stored in the three locations match.
Explanation: When the system software starts, it reads the node serial number from the system board and compares this serial number to the node serial numbers stored on the two boot drives. There must be at least two matching node serial numbers for the system software to assume that the node serial number is good.
Look at the boot drive view for the node to work out what to do.
1. Replace missing or failed drives.
2. Put any drive that belongs to a different node back where it belongs.
3. If you intend to use a drive from a different node in this node from now on, the node error changes to a different node error when the other drive is replaced.
4. If you replaced the system board, then the panel name is now , and if you replaced one of the drives, then the slot status of that drive is uninitialized. If the node serial number of the other boot drive matches the MT-M S/N label on the front of the node, then run satask rescuenode to initialize the uninitialized drive. Initializing the drive should lead to the 545 node error.

544 Boot drives are from other nodes.
Explanation: Boot drives are from other nodes.
Look at the boot drive view for the node to determine what to do.
1. Put any drive that belongs to a different node back where it belongs.
2.
If you intend to use a drive from a different node in this node from now on, the node error changes to a different node error when the other drive is replaced.
3. See error code 1035 for additional information regarding boot drive problems.
Possible Cause-FRUs or other:
• None

545 The node serial number on the boot drives match each other, but they do not match the product serial number on the system board.
Explanation: The node serial number on the boot drives match each other, but they do not match the product serial number on the system board.
Check the S/N value on the MT-M S/N label on the front of the node. Look at the boot drive view to see the node serial number of the system board and the node serial number of each drive.
1. Replace the boot drives with the correct boot drives if needed.
2. Set the system board serial number using the following command:
satask chvpd -type <value> -serial <S/N value from the MT-M S/N label>
Possible Cause-FRUs or other:
• None

547 Pluggable TPM is missing or broken.
Explanation: The Trusted Platform Module (TPM) for the system is not functioning.
Important: Confirm that the system is running on at least one other node before you commence this repair. Each node uses its TPM to securely store encryption keys on its boot drive. When the TPM or boot drive of a node is replaced, the node loses its encryption key, and must be able to join an existing system to obtain the keys. If this error occurred on the last node in

a system, do not replace the TPM, boot drive, or node hardware until the system contains at least one online node with valid keys.
1. Shut down the node and remove the node hardware.
2. Locate the TPM in the node hardware and ensure that it is correctly seated.
3. Reinsert the node hardware and apply power to the node.
4. If the error persists, replace the TPM with one from FRU stock.
5. If the error persists, replace the system board or the node hardware with one from FRU stock.
You do not need to return the faulty TPM to IBM.
Note: It is unlikely that the failure of a TPM can cause the loss of the System Master Key (SMK):
• The SMK is sealed by the TPM, using its unique encryption key, and the result is stored on the system boot drive.
• The working copy of the SMK is on the RAM disk, and so is unaffected by a sudden TPM failure.
• If the failure happens at boot time, the node is held in an unrecoverable error state because the TPM is a FRU.
• The SMK is also mirrored by the other nodes in the system. When the node with the replacement TPM joins the system, it determines that it does not have the SMK, requests it, gets it, and then seals it with the new TPM.

550 A clustered system cannot be formed because of a lack of clustered system resources.
Explanation: The node cannot become active because it is unable to connect to enough system resources. The system resources are the nodes in the system and the active quorum disk or drive. The node must be able to connect to most of the resources before that group forms an online system. This connection prevents the system from splitting into two or more active parts, with both parts independently performing I/O.
The error data lists the missing resources. This information includes a list of nodes and optionally a drive that is operating as the quorum drive or a LUN on an external storage system that is operating as the quorum disk.
If a drive in one of the system enclosures is the missing quorum disk, it is listed as enclosure:slot[part identification], where enclosure:slot is the location of the drive when the node shut down, enclosure is the seven-digit product serial number of the enclosure, and slot is a number. The part identification is the 22-character string that starts with "11S" found on a label on a drive. The part identification cannot be seen until the drive is removed from the enclosure.

If a LUN on an external storage system is the missing quorum disk, it is listed as WWWWWWWWWWWWWWWW/LL, where WWWWWWWWWWWWWWWW is a worldwide port name (WWPN) on the storage system that contains the missing quorum disk and LL is the Logical Unit Number (LUN).

If the system topology is stretched and the number of operational nodes is less than half, then node error 550 is displayed. In this case, the Site Disaster Recovery feature cannot be used as the number of operational nodes is less than the quorum required to create the system that uses the Site Disaster Recovery feature.

Follow troubleshooting procedures to correct connectivity issues between the nodes and the quorum devices.
1. Check for any node errors that indicate issues with Fibre Channel connectivity. Resolve any issues.
2. Ensure that the other nodes in the system are powered on and operational.
3. Check the Fibre Channel port status. If any port is not active, run the Fibre Channel port problem determination procedures.
4. Ensure that Fibre Channel network zoning changes have not restricted communication between nodes or between the nodes and the quorum disk.
5. Run the problem determination procedures for the network.
6. The quorum disk failed or cannot be accessed. Run the problem determination procedures for the disk controller.

551 A cluster cannot be formed because of a lack of cluster resources.

Explanation: The node does not have sufficient connectivity to other nodes or the quorum device to form a cluster. Attempt to repair the fabric or quorum device to establish connectivity.
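As a rough illustration of the rule behind node errors 550 and 551, the sketch below checks whether a node can reach "most of" the system resources (the nodes plus the active quorum device). This is a simplified model, not the product's algorithm: the function name, the resource names, and the reading of "most" as a strict majority are all assumptions made for the example.

```python
def can_form_system(all_resources, reachable):
    """Return True if the reachable resources are a strict majority.

    all_resources: every node plus the active quorum device (hypothetical names).
    reachable: the resources this node can currently connect to, itself included.
    Treating "most of the resources" as a strict majority is an assumption.
    """
    total = set(all_resources)
    seen = total & set(reachable)
    return 2 * len(seen) > len(total)


# Four nodes and one quorum device give five resources in total.
resources = ["node1", "node2", "node3", "node4", "quorum"]
print(can_form_system(resources, ["node1", "node2", "quorum"]))
print(can_form_system(resources, ["node1", "node2"]))
```

In this model, a group of two nodes that also reaches the quorum device holds three of five resources and can stay online, while two nodes without the quorum device cannot. That is why two halves of a split system cannot both remain active.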
If a disaster occurred and the nodes at the other site cannot be recovered, then it is possible to allow the nodes at the surviving site to form a system by using local storage.

Follow troubleshooting procedures to correct connectivity issues between the cluster nodes and the quorum devices.
1. Check for any node errors that indicate issues with Fibre Channel connectivity. Resolve any issues.
2. Ensure that the other nodes in the cluster are powered on and operational.
3. Using the SAT GUI or CLI (sainfo lsservicestatus), display the Fibre Channel port status. If any port is not active, perform the Fibre Channel port problem determination procedures.

4. Ensure that Fibre Channel network zoning changes have not restricted communication between nodes or between the nodes and the quorum disk.
5. Perform the problem determination procedures for the network.
6. The quorum disk failed or cannot be accessed. Perform the problem determination procedures for the disk controller.
7. As a last resort when the nodes at the other site cannot be recovered, then it is possible to allow the nodes at the surviving site to form a system by using local site storage:
   To avoid data corruption ensure that all host servers that were previously accessing the system have had all volumes unmounted or have been rebooted. Ensure that the nodes at the other site are not operational and are unable to form a system in the future.
   After starting this command, a full resynchronization of all mirrored volumes is completed when the other site is recovered. This is likely to take many hours or days to complete.
   Contact IBM support personnel if you are unsure.
   Note: Before continuing, confirm that you have taken the following actions - failure to perform these actions can lead to data corruption that is undetected by the system but affects host applications.
   a. All host servers that were previously accessing the system have had all volumes unmounted or have been rebooted.
   b. Ensure that the nodes at the other site are not operating as a system and actions have been taken to prevent them from forming a system in the future.
   After these actions have been taken, the satask overridequorum command can be used to allow the nodes at the surviving site to form a system that uses local storage.

555 Power Domain error

Explanation: Both 2145s in an I/O group are being powered by the same uninterruptible power supply. The ID of the other 2145 is displayed with the node error code on the front panel.

Ensure that the configuration is correct and that each 2145 in an I/O group is connected from a separate uninterruptible power supply.

556 A duplicate WWNN has been detected.

Explanation: The node has detected another device that has the same World Wide Node Name (WWNN) on the Fibre Channel network. A WWNN is 16 hexadecimal digits long. For the SAN Volume Controller system, the first 11 digits are C0 for DH8 and F0 for SV1. The last 5 digits of the WWNN are given in the additional data of the error. For more information, see "Service assistant interface."

The Fibre Channel ports of the node are disabled to prevent disruption of the Fibre Channel network. One or both nodes with the same WWNN can show the error. Because of the way WWNNs are allocated, a device with a duplicate WWNN is normally another SAN Volume Controller node.

Follow troubleshooting procedures to configure the WWNN of the node:
1. Find the cluster node with the same WWNN as the node reporting the error. The WWNN for a cluster node can be found from the node Vital Product Data (VPD) or from the node details shown by the service assistant. The node with the duplicate WWNN need not be part of the same cluster as the node reporting the error; it could be remote from the node reporting the error on a part of the fabric connected through an inter-switch link.
2. If a cluster node with a duplicate WWNN is found, determine whether it, or the node reporting the error, has the incorrect WWNN. Also consider how the SAN is zoned when making your decision.
3. Determine the correct WWNN for the node with the incorrect WWNN. If the correct WWNN cannot be determined, contact your support representative for assistance.
4. Use the service assistant to modify the incorrect WWNN. If it is the node showing the error that should be modified, this can safely be done immediately. If it is an active node that should be modified, use caution because the node will restart when the WWNN is changed. If this node is the only operational node in an I/O group, access to the volumes that it is managing will be lost. You should ensure that the host systems are in the correct state before you change the WWNN.
5. If the node showing the error had the correct WWNN, it can be restarted, using the front panel power control button, after the node with the duplicate WWNN is updated.
6. If you are unable to find a cluster node with the same WWNN as the node showing the error, use the SAN monitoring tools to determine whether there is another device on the SAN with the same WWNN. This device should not be using a WWNN assigned to a cluster, so you should follow the service procedures for the device to change its WWNN. Once the duplicate has been removed, restart the node.

558 The node is unable to communicate with other nodes.

Explanation: The system cannot see the Fibre Channel fabric or the Fibre Channel adapter port speed might

be set to a different speed than that of the Fibre Channel fabric.

Ensure that:
1. The Fibre Channel network fabric switch is powered-on.
2. At least one Fibre Channel cable connects the system to the Fibre Channel network fabric.
3. The Fibre Channel adapter port speed is equal to that of the Fibre Channel fabric.
4. At least one Fibre Channel adapter is installed in the system.
5. Go to the Fibre Channel MAP.

• None

560 Battery cabling fault

Explanation: A fault exists in one of the cables connecting the battery backplane to the rest of the system.

Follow troubleshooting procedures to fix the hardware:
1. Reseat the cable.
2. If reseating the cable does not fix the problem, replace the cable.
3. If replacing the cable does not fix the problem, replace the battery backplane.

561 Battery backplane or cabling fault

Explanation: Either the battery backplane has failed, or the power or LPC cables connecting the battery backplane to the rest of the system are not connected properly.

Follow troubleshooting procedures to fix the hardware:
1. Check the cables connecting the battery backplane.
2. Reseat the power and LPC cables.
3. If reseating the cables does not fix the problem, replace the cables.
4. Once the cables are well connected, but the problem persists, replace the battery backplane.
5. Conduct the corrective service procedure described in 1108.

The node's hardware configuration does not meet the minimum requirements

Explanation: The node hardware is not at the minimum specification for the node to become active in a cluster. This may be because of hardware failure, but is also possible after a service action has used an incorrect replacement part.

Follow troubleshooting procedures to fix the hardware:
1. View node VPD information, to see whether anything looks inconsistent. Compare the failing node VPD with the VPD of a working node of the same type. Pay particular attention to the number and type of CPUs and memory.
2. Replace any incorrect parts.

564 Too many machine code crashes have occurred.

Explanation: The node has been determined to be unstable because of multiple resets.
The cause of the resets can be that the system encountered an unexpected state or has executed instructions that were not valid. The node has entered the service state so that diagnostic data can be recovered. The node error does not persist across restarts of the machine code on the node.

Follow troubleshooting procedures to reload the machine code:
1. Get a support package (snap), including dumps, from the node, using the management GUI or the service assistant.
2. If more than one node is reporting this error, contact IBM technical support for assistance. The support package from each node will be required.
3. Check the support site to see whether the issue is known and whether a machine code update exists to resolve the issue. Update the cluster machine code if a resolution is available. Use the manual update process on the node that reported the error first.
4. If the problem remains unresolved, contact IBM technical support and send them the support package.

Possible Cause FRUs or other:
• None

565 The internal drive of the node is failing.

Explanation: The internal drive within the node is reporting too many errors. It is no longer safe to rely on the integrity of the drive. Replacement is recommended.

Follow troubleshooting procedures to fix the hardware:
1. View hardware information.
2. Replace parts (canister or disk).
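The VPD comparison suggested earlier for the node error about hardware that does not meet the minimum requirements (compare the failing node VPD with that of a working node of the same type) can be sketched as a simple field-by-field diff. This is only an illustration. The field names (cpu_count, cpu_type, memory_gb) are hypothetical and are not the actual VPD field names, and the values would have to be copied by hand from the VPD views.

```python
def diff_vpd(failing, working, fields=None):
    """Return {field: (failing_value, working_value)} for fields that differ.

    failing, working: dicts of VPD values transcribed by the user.
    fields: optional list of field names to compare; defaults to all fields
            present in either dict. All field names here are hypothetical.
    """
    if fields is None:
        fields = sorted(set(failing) | set(working))
    return {
        name: (failing.get(name), working.get(name))
        for name in fields
        if failing.get(name) != working.get(name)
    }


# Hypothetical values for a failing node and a working node of the same type.
failing_node = {"cpu_count": 1, "cpu_type": "X", "memory_gb": 32}
working_node = {"cpu_count": 2, "cpu_type": "X", "memory_gb": 64}
for name, (bad, good) in diff_vpd(failing_node, working_node).items():
    print(f"{name}: failing={bad} working={good}")
```

Any field reported by the diff is a candidate for an incorrect or failed part. Fields that are expected to differ between nodes, such as serial numbers, can be left out by passing an explicit fields list.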

At boot time: the CPU reached a temperature that is greater than or equal to the warning threshold. During normal running: the CPU reached a temperature that is greater than or equal to the critical threshold.

Explanation: At boot time: the CPU reached a temperature that is greater than or equal to the warning threshold. During normal running: the CPU reached a temperature that is greater than or equal to the critical threshold.

Check for external and internal air flow blockages or damage.
1. Remove the top of the machine case and check for missing baffles, damaged heat sinks, or internal blockages.
2. If the problem persists, replace the CPU/heat sink.

• CPU
• Heat sink

570 Battery protection unavailable

Explanation: The node cannot start because battery protection is not available. Both batteries require user intervention before they can become available.

Follow troubleshooting procedures to fix hardware.

The appropriate service action will be indicated by an accompanying non-fatal node error. Examine the event log to determine the accompanying node error.

571 Battery protection temporarily unavailable; one battery is expected to be available soon

Explanation: The node cannot start because battery protection is not available. One battery is expected to become available shortly with no user intervention required, but the other battery will not become available.

Follow troubleshooting procedures to fix hardware.

The appropriate service action will be indicated by an accompanying non-fatal node error. Examine the event log to determine the accompanying node error.

572 Battery protection temporarily unavailable; both batteries are expected to be available soon

Explanation: The node cannot start because battery protection is not available. Both batteries are expected to become available shortly with no user intervention required.

Wait for sufficient battery charge for enclosure to start.

573 The node machine code is inconsistent.

Explanation: Parts of the node machine code package are receiving unexpected results; there may be an inconsistent set of subpackages installed, or one subpackage may be damaged.
Follow troubleshooting procedures to reload the machine code.
1. Follow the procedure to run a node rescue.
2. If the error occurs again, contact IBM technical support.

Possible Cause FRUs or other:
• None

574 The node machine code is damaged.

Explanation: A checksum failure has indicated that the node machine code is damaged and needs to be reinstalled.

1. If the other nodes are operational, run node rescue; otherwise, install new machine code using the service assistant. Node rescue failures, as well as the repeated return of this node error after reinstallation, are symptomatic of a hardware fault with the node.

Possible Cause FRUs or other:
• None

576 The cluster state and configuration data cannot be read.

Explanation: The node was unable to read the saved cluster state and configuration data from its internal drive because of a read or medium error.

Exchange the FRUs for new FRUs, one at a time.

578 The state data was not saved following a power loss.

Explanation: On startup, the node was unable to read its state data. When this happens, it expects to be automatically added back into a clustered system. However, if it is not joined to a clustered system in 60 sec, it raises this node error. This error is a critical node error, and user action is required before the node can become a candidate to join a clustered system.

Follow troubleshooting procedures to correct connectivity issues between the clustered system nodes and the quorum devices.

1. Manual intervention is required once the node reports this error.
2. Attempt to reestablish the clustered system by using other nodes. This step might involve fixing hardware issues on other nodes or fixing connectivity issues between nodes.
3. If you are able to reestablish the clustered system, remove the system data from the node that shows error 578 so it goes to a candidate state. It is then automatically added back to the clustered system.
   a. To remove the system data from the node, go to the service assistant, select the radio button for the node with a 578, click Manage System, and then choose Remove System Data.
   b. Or use the CLI command satask leavecluster -force.
   If the node does not automatically add back to the clustered system, note the name and I/O group of the node, and then delete the node from the clustered system configuration (if this has not already happened). Add the node back to the clustered system using the same name and I/O group.
4. If all nodes have either node error 578 or 550, follow the recommended user response for node error
Attempt to determine what caused the nodes to shut down.

Possible Cause FRUs or other:
• None

579 Battery subsystem has insufficient charge to save system data

Explanation: Not enough capacity is available from the battery subsystem to save system data in response to a series of battery and boot-drive faults.

Follow troubleshooting procedures to fix hardware.

The appropriate service actions are indicated by the series of battery and boot-drive faults. Examine the event log to determine the accompanying faults. Service the other faults.

588 The 2145 UPS-1U is not cabled correctly.

Explanation: The signal cable or the 2145 power cables are probably not connected correctly. The power cable and signal cable might be connected to different 2145 UPS-1U assemblies.

1. Connect the cables correctly.
2. Restart the node.

• None.
Other:
• Cabling error (100%)

590 Repetitive node transitions into standby mode from normal mode because of power subsystem-related node errors.
Explanation: Multiple node restarts occurred because of 2145 UPS-1U errors, which can be reported on any node type. This error means that the node made the transition into standby from normal mode because of power subsystem-related node errors too many times within a short period. Too many times are defined as three, and a short period is defined as 1 hour. This error alerts the user that something might be wrong with the power subsystem as it is clearly not normal for the node to repeatedly go in and out of standby.

If the actions of the tester or engineer are expected to cause many frequent transitions from normal to standby and back, then this error does not imply that there is any actual fault with the system.

Follow troubleshooting procedures to fix the hardware:
1. Verify that the room temperature is within specified limits and that the input power is stable.
2. If a 2145 UPS-1U is connected, verify that the 2145 UPS-1U signal cable is fastened securely at both ends.
3. Look in the system event log for the node error that is repeating.

Note: The condition is reset by powering off the node from the node front panel.

650 The canister battery is not supported

Explanation: The canister battery shows product data that indicates it cannot be used with the code version of the canister.

This is resolved by either obtaining a battery which is supported by the system's code level, or updating the canister's code level to a level which supports the battery.
1. Remove the canister and its lid and check that the FRU part number of the new battery matches that of the replaced battery. Obtain the correct FRU part if it does not.
2. If the canister has just been replaced, check the code level of the partner node canister and use the service assistant to update this canister's code level to the same level.

Possible cause FRUs or other cause
• canister battery
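The threshold described for node error 590 (three transitions into standby within 1 hour) behaves like a sliding-window counter. The sketch below is one way to model it when reviewing event log timestamps by hand. The class name and the boundary handling are assumptions for illustration, not a description of how the node software counts transitions.

```python
from collections import deque


class StandbyTransitionMonitor:
    """Track standby transitions and flag when too many occur in a short period.

    Defaults follow the text: three transitions within one hour.
    Timestamps are seconds and must be supplied in non-decreasing order.
    """

    def __init__(self, limit=3, window_seconds=3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self._times = deque()

    def record(self, timestamp):
        """Record one transition; return True if the limit is reached."""
        self._times.append(timestamp)
        # Drop transitions that are a full window old or older (assumed boundary).
        while timestamp - self._times[0] >= self.window_seconds:
            self._times.popleft()
        return len(self._times) >= self.limit


monitor = StandbyTransitionMonitor()
for t in (0, 1000, 2000):
    print(t, monitor.record(t))
```

With these inputs the third transition, 2000 seconds after the first, reaches the limit. Transitions spread more than an hour apart never accumulate to three inside the window.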

The canister battery is missing

Explanation: The canister battery cannot be detected.

1. Use the remove and replace procedures to remove the node canister and its lid.
2. Use the remove and replace procedures to install a battery.
3. If a battery is present, ensure that it is fully inserted. Replace the canister.
4. If this error persists, use the remove and replace procedures to replace the battery.

Possible cause FRUs or other cause
• Canister battery

652 The canister battery has failed

Explanation: The canister battery has failed. The battery may be showing an error state, it may have reached the end of life, or it may have failed to charge.

Data
Number indicators with failure reasons
• 1 battery reports a failure
• 2 end of life
• 3 failure to charge

1. Use the remove and replace procedures to replace the battery.

Possible cause FRUs or other cause
• canister battery

653 The canister battery's temperature is too low

Explanation: The canister battery's temperature is below its minimum operating temperature.

• Wait for the battery to warm up; the error will clear when its minimum working temperature is reached.
• If the error persists for more than an hour when the ambient temperature is normal, use the remove and replace procedures to replace the battery.

Possible cause FRUs or other cause
• canister battery

654 The canister battery's temperature is too high

Explanation: The canister battery's temperature is above its safe operating temperature.

• If necessary, reduce the ambient temperature.
• Wait for the battery to cool down; the error will clear when normal working temperature is reached. Keep checking the reported error as the system may determine the battery has failed.
• If the node error persists for more than two hours after the ambient temperature returns to the normal operating range, use the remove and replace procedures to replace the battery.

Possible cause FRUs or other cause
• canister battery

655 Canister battery communications fault.

Explanation: The canister cannot communicate with the battery.

• Use the remove and replace procedures to replace the battery.
• If the node error persists, use the remove and replace procedures to replace the node canister.

Possible Cause-FRUs or other cause:
• Canister battery
• Node canister

656 The canister battery has insufficient charge

Explanation: The canister battery has insufficient charge to save the canister's state and cache data to the internal drive if power were to fail.

• Wait for the battery to charge; the battery does not need to be fully charged for the error to automatically clear.

Possible cause FRUs or other cause
• none

657 Not enough battery charge to support graceful shutdown of the storage enclosure.

Explanation: Insufficient power available for the enclosure.

If a battery is missing, failed or having a communication error, replace the battery.
If a battery is failed, replace the battery.
If a battery is charging, this error should go away when the battery is charged.
If a battery is too hot, the system can be started after it has cooled.
If running on a single power supply with low input power (110 V AC), "low voltage" will be seen in the

extra data. If this is the case, the failed or missing power supply should be replaced. This will only happen if a single power supply is running with input power that is too low.

668 The remote setting is not available for users for the current system.

Explanation: On the current systems, users cannot be set to remote.

Any user defined on the system must be a local user. To create a remote user the user must not be defined on the local system.

670 The UPS battery charge is not enough to allow the node to start.

Explanation: The uninterruptible power supply connected to the node does not have sufficient battery charge for the node to safely become active in a cluster. The node will not start until a sufficient charge exists to store the state and configuration data held in the node memory if power were to fail. The front panel of the node will show "charging".

Wait for sufficient battery charge for enclosure to start:
1. Wait for the node to automatically fix the error when there is sufficient charge.
2. Ensure that no error conditions are indicated on the uninterruptible power supply.

671 The available battery charge is not enough to allow the node canister to start. Two batteries are charging.

Explanation: The battery charge within the enclosure is not sufficient for the node to safely become active in a cluster. The node will not start until sufficient charge exists to store the state and configuration data held in the node canister memory if power were to fail. Two batteries are within the enclosure, one in each of the power supplies. Neither of the batteries indicates an error; both are charging.

The node will start automatically when sufficient charge is available. The batteries do not have to be fully charged before the nodes can become active.

Both nodes within the enclosure share the battery charge, so both node canisters report this error. The service assistant shows the estimated start time in the node canister hardware details.

Wait for the node to automatically fix the error when sufficient charge becomes available.
672 The available battery charge is not enough to allow the node canister to start. One battery is charging.

Explanation: The battery charge within the enclosure is not sufficient for the node to safely become active in a cluster. The node will not start until sufficient charge exists to store the state and configuration data held in the node canister memory if power were to fail. Two batteries are within the enclosure, one in each of the power supplies. Only one of the batteries is charging, so the time to reach sufficient charge will be extended.

The node will start automatically when sufficient charge is available. The batteries do not have to be fully charged before the nodes can become active.

Both nodes within the enclosure share the battery charge, so both node canisters report this error. The service assistant shows the estimated start time, and the battery status, in the node canister hardware details.

• None

1. Wait for the node to automatically fix the error when sufficient charge becomes available.
2. If possible, determine why one battery is not charging. Use the battery status shown in the node canister hardware details and the indicator LEDs on the PSUs in the enclosure to diagnose the problem. If the issue cannot be resolved, wait until the cluster is operational and use the troubleshooting options in the management GUI to assist in resolving the issue.

• Battery (33%)
• Control power supply (33%)
• Power cord (33%)

673 The available battery charge is not enough to allow the node canister to start. No batteries are charging.

Explanation: A node cannot be in active state if it does not have sufficient battery power to store configuration and cache data from memory to internal disk after a power failure. The system has determined that both batteries have failed or are missing. The problem with the batteries must be resolved to allow the system to start.

Follow troubleshooting procedures to fix hardware:
1. Resolve problems in both batteries by following the procedure to determine status using the LEDs.

2. If the LEDs do not show a fault on the power supplies or batteries, power off both power supplies in the enclosure and remove the power cords. Wait 20 seconds, then replace the power cords and restore power to both power supplies. If both node canisters continue to report this error, replace the enclosure chassis.

• Battery (33%)
• Power supply (33%)
• Power cord (33%)
• Enclosure chassis (1%)

674 The cycling mode of a Metro Mirror object cannot be changed.

Explanation: The cycling mode may only be set for Global Mirror objects. Metro Mirror objects cannot have a cycling mode defined.

The object's type must be set to 'global' before or when setting the cycling mode.

690 The node is held in the service state.

Explanation: The node is in service state and has been instructed to remain in service state. While in service state, the node will not run as part of a cluster. A node must not be in service state for longer than necessary while the cluster is online because a loss of redundancy will result. A node can be set to remain in service state either because of a service assistant user action or because the node was deleted from the cluster.

When it is no longer necessary to hold the node in the service state, exit the service state to allow the node to run:
1. Use the service assistant action to release the service state.

Possible Cause FRUs or other:
• none

700 The Fibre Channel adapter that was previously present has not been detected.

Explanation: A Fibre Channel adapter that was previously present has not been detected. The adapter might not be correctly installed, or it might have failed.

This node error does not, in itself, stop the node canister from becoming active in the system; however, the Fibre Channel network might be being used to communicate between the node canisters in a clustered system. It is possible that this node error indicates why the critical node error 550 A cluster cannot be formed because of a lack of cluster resources is reported on the node canister.

Data:
• Location: A number indicating the adapter location.
The location indicates an adapter slot; see the node canister description for the definition of the adapter slot locations.

1. If possible, this noncritical node error should be serviced using the management GUI and running the recommended actions for the service error code.
2. There are a number of possibilities.
   a. If you have deliberately removed the adapter (possibly replacing it with a different adapter type), you will need to follow the management GUI recommended actions to mark the hardware change as intentional.
   b. If the previous steps have not isolated the problem, use the remove and replace procedures to replace the adapter; if this does not fix the problem, replace the system board.

Possible Cause FRUs or other cause:
• Fibre Channel adapter
• System board

701 A Fibre Channel adapter has failed.

Explanation: A Fibre Channel adapter has failed.

This node error does not, in itself, stop the node becoming active in the system. However, the Fibre Channel network might be being used to communicate between the nodes in a clustered system. Therefore, it is possible that this node error indicates the reason why the critical node error 550 A cluster cannot be formed because of a lack of cluster resources is reported on the node.

Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations.

1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.

Possible Cause-FRUs or other cause:
• Fibre Channel adapter
• System board

A Fibre Channel adapter has a PCI error.

Explanation: A Fibre Channel adapter has a PCI error.

This node error does not, in itself, stop the node from becoming active in the system. However, the Fibre Channel network might be being used to communicate between the nodes in a clustered system. Therefore, it is possible that this node error indicates the reason why the critical node error 550 A cluster cannot be formed because of a lack of cluster resources is reported on the node.

Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations.

1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.

Possible Cause-FRUs or other cause:
• Fibre Channel adapter
• System board

703 A Fibre Channel adapter is degraded.

Explanation: A Fibre Channel adapter is degraded.

This node error does not, in itself, stop the node becoming active in the system. However, the Fibre Channel network might be being used to communicate between the nodes in a clustered system. Therefore, it is possible that this node error indicates the reason why the critical node error 550 A cluster cannot be formed because of a lack of cluster resources is reported on the node.

Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations.

1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.

Possible Cause FRUs or other cause:
• Fibre Channel adapter
• System board

704 Fewer Fibre Channel ports operational.

Explanation: A Fibre Channel port that was previously operational is no longer operational. The physical link is down.
This node error does not, in itself, stop the node becoming active in the system. However, the Fibre Channel network might be being used to communicate between the nodes in a clustered system. Therefore, it is possible that this node error indicates the reason why the critical node error 550 A cluster cannot be formed because of a lack of cluster resources is reported on the node.

Data:
Three numeric values are listed:
• The ID of the first unexpected inactive port. This ID is a decimal number.
• The ports that are expected to be active, which is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is expected to be active.
• The ports that are actually active, which is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is active.

1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Possibilities:
   • If the port has been intentionally disconnected, use the management GUI recommended action for the service error code and acknowledge the intended change.
   • Check that the Fibre Channel cable is connected at both ends and is not damaged. If necessary, replace the cable.
   • Check that the switch port or other device that the cable is connected to is powered and enabled in a compatible mode. Rectify any issue. The device service interface might indicate the issue.
   • Use the remove and replace procedures to replace the SFP transceiver in the 2145 node and the SFP transceiver in the connected switch or device.
   • Use the remove and replace procedures to replace the adapter.

Possible Cause-FRUs or other cause:
• Fibre Channel cable
• SFP transceiver
• Fibre Channel adapter
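The two hexadecimal values in the error data above are bit masks in which the least significant bit stands for port 1. A minimal sketch for decoding them by hand is shown below. The function names are made up for this example, and the sample mask values are illustrative, not taken from a real error log.

```python
def ports_from_mask(mask_hex):
    """Return the port numbers whose bit is 1; bit 0 (least significant) is port 1."""
    mask = int(mask_hex, 16)
    ports = []
    port = 1
    while mask:
        if mask & 1:
            ports.append(port)
        mask >>= 1
        port += 1
    return ports


def unexpected_inactive_ports(expected_hex, actual_hex):
    """Return ports that are expected to be active but are not active."""
    expected = int(expected_hex, 16)
    actual = int(actual_hex, 16)
    missing = expected & ~actual
    return ports_from_mask(format(missing, "x"))


# Illustrative values: ports 1-4 expected (0x0f), ports 1, 2, and 4 active (0x0b).
print(ports_from_mask("0f"))
print(unexpected_inactive_ports("0f", "0b"))
```

The lowest-numbered port returned by unexpected_inactive_ports should correspond to the first value in the error data, the ID of the first unexpected inactive port, which gives a quick consistency check on the decoding.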

Fewer Fibre Channel I/O ports operational.

Explanation: One or more Fibre Channel I/O ports that have previously been active are now inactive. This situation has continued for one minute.

A Fibre Channel I/O port might be established on either a Fibre Channel platform port or an Ethernet platform port using FCoE. This error is expected if the associated Fibre Channel or Ethernet port is not operational.

Data:
Three numeric values are listed:
• The ID of the first unexpected inactive port. This ID is a decimal number.
• The ports that are expected to be active, which is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is expected to be active.
• The ports that are actually active, which is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is active.

1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Follow the procedure for mapping I/O ports to platform ports to determine which platform port is providing this I/O port.
3. Check for any 704 (Fibre Channel platform port not operational) or 724 (Ethernet platform port not operational) node errors reported for the platform port.
4. Possibilities:
   • If the port has been intentionally disconnected, use the management GUI recommended action for the service error code and acknowledge the intended change.
   • Resolve the 704 or 724 error.
   • If this is an FCoE connection, use the information the view gives about the Fibre Channel forwarder (FCF) to troubleshoot the connection between the port and the FCF.

Possible Cause-FRUs or other cause:
• None

706 Fibre Channel clustered system path failure.

Explanation: One or more Fibre Channel (FC) input/output (I/O) ports that have previously been able to see all required online nodes can no longer see them. This situation has continued for 5 minutes. This error is not reported unless a node is active in a clustered system.
A Fibre Channel I/O port might be established on either an FC platform port or an Ethernet platform port using Fibre Channel over Ethernet (FCoE).
Data: Three numeric values are listed:
• The ID of the first FC I/O port that does not have connectivity. This is a decimal number.
• The ports that are expected to have connections. This is a hexadecimal number, and each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is expected to have a connection to all online nodes.
• The ports that actually have connections. This is a hexadecimal number, and each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port has a connection to all online nodes.
1. If possible, this noncritical node error should be serviced using the management GUI and running the recommended actions for the service error code.
2. Follow the procedure for mapping I/O ports to platform ports to determine which platform port does not have connectivity.
3. There are a number of possibilities:
   • If the port's connectivity has been intentionally reconfigured, use the management GUI recommended action for the service error code and acknowledge the intended change. You must have at least two I/O ports with connections to all other nodes.
   • Resolve other node errors relating to this platform port or I/O port.
   • Check that the SAN zoning is correct.
Possible Cause-FRUs or other cause:
• None

710 The SAS adapter that was previously present has not been detected.
Explanation: A SAS adapter that was previously present has not been detected. The adapter might not be correctly installed or it might have failed.
Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations.

1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Possibilities:
   • If the adapter has been intentionally removed, use the management GUI recommended actions for the service error code to acknowledge the change.
   • Use the remove and replace procedures to remove and open the node and check that the adapter is fully installed.
   • If the previous steps have not isolated the problem, use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.
Possible Cause-FRUs or other cause:
• High-speed SAS adapter
• System board

711 A SAS adapter has failed.
Explanation: A SAS adapter has failed.
Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.
Possible Cause-FRUs or other cause:
• High-speed SAS adapter
• System board

712 A SAS adapter has a PCI error.
Explanation: A SAS adapter has a PCI error.
Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Replace the adapter using the remove and replace procedures. If this does not fix the problem, replace the system board.
Possible Cause-FRUs or other cause:
• SAS adapter
• System board

713 A SAS adapter is degraded.
Explanation: A SAS adapter is degraded.
Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Use the remove and replace procedures to replace the adapter.
If this does not fix the problem, replace the system board.
Possible Cause-FRUs or other cause:
• High-speed SAS adapter
• System board

715 Fewer SAS host ports operational
Explanation: A SAS port that was previously operational is no longer operational. The physical link is down.
Data: Three numeric values are listed:
• The ID of the first unexpected inactive port. This ID is a decimal number.
• The ports that are expected to be active, which is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is expected to be active.
• The ports that are actually active, which is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is active.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Possibilities:
   • If the port has been intentionally disconnected, use the management GUI recommended action for the service error code and acknowledge the intended change.
   • Check that the SAS cable is connected at both ends and is not damaged. If necessary, replace the cable.

   • Check that the switch port or other device that the cable is connected to is powered and enabled in a compatible mode. Rectify any issue. The device service interface might indicate the issue.
   • Use the remove and replace procedures to replace the adapter.
Possible Cause-FRUs or other cause:
• SAS cable
• SAS adapter

720 Ethernet adapter that was previously present has not been detected.
Explanation: An Ethernet adapter that was previously present has not been detected. The adapter might not be correctly installed or it might have failed.
Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations. If the location is 0, the adapter is integrated into the system board or directly connected to it, that is, not in a PCI Express expansion slot.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. If the adapter location is 0, use the remove and replace procedures to replace the Ethernet edge board, if there is one, or the system board.
3. If the location is not 0, there are a number of possibilities:
   a. Use the remove and replace procedures to remove and open the node and check that the adapter is fully installed.
   b. If the previous steps have not located and isolated the problem, use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.
Possible Cause-FRUs or other cause:
• Ethernet adapter
• System board

721 An Ethernet adapter has failed.
Explanation: An Ethernet adapter failed.
Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations. If the location is 0, the adapter integrated into the system board is being reported.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. If the adapter location is 0, use the remove and replace procedures to replace the system board.
3. If the adapter location is not 0, use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.
Possible Cause-FRUs or other cause:
• Ethernet adapter
• System board

722 An Ethernet adapter has a PCI error.
Explanation: An Ethernet adapter has a PCI error.
Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations. If the location is 0, the adapter integrated into the system board is being reported.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. If the adapter location is 0, use the remove and replace procedures to replace the system board.
3. If the adapter location is not 0, use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.
Possible Cause-FRUs or other cause:
• Ethernet adapter
• System board

723 An Ethernet adapter is degraded.
Explanation: An Ethernet adapter is degraded.
Data:
• A number indicating the adapter location. The location indicates an adapter slot. See the node description for the definition of the adapter slot locations. If the location is 0, the adapter integrated into the system board is being reported.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. If the adapter location is 0, use the remove and replace procedures to replace the system board.

3. If the adapter location is not 0, use the remove and replace procedures to replace the adapter. If this does not fix the problem, replace the system board.
Possible Cause-FRUs or other cause:
• Ethernet adapter
• System board

724 Fewer Ethernet ports active.
Explanation: An Ethernet port that was previously operational is no longer operational. The physical link is down.
Data: Three numeric values are listed:
• The ID of the first unexpected inactive port. This is a decimal number.
• The ports that are expected to be active. This is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is expected to be active.
• The ports that are actually active. This is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is active.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Possibilities:
   a. If the port has been intentionally disconnected, use the management GUI recommended action for the service error code and acknowledge the intended change.
   b. Make sure the Ethernet cable is connected at both ends and is undamaged. If necessary, replace the cable.
   c. Check that the switch port, or other device the cable is connected to, is powered and enabled in a compatible mode. Rectify any issue. The device service interface might indicate the issue.
   d. If this is a 1 Gbps port, use the remove and replace procedures to replace the SFP transceiver in the system and the SFP transceiver in the connected switch or device.
   e. Replace the adapter or the system board (depending on the port location) by using the remove and replace procedures.
Possible Cause-FRUs or other cause:
• Ethernet cable
• Ethernet SFP transceiver
• Ethernet adapter
• System board

730 The bus adapter has not been detected.
Explanation: The bus adapter that connects the canister to the enclosure midplane has not been detected. This node error does not, in itself, stop the node canister becoming active in the system.
However, the bus might be used to communicate between the node canisters in a clustered system. Therefore, it is possible that this node error indicates the reason why the critical node error 550 "A cluster cannot be formed because of a lack of cluster resources" is reported on the node canister.
Data:
• A number indicating the adapter location. Location 0 indicates that the adapter integrated into the system board is being reported.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. As the adapter is located on the system board, replace the node canister using the remove and replace procedures.
Possible Cause-FRUs or other cause:
• Node canister

731 The bus adapter has failed.
Explanation: The bus adapter that connects the canister to the enclosure midplane has failed. This node error does not, in itself, stop the node canister becoming active in the system. However, the bus might be used to communicate between the node canisters in a clustered system. Therefore, it is possible that this node error indicates the reason why the critical node error 550 "A cluster cannot be formed because of a lack of cluster resources" is reported on the node canister.
Data:
• A number indicating the adapter location. Location 0 indicates that the adapter integrated into the system board is being reported.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. As the adapter is located on the system board, replace the node canister using the remove and replace procedures.
Possible Cause-FRUs or other cause:
• Node canister

The bus adapter has a PCI error.
Explanation: The bus adapter that connects the canister to the enclosure midplane has a PCI error. This node error does not, in itself, stop the node canister becoming active in the system. However, the bus might be used to communicate between the node canisters in a clustered system. Therefore, it is possible that this node error indicates the reason why the critical node error 550 "A cluster cannot be formed because of a lack of cluster resources" is reported on the node canister.
Data:
• A number indicating the adapter location. Location 0 indicates that the adapter integrated into the system board is being reported.
1. If possible, this noncritical node error should be serviced using the management GUI and running the recommended actions for the service error code.
2. As the adapter is located on the system board, replace the node canister using the remove and replace procedures.
Possible Cause-FRUs or other cause:
• Node canister

733 The bus adapter degraded.
Explanation: The bus adapter that connects the canister to the enclosure midplane is degraded. This node error does not, in itself, stop the node canister from becoming active in the system. However, the bus might be used to communicate between the node canisters in a clustered system. Therefore, it is possible that this node error indicates the reason why the critical node error 550 "A cluster cannot be formed because of a lack of cluster resources" is reported on the node canister.
Data:
• A number indicating the adapter location. Location 0 indicates that the adapter integrated into the system board is being reported.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. As the adapter is located on the system board, replace the node canister using the remove and replace procedures.
Possible Cause-FRUs or other cause:
• Node canister

734 Fewer bus ports.
Explanation: One or more PCI bus ports that have previously been active are now inactive. This condition has existed for over one minute.
That is, the internode link has been down at the protocol level. This could be a link issue but is more likely caused by the partner node unexpectedly failing to respond.
Data: Three numeric values are listed:
• The ID of the first unexpected inactive port. This is a decimal number.
• The ports that are expected to be active. This is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is expected to be active.
• The ports that are actually active. This is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is active.
1. If possible, this noncritical node error should be serviced using the management GUI and running the recommended actions for the service error code.
2. Follow the procedure for getting node canister and clustered-system information and determine the state of the partner node canister in the enclosure. Fix any errors reported on the partner node canister.
3. Use the remove and replace procedures to replace the enclosure.
Possible Cause-FRUs or other cause:
• Node canister
• Enclosure midplane

736 The temperature of a device on the system board is greater than or equal to the warning threshold.
Explanation: The temperature of a device on the system board is greater than or equal to the warning threshold.
Check for external and internal air flow blockages or damage.
1. Remove the top of the machine case and check for missing baffles, damaged heat sinks, or internal blockages.
2. If the problem persists, replace the system board.
• System board

The temperature of a power supply is greater than or equal to the warning or critical threshold.
Explanation: The temperature of a power supply is greater than or equal to the warning or critical threshold.
Check for external and internal air flow blockages or damage.
1. Remove the top of the machine case and check for missing baffles, damaged heat sinks, or internal blockages.
2. If the problem persists, replace the power supply.
• Power supply

738 The temperature of a PCI riser card is greater than or equal to the warning threshold.
Explanation: The temperature of a PCI riser card is greater than or equal to the warning threshold.
Check for external and internal air flow blockages or damage.
1. Remove the top of the machine case and check for missing PCI riser card 2, missing baffles, or internal blockages.
2. Check all of the PCI cards plugged into the riser that is identified by the extra data to find if any are faulty, and replace as necessary.
3. If the problem persists, replace the PCI riser.
• PCI riser

740 The command failed because of a wiring error described in the event log.
Explanation: It is dangerous to exclude a SAS port while the topology is invalid, so the user is prevented from attempting it to avoid any potential loss of data access.
Correct the topology, then retry the command.

741 CPU missing
Explanation: A CPU that was previously present has not been detected. The CPU might not be correctly installed or it might have failed.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Select one of the following actions:
   • If removing the CPU was deliberate, follow the management GUI recommended actions to mark the hardware change as intentional.
   • If it is not possible to isolate the problem, use the remove and replace procedures to replace the CPU.
   • Replace the system board.

743 A boot drive is offline, missing, out of sync, or the persistent data is not usable.
Explanation: A boot drive is offline, missing, out of sync, or the persistent data is not usable.
Look at a boot drive view to determine the problem.
1. If slot status is out of sync, then re-sync the boot drives by running the command satask chbootdrive.
2. If slot status is missing, then put the original drive back in this slot or install a FRU drive.
3. If slot status is failed, then replace the drive.
• Boot drive

744 A boot drive is in the wrong location.
Explanation: A boot drive is in the wrong slot or comes from another node.
Look at a boot drive view to determine the problem.
1. Replace the boot drive with the correct drive and put this drive back in the node that it came from.
2. Sync the boot drive if you choose to use it in this node.
• None

745 A boot drive is in an unsupported slot.
Explanation: A boot drive is in an unsupported slot. This means that at least one of the first two drives is online and at least one invalid slot (3-8) is occupied.
Look at a boot drive view to determine which invalid slots are occupied and remove the drives.
• None

746 Technician port connection invalid.
Explanation: The code has detected more than one MAC address through the connection, or the DHCP has given out more than one address. The code thus believes there is a switch attached.

Plug a cable from the technician port to a switch, and plug 2 or more machines into that switch. They must have IP addresses in the range
Request a DHCP lease to trigger the detection.

747 The Technician port is being used.
Explanation: The Technician port is active and being used.
No service action is required. Use the workstation to configure the node.

748 The technician port is enabled.
Explanation: The technician port is enabled initially for easy configuration, and then disabled, so that the port can be used for iSCSI connection. When all connectivity to the node fails, the technician port can be reenabled for emergency use but must not remain enabled. This event is to remind you to disable the technician port. While the technician port is enabled, do not connect it to the LAN/SAN.
Complete the following step to resolve this problem:
1. Turn off the technician port by using the following CLI command: satask chserviceip -techport disable
• N/A

750 Compression accelerator missing
Explanation: A compression adapter that was previously present was not detected.
1. Use the svcinfo lsnodehw command to review the hardware on the node indicated by this event.
2. If all missing and changed hardware is as expected, use the chnodehw command to accept the current node hardware configuration.
3. Otherwise, complete each of the following steps in turn until the event automatically marks as fixed:
   a. Shut down the node. Ensure the correct hardware is installed in its correct location. Reseat any hardware that is indicated as missing. Bring the node back online. Go back to step 1.
   b. Shut down the node. Replace any hardware that is indicated as missing. Bring the node back online. Go back to step 1.
   c. Shut down the node. Replace the system board or canister. Bring the node back online. Go back to step 1.

Compression accelerator failed
Explanation: A compression adapter has failed.
1. Shut down the node.
2. Replace the adapter in the slot indicated by the event log with a new adapter of the same type. Note: For the Storwize V7000 Gen2, the two compression cards share the same location.
3. Bring the node back online.
4. If the error does not auto-fix, shut down the node and replace the system board or canister, then bring the node back online.

766 CMOS battery failure.
Explanation: CMOS battery failure.
Replace the CMOS battery.
• CMOS battery

768 Ambient temperature warning.
Explanation: The ambient temperature of the node is close to the point where it stops performing I/O and enters a service state. The node is currently continuing to operate.
Data:
• A text string identifying the thermal sensor reporting the warning level and the current temperature in degrees (Celsius).
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Check the temperature of the room and correct any air conditioning or ventilation problems.
3. Check the airflow around the system to make sure no vents are blocked.
Possible Cause-FRUs or other cause:
• None

769 CPU temperature warning.
Explanation: The temperature of the CPU within the node is close to the point where the node stops performing I/O and enters a service state. The node is currently continuing to operate. This is most likely an ambient temperature problem, but it might be a hardware problem.
Data:

• A text string identifying the thermal sensor reporting the warning level and the current temperature in degrees (Celsius).
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Check the temperature of the room and correct any air conditioning or ventilation problems.
3. Check the airflow around the system. Ensure no vents are blocked.
4. Make sure the node fans are operational.
5. If the error is still reported, replace the node's CPU.
Possible Cause-FRUs or other cause:
• CPU

770 Shutdown temperature reached
Explanation: The node temperature has reached the point at which it must shut down to protect electronics and data. This is most likely an ambient temperature problem, but it could be a hardware issue.
Data:
• A text string identifying the thermal sensor reporting the warning level and the current temperature in degrees (Celsius).
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Check the temperature of the room and correct any air conditioning or ventilation problems.
3. Check the airflow around the system and make sure no vents are blocked.
Possible Cause-FRUs or other cause:
• CPU

775 Power supply problem.
Explanation: A power supply has a fault condition.
Replace the power supply.
• Power supply

776 Power supply mains cable unplugged.
Explanation: A power supply mains cable is not plugged in.
Plug in the power supply mains cable.
• None

777 Power supply missing.
Explanation: A power supply is missing.
Install a power supply.
• Power supply

779 Battery is missing
Explanation: The battery is not installed in the system.
Install the battery. You can power up the system without the battery installed.
• Battery (100%)

780 Battery has failed
Explanation:
1. The battery has failed.
2. The battery is past the end of its useful life.
3. The battery failed to provide power on a previous occasion and is, therefore, regarded as unfit for its purpose.
Replace the battery.
• Battery (100%)

781 Battery is below the minimum operating temperature
Explanation: The battery cannot perform the required function because it is below the minimum operating temperature. This error is reported only if the battery subsystem cannot provide full protection. An inability to charge is not reported if the combined charge available from all installed batteries can provide full protection at the current charge levels.
No service action required, use the console to manage the node. Wait for the battery to warm up.

782 Battery is above the maximum operating temperature
Explanation: The battery cannot perform the required function because it is above the maximum operating temperature. This error is reported only if the battery subsystem cannot provide full protection. An inability to charge is not reported if the combined

charge available from all installed batteries can provide full protection at the current charge levels.
No service action required, use the console to manage the node. Wait for the battery to cool down.

783 Battery communications error
Explanation: A battery is installed, but communications via I2C are not functioning. This might be either a fault in the battery unit or a fault in the battery backplane.
No service action required, use the console to manage the node. Replace the battery. If the problem persists, conduct the corrective service procedure described in 1109.

Battery is nearing end of life
Explanation: The battery is near the end of its useful life. You should replace it at the earliest convenient opportunity. This might be either a fault in the battery unit or a fault in the battery backplane.
No service action required, use the console to manage the node. Replace the battery.

785 Battery capacity is reduced because of cell imbalance
Explanation: The charge levels of the cells within the battery pack are out of balance. Some cells become fully charged before others, which causes charging to terminate early, before the entire battery pack is fully charged. Ending recharging prematurely effectively reduces the available capacity of the pack. Circuitry within the battery pack normally corrects such errors, but can take tens of hours to complete. If this error is not fixed after 24 hours, or if the error reoccurs after it fixes itself, the error is likely indicative of a problem in the battery cells. In such a case, replace the battery pack.
No service action required, use the console to manage the node. Wait for the cells to balance.

786 Battery VPD checksum error
Explanation: The checksum on the vital product data (VPD) stored in the battery EEPROM is incorrect.
No service action required, use the console to manage the node. Replace the battery.

787 Battery is at a hardware revision level not supported by the current code level
Explanation: The battery currently installed is at a hardware revision level that is not supported by the current code level.
No service action required, use the console to manage the node. Either update the code level to one that supports the currently installed battery or replace the battery with one that is supported by the current code level.

803 Fibre Channel adapter not working
Explanation: A problem has been detected on the node's Fibre Channel (FC) adapter.
Follow troubleshooting procedures to fix the hardware.

806 Node IP missing
Explanation: When the sainfo lsnodeip command was run, no IP addresses were found for the node. This error is caused if node IP addresses were not specified during installation or all node IP addresses were deleted.
1. Verify that the node IP addresses are missing by running the sainfo lsnodeip command.
2. Run the satask chnodeip command to set node IP addresses. Configure at least two node IP addresses.

820 Canister type is incompatible with enclosure model
Explanation: The node canister has detected that it has a hardware type that is not compatible with the control enclosure MTM, such as a node canister with hardware type 500 in an enclosure with MTM. This is an expected condition when a control enclosure is being upgraded to a different type of node canister.
1. Check that all the upgrade instructions have been followed completely.
2. Use the management GUI to run the recommended actions for the associated service error code.

Encryption key required.
Explanation: It is necessary to provide an encryption key before the system can become fully operational. This node error occurs when a system with encryption enabled is restarted without an encryption key available.
Insert a USB flash drive containing a valid key into one of the node canisters.

831 Encryption key is not valid.
Explanation: It is necessary to provide an encryption key before the system can become fully operational. This node error occurs when the encryption key identified is invalid. A file with the correct name was found but the key in the file is corrupt. This node error will clear once the USB flash drive containing the invalid key is removed.
Remove the USB flash drive from the port.

832 Encryption key file not found.
Explanation: A USB flash drive containing an encryption key is present but the expected file cannot be located. This can occur if a key for a different system or an old key for this system has been provided. Additionally, other user-created files that match the key file name format can cause this error if the USB flash drive does not contain the expected key. This node error will clear when the USB flash drive identified has been removed.
Remove the USB flash drive from the port.

833 Unsupported USB device.
Explanation: An unsupported device has been connected to a USB port. Only USB flash drives are supported, and this node error will be raised if another type of device is connected to a USB port.
Remove the unsupported device.

836 Encryption key required
Explanation: It is necessary to provide an encryption key before the system can become fully operational. This error occurs when a system with encryption enabled is restarted without an encryption key available.
Connect a key server that contains the current key for this system to one or more of the nodes.

840 Unsupported hardware change detected.
Explanation: A change has been detected to the hardware configuration for this node. The new configuration is not supported by the node software. User action is required to repair the hardware or update the software.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Follow the procedure for getting node and clustered-system information. A change to the hardware configuration is expected.
3. If the hardware configuration is unexpectedly reduced, make sure the component has not been unseated. Hardware replacement might be necessary.
4. If a new hardware component is shown as unsupported, check the software version required to support the hardware component. Update the software to a version that supports the hardware. If the hardware detected does not match the expected configuration, replace the hardware component that is reported incorrectly.
Possible Cause-FRUs or other cause:
• One of the optional hardware components might require replacement

841 Supported hardware change detected.
Explanation: A change has been detected in the node hardware configuration. The new configuration is supported by the node software. The new configuration does not become active until it is activated. A node configuration is remembered only while it is active in a system. This node error is therefore resolved using the management GUI.
Use the management GUI to run the recommended actions for the associated service error code. Use the directed maintenance to accept or reject the new configuration.
Note: When you update your system software from a previous version on a system where you have already installed more than 64 GB of RAM, all nodes return from the update with an error code of 841. The new version allocates memory in a different way than previous versions, so the RAM must be "accepted" again. To resolve the error, complete the following steps:
1. On a single node, run the svctask chnodehw command. Do not run the command on more than one node at a time.
2. Wait for the node to restart and return without the error.

3. Wait an additional 30 minutes for multipath drives to recover on the host.
4. Repeat this process for each node individually until you clear the error on all nodes.

842 Fibre Channel IO port mapping failed
Explanation: A Fibre Channel or Fibre Channel over Ethernet port is installed but is not included in the Fibre Channel I/O port mapping, and so the port cannot be used for Fibre Channel I/O. This error is raised in one of the following situations:
• A node hardware installation
• A change of I/O adapters
• The application of an incorrect Fibre Channel port map
These tasks are normally performed by service representatives.
Your service representative can use the Service Assistant to modify the Fibre Channel I/O port mappings to include all the installed ports capable of Fibre Channel I/O. The following command is used: satask chvpd -fcportmap

850 The canister battery is reaching the end of its useful life.
Explanation: The canister battery is reaching the end of its useful life. It should be replaced within a week of the node error first being reported.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Replace the node canister battery by using the remove and replace procedures.
Possible Cause-FRUs or other cause:
• Canister battery

860 Fibre Channel network fabric is too big.
Explanation: The number of Fibre Channel (FC) logins made to the node exceeds the allowed limit. The node continues to operate, but only communicates with the logins made before the limit was reached. The order in which other devices log into the node cannot be determined, so the node's FC connectivity might vary after each restart. The connection might be with host systems, other storage systems, or with other nodes. This error might be the reason the node is unable to participate in a system. The number of allowed logins per node is
Data:
• None
This error indicates a problem with the Fibre Channel fabric configuration. It is resolved by reconfiguring the FC switch:
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Rezone the FC network so only the ports the node needs to connect to are visible to it.
Possible Cause-FRUs or other cause:
• None

870 Too many cluster creations made on node
Explanation: Too many systems have been created on this node.
Data:
• None
1. Try to create the clustered system on a different node.
2. Contact your service representative.

871 Failed to increment cluster ID
Explanation: The clustered system create option failed because the clustered system, which is stored in the service controller, could not be updated.
Data:
• None
1. Try to create the clustered system on a different node.
2. Contact your service representative.

875 Request to cluster rejected.
Explanation: A candidate node could not be added to the clustered system. The node contains hardware or firmware that is not supported in the clustered system.
Data: This node error and extra data is viewable through sainfo lsservicestatus on the candidate node only. The extra data lists a full set of feature codes that are required by the node to run in the clustered system.
• Choose a different candidate that is compatible with the clustered system.
• Update the clustered system to code that is supported by all components.
• Do not add a candidate to the clustered system.

• Where applicable, remove and replace the hardware that is preventing the candidate from joining the clustered system.

Possible Cause FRUs or other cause.

For information on feature codes available, see the SAN Volume Controller and Storwize family Characteristic Interoperability Matrix on the support website:

Attempting recovery after loss of state data.

Explanation: During startup, the node cannot read its state data. It reports this error while waiting to be added back into a clustered system. If the node is not added back into a clustered system within a set time, node error 578 is reported.

1. Allow time for recovery. No further action is required.
2. Keep monitoring in case the error changes to error code 578.

Too many Fibre Channel logins between nodes.

Explanation: The system has determined that the user has zoned the fabric such that this node has received more than 16 unmasked logins originating from another node or node canister. This can be any non-service-mode node or canister in the local cluster or in a remote cluster with a partnership. An unmasked login is from a port whose corresponding bit in the FC port mask is '1'. If the error is raised against a node in the local cluster, then it is the local FC port mask that is applied. If the error is raised against a node in a remote cluster, then it is the partner FC port masks from both clusters that apply.

More than 16 logins is not a supported configuration as it increases internode communication and can affect bandwidth and performance. For example, if node A has 8 ports and node B has 8 ports where the nodes are in different clusters, and the partner FC port masks leave ports 1 and 2 of node A and ports 7 and 8 of node B unmasked, there are 4 unmasked logins possible (1,7 1,8 2,7 2,8). Fabric zoning may be used to reduce this amount further; for example, if node B port 8 is removed from the zone there are only 2 (1,7 and 2,7). The combination of masks and zoning must leave 16 or fewer possible logins.

Note: This count includes both FC and Fibre Channel over Ethernet (FCoE) logins. The log-in count will not include masked ports.

When this event is logged, the cluster id and node id of the first node whose logins exceed this limit on the local node will be reported, as well as the WWNN of said node. If logins change, the error is automatically fixed and another error is logged if appropriate (this may or may not choose the same node to report in the sense data if the same node is still over the maximum allowed).

Data

Text string showing
• WWNN of the other node
• Cluster ID of other node
• Arbitrary node ID of one other node that is logged into this node (node ID as it appears in lsnode)

The error is resolved by either re-configuring the system to change which type of connection is allowed on a port, or by changing the SAN fabric configuration so ports are not in the same zone. A combination of both options may be used.

The system reconfiguration is to change the Fibre Channel ports mask to reduce which ports can be used for internode communication.

The local Fibre Channel port mask should be modified if the cluster id reported matches the cluster id of the node logging the error.

The partner Fibre Channel port mask should be modified if the cluster id reported does not match the cluster id of the node logging the error. The partner Fibre Channel port mask may need to be changed for one or both clusters.

SAN fabric configuration is set using the switch configuration utilities.

Use the lsfabric command to view the current number of logins between nodes.

Possible Cause-FRUs or other cause:
• None

Service error code

Failed to create remote IP connection.

Explanation: Despite a request to create a remote IP partnership port connection, the action has failed or timed out.

Fix the remote IP link so that traffic can flow correctly. Once the connection is made, the error will auto-correct.

920 Unable to perform cluster recovery because of a lack of cluster resources.

Explanation: The node is looking for a quorum of resources which also require cluster recovery.

Contact IBM technical support.
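The login-count rule in the "Too many Fibre Channel logins between nodes" entry above can be illustrated with a short sketch. It is not product code. The mask strings, the convention that the rightmost mask character is port 1, and the function names are assumptions made for illustration; they are chosen so that ports 1 and 2 of node A and ports 7 and 8 of node B are unmasked, as in the example in the entry.

```python
from itertools import product

MAX_UNMASKED_LOGINS = 16  # limit stated in the entry above


def unmasked_ports(mask):
    """Return the 1-based port numbers whose mask bit is '1'.

    Assumption: the rightmost character of the mask string is port 1.
    """
    return [i + 1 for i, bit in enumerate(reversed(mask)) if bit == "1"]


def possible_logins(mask_a, mask_b, zoned_out_b=frozenset()):
    """List (port on A, port on B) pairs that masks and zoning still allow.

    zoned_out_b holds node B ports that fabric zoning hides from node A.
    """
    ports_a = unmasked_ports(mask_a)
    ports_b = [p for p in unmasked_ports(mask_b) if p not in zoned_out_b]
    return list(product(ports_a, ports_b))


def exceeds_limit(logins):
    """True when the combination leaves more than 16 possible logins."""
    return len(logins) > MAX_UNMASKED_LOGINS


# Hypothetical masks: node A exposes ports 1-2, node B exposes ports 7-8.
mask_a = "00000011"
mask_b = "11000000"

print(possible_logins(mask_a, mask_b))
print(possible_logins(mask_a, mask_b, zoned_out_b={8}))
print(exceeds_limit(possible_logins("11111111", "11111111")))
```

With the assumed masks the first call yields the four pairs named in the entry, and zoning out node B port 8 leaves two. Two fully unmasked 8-port nodes would allow 64 pairs, which is over the limit.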

Unable to perform cluster recovery because of a lack of cluster resources.

Explanation: The node does not have sufficient connectivity to other nodes or a quorum device to form a cluster. If a disaster has occurred and the nodes at the other site cannot be recovered, then it is possible to allow the nodes at the surviving site to form a system using local storage.

Repair the fabric or quorum device to establish connectivity. As a last resort, when the nodes at the other site cannot be recovered, it is possible to allow the nodes at the surviving site to form a system using local site storage as described below:

To avoid data corruption, ensure that all host servers that were previously accessing the system have had all volumes un-mounted or have been rebooted. Ensure that the nodes at the other site are not operational and are unable to form a system in the future.

After invoking this command, a full re-synchronization of all mirrored volumes will be performed when the other site is recovered. This is likely to take many hours or days to complete.

Contact IBM support personnel if you are unsure.

Note: Before continuing, confirm that you have taken the following actions. Failure to perform these actions can lead to data corruption that will be undetected by the system but will affect host applications.
1. All host servers that were previously accessing the system have had all volumes un-mounted or have been rebooted.
2. Ensure that the nodes at the other site are not operating as a system and actions have been taken to prevent them from forming a system in the future.

After these actions have been taken, the satask overridequorum command can be used to allow the nodes at the surviving site to form a system using local storage.

950 Special update mode.

Explanation: Special update mode.

None.

990 Cluster recovery has failed.

Explanation: Cluster recovery has failed.

Contact IBM technical support.

Automatic cluster recovery has run.

Explanation: All cluster configuration commands are blocked.

Call your software support center.

Caution: You can unblock the configuration commands through the cluster GUI, but you must first consult with your software support to avoid corrupting your cluster configuration.

• None

1002 Event log full.

Explanation: Event log full.

To fix the errors in the event log, go to the start MAP.

• Unfixed errors in the log.

1007 Canister to canister communication error.

Explanation: A canister to canister communication error can appear when one canister cannot communicate with the other.

Reseat the passive canister, and then try reseating the active canister. If neither resolves the alert, try replacing the passive canister, and then the other canister.

A canister can be safely reseated or replaced while the system is in production. Make sure that the other canister is the active node before removing this canister. It is preferable that this canister shuts down completely before removing it, but it is not required.
1. Reseat the passive canister (a failover is not required).
2. Reseat the second canister (a failover is required).
3. If necessary, replace the passive canister (a failover is not required).
4. If necessary, replace the active canister (a failover is required). If a second new canister is not available, the previously removed canister can be used, as it apparently is not at fault.
5. An enclosure replacement might be necessary. Contact IBM support.

Canister (95%)
Enclosure (5%)

1009 DIMMs are incorrectly installed.

Explanation: DIMMs are incorrectly installed.

Ensure that memory DIMMs are spread evenly across all memory channels.
1. Shut down the node.

2. Ensure that memory DIMMs are spread evenly across all memory channels.
3. Restart the node.
4. If the error persists, replace system board.

• None

1011 Fibre Channel adapter (4 port) in slot 1 is missing.

Explanation: Fibre Channel adapter (4 port) in slot 1 is missing.

1. Exchange the FRUs for new FRUs.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem.
3. Go to repair verification MAP.

Fibre Channel adapter (4-port) in slot 1 PCI fault.

Explanation: Fibre Channel adapter (4-port) in slot 1 PCI fault.

1. Exchange FRUs for new FRUs.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem.
3. Go to repair verification MAP.

Fibre Channel adapter in slot 1 is missing.

Explanation: The Fibre Channel adapter in slot 1 is missing.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• N/A

1015 Fibre Channel adapter in slot 2 is missing.

Explanation: Fibre Channel adapter in slot 2 is missing.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check the node status:
• If all nodes show a status of online, mark the error as fixed.
• If any node does not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• N/A

1016 Fibre Channel adapter (4 port) in slot 2 is missing.

Explanation: The four-port Fibre Channel adapter in PCI slot 2 is missing.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• Fibre Channel host bus adapter (90%)
• PCI riser card (5%)
• Other (5%)

1017 Fibre Channel adapter in slot 1 PCI bus error.

Explanation: The Fibre Channel adapter in PCI slot 1 is failing with a PCI bus error.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.

• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• Fibre Channel host bus adapter (80%)
• PCI riser card (10%)
• Other (10%)

1018 Fibre Channel adapter in slot 2 PCI fault.

Explanation: The Fibre Channel adapter in slot 2 is failing with a PCI fault.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• Dual port Fibre Channel host bus adapter - full height (80%)
• PCI riser card (10%)
• Other (10%)

1019 Fibre Channel adapter (four-port) in slot 2 PCI fault.

Explanation: The four-port Fibre Channel adapter in slot 2 is failing with a PCI fault.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• Four-port Fibre Channel host bus adapter (80%)
• PCI Express riser card (10%)
• Other (10%)

1020 The system board service processor has failed.

Explanation: The cluster is reporting that a node is not operational because of critical node error 522. See the details of node error 522 for more information.

See node error 522.

Incorrect enclosure

Explanation: The cluster is reporting that a node is not operational because of critical node error 500. See the details of node error 500 for more information.

See node error 500.

The detected memory size does not match the expected memory size.

Explanation: The cluster is reporting that a node is not operational because of critical node error 510. See the details of node error 510 for more information.

See node error 510.

CPU is broken or missing.

Explanation: CPU is broken or missing.

Review the node hardware using the svcinfo lsnodehw command on the node indicated by this event.
1. Shut down the node. Replace the CPU that is broken as indicated by the light path and event data.
2. If the error persists, replace the system board.

Note: Intentional removal is not permitted on a clustered node. To use the node with only one processor, you must run rmnode, and then re-add the node. Otherwise, shut down the node and replace the processor that was removed.

• CPU (80%)
• System board (20%)

1025 Processor missing

Explanation: The system assembly is failing.

1. Go to the light path diagnostic MAP and complete the light path diagnostic procedures.
2. If the light path diagnostic procedure isolates the FRU, mark this error as fixed. Then, go to the repair verification MAP.
3. If you replace a FRU, but it does not correct the problem, ensure that the FRU is installed correctly. Then, go to the next step.

4. Replace the system board as indicated in the Possible Cause list.
5. Check the node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
6. Go to the repair verification MAP.

System board device problem.

Explanation: System board device problem.

The action depends on the extra data that is provided with the node error and the light path diagnostics.

• Variable

1027 Unable to update BIOS settings.

Explanation: The cluster is reporting that a node is not operational because of critical node error 524. See the details of node error 524 for more information.

See node error 524.

System board service processor failed.

Explanation: System board service processor failed.

Complete the following steps:
1. Shut down the node.
2. Remove the main power cable.
3. Wait for the lights to stop flashing.
4. Plug in the power cable.
5. Wait for the node to boot.
6. If the node still reports the error, replace the system board.

• System board

1029 Enclosure VPD is unavailable or invalid.

Explanation: Enclosure VPD is unavailable or invalid.

Overwrite the enclosure VPD or replace the power interposer board.

PIB card (10%)
Other: No FRU (90%)

1030 The internal disk of a node has failed.

Explanation: An error has occurred while attempting to read or write data to the internal disk of one of the nodes in the cluster. The disk has failed.

Determine which node's internal disk has failed using the node information in the error. Replace the FRUs in the order shown. Mark the error as fixed.

Node Canister (100%)
• Disk drive (50%)
• Disk controller (30%)
• Disk backplane (10%)
• Disk signal cable (8%)
• Disk power cable (1%)
• System board (1%)

1031 Node canister location unknown.

Explanation: Node canister location unknown.

Complete the following steps to resolve this problem.
1. List all enclosure canisters for all control enclosures. Look for an online canister that does not have a node ID associated with it. This canister is the one with the problem.
2. Unplug the SAS cable from port 2 of the canister that is identified in step 1.
3. Run the command lsenclosurecanister, and see whether there is a node ID present. If step 2 fixes the error (a node ID is present), then something failed in one of the attached devices.
4. Reconnect the expansion enclosures and see whether the system is able to isolate the fault.
5. Reseat all the canisters on that strand and replace the canister that is identified in step 1 if step 4 does not fix the error.

• Nothing (80%)
• Canister (20%)

1032 Fibre Channel adapter not working

Explanation: A problem has been detected on the node's Fibre Channel (FC) adapter. This node error is reported only on SAN Volume Controller 2145-CG8 or older nodes.

Follow troubleshooting procedures to fix the hardware.
1. If possible, use the management GUI to run the recommended actions for the associated service error code.

Possible Cause-FRUs or other cause:
• None

1034 Canister fault type 2

Explanation: There is a canister internal error.

Reseat the canister, and then replace the canister if the error continues.

Canister (80%)
Other: No FRU (20%)

1035 Boot drive problems

Explanation: Boot drive problems

Complete the following steps:
1. Look at a boot drive view to determine the problems.
2. Run the commands lsnodebootdrive / lsbootdrive to display a status for each slot for users and DMPs to diagnose and repair problems.
3. If you plan to move any drives, shut down the node if booted yes is shown for that drive in the boot drive view (lsbootdrive). After you move the drives, a different node error will probably be displayed for you to work on.
4. If you plan to set the serial number of the system board, see satask chvpd.
5. If there is still no usable persistent data on the boot drives, then contact IBM Remote Technical Support.

• System drive

1036 The enclosure identity cannot be read.

Explanation: The cluster is reporting that a node is not operational because of critical node error 509. See the details of node error 509 for more information.

See node error 509.

Canister failure, canister replacement required

Explanation: An unrecoverable canister error has occurred. Contact your support representative for assistance in replacing the canister.

Replace the canister.

A canister can be safely replaced while the system is in production. Make sure that the other canister is the active node before removing the faulty canister. It is preferable that this canister shut down completely before removing it, but it is not required.

Possible cause-FRUs or other:
Interface adapter (50%)
SFP (20%)
Canister (20%)
Internal interface adapter cable (10%)

1040 Node flash disk fault

Explanation: A flash module error occurred after a successful system start.

Note: The node that contains the flash module was not rejected by the cluster.

1. Replace the FRUs.
2. Check node status. If all nodes show a status of Online, mark the error that you just repaired as fixed. If any nodes do not show a status of Online, go to start MAP. If you return to this step, contact support to resolve the problem.
3. Go to repair verification MAP.

Unexpected enclosure fault.

Explanation: Unexpected enclosure fault.

Use the bottom snap option in the management GUI. This performs the following functions:
• Generates new enclosure dumps for all enclosures.
• Generates a livedump from all nodes in the cluster.
• Runs an svc_snap dumpall.

1. Contact IBM support for further analysis.

• None

1051 Pluggable TPM failed or missing

Explanation: The Trusted Platform Module (TPM) for the system is not functioning.

Important: Confirm that the system is running on at least one other node before you commence this repair. Each node uses its TPM to securely store encryption keys on its boot drive. When the TPM or boot drive of a node is replaced, the node loses its encryption key, and must be able to join an existing system to obtain the keys. If this error occurred on the last node in a system, do not replace the TPM, boot drive, or node hardware until the system contains at least one online node with valid keys.

1. Shut down the node and remove the node hardware.
2. Locate the TPM in the node hardware and ensure that it is correctly seated.

3. Reinsert the node hardware and apply power to the node.
4. If the error persists, replace the TPM with one from FRU stock.
5. If the error persists, replace the system board or the node hardware with one from FRU stock.

You do not need to return the faulty TPM to IBM.

Note: It is unlikely that the failure of a TPM can cause the loss of the System Master Key (SMK):
• The SMK is sealed by the TPM, using its unique encryption key, and the result is stored on the system boot drive.
• The working copy of the SMK is on the RAM disk, and so is unaffected by a sudden TPM failure.
• If the failure happens at boot time, the node is held in an unrecoverable error state because the TPM is a FRU.
• The SMK is also mirrored by the other nodes in the system. When the node with a replacement TPM joins the system, it determines that it does not have the SMK, requests it, gets it, and then seals it with the new TPM.

Incorrect type of uninterruptible power supply detected

Explanation: The cluster is reporting that a node is not operational because of critical node error 587. For more information, see the details of node error 587.

See node error 587.

Internal SAS connector failure, service action required.

Explanation: An error occurred involving an internal SAS connector. Any of the following alerts might be associated with this error code.
• SAS connector to an enclosure secondary expander module is not working at full capacity
• SAS connector to an enclosure secondary expander module is offline
• The state of an enclosure secondary expander module connector cannot be determined

Complete the following steps:
1. Enable maintenance mode for the I/O group.
2. Slide the enclosure out of the rack sufficiently to open the access lid.
3. Reseat the affected secondary expander module (SEM).
4. If the error does not clear, reseat the canister on the side of the affected SEM.
5. If the error does not clear, replace the affected SEM.
6. If the error does not clear, replace the canister on the side of the affected SEM.
7. If the error does not clear, contact your service support representative. You might need to replace the enclosure.

Fibre Channel adapter in slot 1 adapter present but failed.

Explanation: The Fibre Channel adapter in PCI slot 1 is present but is failing.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• Fibre Channel host bus adapter (100%)

1055 Fibre Channel adapter (4 port) in slot 1 adapter present but failed.

Explanation: Fibre Channel adapter (4 port) in slot 1 adapter present but failed.

1. Exchange the FRU for a new FRU.
2. Check node status. If all nodes show a status of online, mark the error that you just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact support to resolve the problem.
3. Go to repair verification MAP.

The Fibre Channel adapter in slot 2 is present but is failing.

Explanation: The Fibre Channel adapter in slot 2 is present but is failing.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• N/A

1057 Fibre Channel adapter (four-port) in slot 2 adapter is present but failing.

Explanation: The four-port Fibre Channel adapter in slot 2 is present but failing.

1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

Possible Cause, FRUs, or other:
• N/A

1059 Fibre Channel IO port mapping failed

Explanation: A Fibre Channel or Fibre Channel over Ethernet port is installed but is not included in the Fibre Channel I/O port mapping, and so the port cannot be used for Fibre Channel I/O.

This error is raised in one of the following situations:
• A node hardware installation
• A change of I/O adapters
• The application of an incorrect Fibre Channel port map

These tasks are normally performed by service representatives.

Your service representative can use the Service Assistant to modify the Fibre Channel I/O port mappings to include all of the installed ports that are capable of Fibre Channel I/O. The following command is used:
satask chvpd -fcportmap

1060 One or more Fibre Channel ports on the 2072 are not operational.

Explanation: One or more Fibre Channel ports on the 2072 are not operational.

1. Go to MAP 5600: Fibre Channel to isolate and repair the problem.
2. Go to the repair verification MAP.

• Fibre Channel cable (80%)
• Small Form-factor Pluggable (SFP) connector (5%)
• 4-port Fibre Channel host bus adapter (5%)

Other:
• Fibre Channel network fabric (10%)

1061 Fibre Channel ports are not operational.

Explanation: Fibre Channel ports are not operational.

An offline port can have many causes and so it is necessary to check them all. Start with the easiest and least intrusive possibility such as resetting the Fibre Channel or FCoE port via CLI command.

External (cable, HBA/CNA, switch, and so on) (75%)
SFP (10%)
Interface (10%)
Node (5%)

1065 One or more Fibre Channel ports are running at lower than the previously saved speed.

Explanation: The Fibre Channel ports will normally operate at the highest speed permitted by the Fibre Channel switch, but this speed might be reduced if the signal quality on the Fibre Channel connection is poor. The Fibre Channel switch could have been set to operate at a lower speed by the user, or the quality of the Fibre Channel signal has deteriorated.

• Go to MAP 5600: Fibre Channel to resolve the problem.

Node Canister (100%)
• Fibre Channel cable (50%)
• Small Form-factor Pluggable (SFP) connector (20%)
• 4-port Fibre Channel host bus adapter (5%)

Other:
• Fibre Channel switch, SFP connector, or GBIC (25%)

1067 Fan fault type 1

Explanation: The fan has failed.

Replace the fan.

Fan (100%)
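Many of the adapter entries above (for example 1057) share the same node status check after a FRU is replaced. The sketch below restates that check as a small decision function, purely as a reading aid. The function name, the status strings, and the returned_to_step flag are hypothetical and do not correspond to any CLI output format.

```python
def next_action(node_statuses, returned_to_step=False):
    """Pick the follow-up action for the recurring "Check node status" step.

    node_statuses: iterable of status strings, one per node (assumed form).
    returned_to_step: True if the start MAP has already led back here.
    """
    if all(status.lower() == "online" for status in node_statuses):
        return "mark the error as fixed"
    if returned_to_step:
        return "contact your support center"
    return "go to the start MAP"


print(next_action(["online", "online"]))
print(next_action(["online", "offline"]))
print(next_action(["online", "offline"], returned_to_step=True))
```

In every case the written procedure then continues with the repair verification MAP once the error is marked as fixed.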

Fan fault type 2

Explanation: The fan is missing.

Reseat the fan, and then replace the fan if reseating the fan does not correct the error.

Note: If replacing the fan does not correct the error, then the canister will need to be replaced.

Fan (80%)
Other: No FRU (20%)

1083 Unrecognized node error

Explanation: The cluster is reporting that a node is not operational because of critical node error 562. See the details of node error 562 for more information.

See node error 562.

System board device exceeded temperature threshold.

Explanation: System board device exceeded temperature threshold.

Complete the following steps:
1. Check for external air flow blockages.
2. Remove the top of the machine case and check for missing baffles, damaged heat sinks, or internal blockages.
3. If the problem persists, follow the service instructions for replacing the system board FRU in question.

• Variable

1085 PCI Riser card exceeded temperature threshold.

Explanation: PCI Riser card exceeded temperature threshold.

Complete the following steps:
1. Check airflow.
2. Remove the top of the machine case and check for missing baffles or internal blockages.
3. Check for faulty PCI cards and replace as necessary.
4. If the problem persists, replace the PCI Riser FRU.

• None

1087 Shutdown temperature threshold exceeded

Explanation: Shutdown temperature threshold exceeded.

Inspect the enclosure and the enclosure environment.
1. Check environmental temperature.
2. Ensure that all of the components are installed or that there are fillers in each bay.
3. Check that all of the fans are installed and operating properly.
4. Check for any obstructions to airflow, proper clearance for fresh inlet air, and exhaust air.
5. Handle any specific obstructed airflow errors that are related to the drive, the battery, and the power supply unit.
6. Bring the system back online. If the system performed a hard shutdown, the power must be removed and reapplied.

Node (2%)
Battery (1%)
Power supply unit (1%)
Drive (1%)
Other: Environment (95%)

1089 One or more fans are failing.

Explanation: One or more fans are failing.

For the 2145-DH8, a fan has a fault condition.
1. Determine the failing fan(s) from the fan indicator on the system board or from the text of the error data in the log. Each fan module contains two fans.
2. For the 2145-DH8, mechanically stop fan or remove fan. If a fan is not installed, shut down the node, open it, and install the fan. If a fan is installed, replace the fan FRU indicated by the FAN identifier that is supplied in the Extra data.
3. Exchange the FRU for a new FRU.
4. Go to repair verification MAP.

Fan number: Fan module position
• 1 or 2: 1
• 3 or 4: 2
• 5 or 6: 3
• 7 or 8: 4

• 9 or 10: 5
• 11 or 12: 6

• Fan module (100%)

1090 One or more fans (40x40x28) are failing.

Explanation: One or more fans (40x40x28) are failing.

1. Determine the failing fans from the fan indicator on the system board or from the text of the error data in the log.
2. Verify that the cable between the fan backplane and the system board is connected:
• If all fans on the fan backplane are failing
• If no fan fault lights are illuminated
3. Exchange the FRU for a new FRU.
4. Go to repair verification MAP.

Possible Cause, FRUs, or other:
• N/A

1091 One or more fans (40x40x56) are failing.

Explanation: One or more fans (40x40x56) are failing.

1. Determine the failing fans from the fan indicator on the system board or from the text of the error data in the log.
2. Verify that the cable between the fan backplane and the system board is connected:
• If all fans on the fan backplane are failing
• If no fan fault lights are illuminated
3. Exchange the FRU for a new FRU.
4. Go to repair verification MAP.

Possible Cause, FRUs, or other:
• N/A

1092 The temperature soft or hard shutdown threshold of the 2072 has been exceeded. The 2072 has automatically powered off.

Explanation: The temperature soft or hard shutdown threshold of the 2072 has been exceeded. The 2072 has automatically powered off.

1. Ensure that the operating environment meets specifications.
2. Ensure that the airflow is not obstructed.
3. Ensure that the fans are operational.
4. Go to the light path diagnostic MAP and perform the light path diagnostic procedures.
5. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to the start MAP. If you return to this step, contact your support center to resolve the problem.
6. Go to the repair verification MAP.

Node Canister (100%)
• The FRU that is indicated by the Light path diagnostics (25%)
• System board (5%)

Other: System environment or airflow blockage (70%)

1093 Temperature warning threshold exceeded

Explanation: The system internal temperature sensor has reported that the temperature warning threshold has been exceeded.

1. Ensure that the internal airflow of the node has not been obstructed.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to the start MAP. If you return to this step, contact your support center to resolve the problem.
3. Go to repair verification MAP.

For the 2145-DH8 only:
1. Check for external air flow blockages.
2. Remove the top of the machine case and check for missing baffles, damaged heatsinks, or internal blockages.
3. If the problem persists after taking these measures, replace the CPU assembly FRU if 2145-DH8.

2145-DH8
• CPU assembly (30%)

Other: Airflow blockage (70%)
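The fan-number table listed under error 1089 earlier in this section pairs consecutive fans into one module position (fans 1 and 2 in position 1, up to fans 11 and 12 in position 6). A minimal sketch of that lookup follows. The function name is hypothetical, and the formula is simply a compact restatement of the table, not something taken from the product.

```python
def fan_module_position(fan_number):
    """Map a fan number (1-12) to its fan module position (1-6).

    Each fan module contains two fans, so fans 2k-1 and 2k share position k.
    """
    if not 1 <= fan_number <= 12:
        raise ValueError("fan number must be between 1 and 12")
    return (fan_number + 1) // 2


for fan in (1, 2, 7, 12):
    print(fan, "->", fan_module_position(fan))
```

This is useful only as a quick cross-check when the Extra data supplies a FAN identifier and the module to be exchanged has to be located.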

The ambient temperature threshold has been exceeded.
Explanation: The ambient temperature threshold has been exceeded.
1. Check that the room temperature is within the limits allowed.
2. Check for obstructions in the air flow.
3. Mark the errors as fixed.
4. Go to repair verification MAP.
None
Other: System environment (100%)

1095 Enclosure temperature has passed critical threshold.
Explanation: Enclosure temperature has passed critical threshold.
Check for external and internal air flow blockages or damage.
1. Check environmental temperature.
2. Check for any impedance to airflow.
3. If the enclosure has shut down, then turn off both power switches on the enclosure and power both back on.
• None

1096 A Power Supply Unit is missing or has failed.
Explanation: One of the two power supply units in the node is either missing or has failed.
Note: This error is reported when a hot-swap power supply is removed from an active node, so it might be reported when a faulty power supply is removed for replacement. Both the missing and faulty conditions report this error code.
Error code 1096 is reported when the power supply either cannot be detected or reports an error.
1. Ensure that the power supply is seated correctly and that the power cable is attached correctly to both the node and to the 2145 UPS-1U.
2. If the error has not been automatically marked fixed after two minutes, note the status of the three LEDs on the back of the power supply.
3. If the power supply error LED is off and the AC and DC power LEDs are both on, this is the normal condition. If the error has not been automatically fixed after two minutes, replace the system board.
4. Follow the action specified for the LED states noted in the table below.
5. If the error has not been automatically fixed after two minutes, contact support.
6. Go to repair verification MAP.
Error, AC, DC: Action
ON, ON or OFF, ON or OFF: The power supply has a fault. Replace the power supply.
OFF, OFF, OFF: There is no power detected. Ensure that the power cable is connected at the node and 2145 UPS-1U. If the AC LED does not light, check the status of the 2145 UPS-1U to which the power supply is connected. Follow MAP UPS-1U if the UPS-1U is showing no power or an error; otherwise, replace the power cable. If the AC LED still does not light, replace the power supply.
OFF, OFF, ON: The power supply has a fault. Replace the power supply.
OFF, ON, OFF: Ensure that the power supply is installed correctly. If the DC LED does not light, replace the power supply.
Failed PSU:
• Power supply (90%)
• Power cable assembly (5%)
• System board (5%)
Missing PSU:
• Power supply (19%)
• System board (1%)
• Other: Power supply not correctly installed (80%)

1097 PSU problem
Explanation: One of the power supply units in the node is reporting that no main power is detected. For the 2145-DH8, a power supply has a fault condition.
1. For the 2145-DH8, replace the power supply FRU. For all other models, complete the following steps.
2. Ensure that the power supply is attached correctly to both the node and to the UPS.

3. If the error is not automatically marked fixed after 2 minutes, note the status of the three LEDs on the back of the power supply.
4. If the power supply error LED is off and the AC and DC power LEDs are both on, this state is the normal condition. If the error is not automatically fixed after 2 minutes, replace the system board.
5. Follow the action that is specified for the LED states noted in the following list.
6. If the error is not automatically fixed after 2 minutes, contact support.
7. Go to repair verification MAP.
Error, AC, DC: Action
ON, ON or OFF, ON or OFF: The power supply has a fault. Replace the power supply.
OFF, OFF, OFF: There is no power detected. Ensure that the power cable is connected at the node and UPS. If the AC LED does not light, check whether the UPS is showing any errors. Follow MAP UPS-1U if the UPS is showing an error; otherwise, replace the power cable. If the AC LED still does not light, replace the power supply.
OFF, OFF, ON: The power supply has a fault. Replace the power supply.
OFF, ON, OFF: Ensure that the power supply is installed correctly. If the DC LED does not light, replace the power supply.
• Power cable assembly (85%)
• UPS-1U assembly (10%)
• System board (5%)
• For the 2145-DH8: power supply (100%)

1098 Enclosure temperature has passed warning threshold.
Explanation: Enclosure temperature has passed warning threshold.
Check for external and internal air flow blockages or damage.
1. Check environmental temperature.
2. Check for any impedance to airflow.
• None

1099 Temperature exceeded warning threshold
Explanation: Temperature exceeded warning threshold.
Inspect the enclosure and the enclosure environment.
1. Check environmental temperature.
2. Ensure that all of the components are installed or that there are fillers in each bay.
3. Check that all of the fans are installed and operating properly.
4. Check for any obstructions to airflow, proper clearance for fresh inlet air, and exhaust air.
5. Wait for the component to cool.
Hardware component (5%)
Other: Environment (95%)

1100 One of the voltages that is monitored on the system board is over the set threshold.
Explanation: One of the voltages that is monitored on the system board is over the set threshold.
1. See the light path diagnostic MAP.
2. If the light path diagnostic MAP does not resolve the issue, exchange the frame assembly.
3. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem.
4. Go to repair verification MAP.

One of the voltages that is monitored on the system board is over the set threshold.
Explanation: One of the voltages that is monitored on the system board is over the set threshold.
1. See the light path diagnostic MAP.
2. If the light path diagnostic MAP does not resolve the issue, exchange the system board assembly.
3. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem.

4. Go to repair verification MAP.
• Light path diagnostic MAP FRUs (98%)
• System board (2%)

1105 One of the voltages that is monitored on the system board is under the set threshold.
Explanation: One of the voltages that is monitored on the system board is under the set threshold.
1. Check the cable connections.
2. See the light path diagnostic MAP.
3. If the light path diagnostic MAP does not resolve the issue, exchange the frame assembly.
4. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem.
5. Go to repair verification MAP.

One of the voltages that is monitored on the system board is under the set threshold.
Explanation: One of the voltages that is monitored on the system board is under the set threshold.
1. Check the cable connections.
2. See the light path diagnostic MAP.
3. If the light path diagnostic MAP does not resolve the issue, exchange the system board assembly.
4. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem.
5. Go to repair verification MAP.
• Light path diagnostic MAP FRUs (98%)
• System board (2%)

1107 The battery subsystem has insufficient capacity to save system data due to multiple faults.
Explanation: This message is an indication of other problems to solve before the system can successfully recharge the batteries.
No service action is required for this error, but other errors must be fixed. Look at other indications to see if the batteries can recharge without being put into use.

Battery backplane cabling faulty or possible battery backplane requires replacing.
Explanation: Faulty cabling or a faulty backplane are preventing the system from full communication with and control of the batteries.
Check the cabling to the battery backplane, making sure that all the connectors are properly mated. Four signal cables (EPOW, LPC, PWR_SENSE & LED) and one power cable (which uses 12 red and 12 black heavy gauge wires) are involved:
• The EPOW cable runs to a 20-pin connector at the front of the system planar, which is the edge nearest the drive bays, near the left side. To check that this connector is mated properly, it is necessary to remove the plastic airflow baffle, which lifts up. A number of wires run from the same connector to the disk backplane located to the left of the battery backplane.
• The LPC cable runs to a small adapter that is plugged into the back of the system planar between two PCI Express adapter cages. It is helpful to remove the left adapter cage when checking that these connectors are mated properly.
• The PWR_SENSE cable runs to a 24-pin connector at the back of the system planar between the PSUs and the left adapter cage. Check the connections of both a female connector (to the system planar) and a male connector (to the connector from the top PSU). Again, it can be helpful to remove the left adapter cage to check the proper mating of the connectors.
• The power cable runs to the system planar between the PSUs and the left adapter cage. It is located just in front of the PWR_SENSE connector. This cable has both a female connector that connects to the system planar, and a male connector that mates with the connector from the top PSU. Due to the bulk of this cable, care must be taken to not disturb PWR_SENSE connections when dressing it away in the space between the PSUs and the left adapter cage.
• The LED cable runs to a small PCB on the front bezel. The only consequence of this cable not being mated correctly is that the LEDs do not work.
If no problems exist, replace the battery backplane as described in the service action for error 1109. You do not replace either battery at this time.
To verify that the battery backplane works after replacing it, check that the node error is fixed.

• Battery backplane (50%)

1109 Battery or possibly battery backplane requires replacing.
Explanation: Battery or possibly battery backplane requires replacing.
Complete the following steps:
1. Replace the drive bay battery.
2. Check to see whether the node error is fixed. If not, replace the battery backplane.
3. To verify that the new battery backplane is working correctly, check that the node error is fixed.
• Drive bay battery (95%)
• Battery backplane (5%)

1110 The power management board detected a voltage that is outside of the set thresholds.
Explanation: The power management board detected a voltage that is outside of the set thresholds.
1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to repair verification MAP.

Batteries have insufficient charge.
Explanation: The insufficient charge message can appear for various reasons such as the battery is charging; the battery is missing or has failed; there is a communication error, or there has been an over temperature event.
This node error can be corrected by correcting each of the underlying battery problems.
1. If a battery is missing, replace the battery.
2. If a battery is failed, replace the battery.
3. If a battery is charging, this error should go away when the battery is charged.
4. If a battery is having a communication error (comm error), try to reseat the battery as described in the replacement procedure. If reseating the battery does not correct the problem, replace the battery.
5. If a battery is too hot, the system can be started after it has cooled. Inspect the battery for damage after an over-temperature event.
Possible Cause - FRUs or other:
If both batteries have errors, battery charging might be underway. (No FRU)
If both batteries have errors that do not resolve after a sufficient time to charge, battery charging might be impaired, such as by a faulty battery backplane FRU.
Communication errors are often correctable by reseating the battery or by allowing the temperature of the battery to cool without the need to replace the battery. (No FRU)
If a battery is missing or failed, the solution is to replace the battery FRU.
Battery (50%)
Other: No FRU (50%)

1112 Enclosure battery is missing.
Explanation: Enclosure battery is missing.
Install a battery in the missing slot. If the battery is present in the slot, reseat the battery.
Attention: Do not reseat a battery unless the other battery has enough charge, or data loss might occur.
Battery (95%)
Other: No FRU (5%)

1114 Enclosure battery fault type 1
Explanation: Enclosure battery fault type 1.
Replace the battery.
Battery (100%)
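
The five service actions listed under "Batteries have insufficient charge" above can be read as a lookup from battery condition to action. A minimal Python sketch follows; the condition keys and the function name are hypothetical labels chosen for illustration and do not correspond to values reported by the product.

```python
# Hypothetical condition labels mapped to the actions listed in the entry.
BATTERY_ACTIONS = {
    "missing": "Replace the battery.",
    "failed": "Replace the battery.",
    "charging": "Wait; the error should go away when the battery is charged.",
    "comm_error": "Reseat the battery; if that does not help, replace it.",
    "too_hot": "Let the battery cool, then inspect it for damage.",
}


def battery_action(condition):
    """Return the documented action for one battery condition label."""
    try:
        return BATTERY_ACTIONS[condition]
    except KeyError:
        # Unknown labels are rejected rather than guessed at.
        raise ValueError(f"unknown battery condition: {condition!r}")
```

Because the node error clears only when every underlying battery problem is corrected, a caller would apply `battery_action` to each battery in turn.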

Enclosure Battery fault type 4
Explanation: Enclosure Battery fault type 4.
Reseat the battery. Replace the battery if the error continues.
Note: Do not reseat a battery unless the other battery has enough charge, or data loss might occur.
Battery (95%)
Other: Bad connection (5%)

1120 A high speed SAS adapter is missing
Explanation: This node has detected that a high speed SAS adapter that was previously installed is no longer present.
If the high speed SAS adapter was deliberately removed, mark the error fixed. Otherwise, the high speed SAS adapter has failed and must be replaced. In the sequence shown, exchange the FRUs for new FRUs. Go to the repair verification MAP.
1. High speed SAS adapter (90%)
2. System board (10%)

1121 A high speed SAS adapter has failed.
Explanation: A fault has been detected on a high speed SAS adapter.
In the sequence shown, exchange the FRUs for new FRUs. Go to the repair verification MAP.
1. High speed SAS adapter (90%)
2. System board (10%)

1122 A high speed SAS adapter error has occurred.
Explanation: The high speed SAS adapter has detected a PCI bus error and requires service before it can be restarted. The high speed SAS adapter failure has caused all of the flash drives that were being accessed through this adapter to go Offline.
If this is the first time that this error has occurred on this node, complete the following steps:
1. Power off the node.
2. Reseat the high speed SAS adapter.
3. Power on the node.
4. Submit the lsmdisk task and ensure that all of the flash drive managed disks that are located in this node have a status of Online.
If the sequence of actions above has not resolved the problem or the error occurs again on the same node, complete the following steps:
1. In the sequence shown, exchange the FRUs for new FRUs.
2. Submit the lsmdisk task and ensure that all of the flash drive managed disks that are located in this node have a status of Online.
3. Go to the repair verification MAP.
1. High speed SAS adapter (90%)
2. System board (10%)

1124 Power Supply Unit fault type 1
Explanation: A fault has been detected on a power supply unit (PSU).
Replace the PSU.
Attention: To avoid losing state and data from the node, use the satask startservice command to put the node into service state so that it no longer processes I/O. Then, you can remove and replace the top power supply unit (PSU 2). This precaution is due to a limitation in the power-supply configuration. Once the service action is complete, run the satask stopservice command to let the node rejoin the system.
PSU (100%)

1125 Power Supply Unit fault type 1
Explanation: The power supply unit (PSU) is not supported.
Replace the PSU with a supported version.
Attention: To avoid losing state and data from the node, use the satask startservice command to put the node into service state so that it no longer processes I/O. Then, you can remove and replace the top power supply unit (PSU 2). This precaution is due to a limitation in the power-supply configuration. Once the service action is complete, run the satask stopservice command to let the node rejoin the system.

PSU (100%)

1126 Power Supply Unit fault type 2
Explanation: A fault exists on the power supply unit (PSU).
1. Reseat the PSU in the enclosure.
Attention: To avoid losing state and data from the node, use the satask startservice command to put the node into service state so that it no longer processes I/O. Then, you can remove and replace the top power supply unit (PSU 2). This precaution is due to a limitation in the power-supply configuration. Once the service action is complete, run the satask stopservice command to let the node rejoin the system.
2. If the fault is not resolved, replace the PSU.
1. No Part (30%)
2. PSU (70%)

1128 Power Supply Unit missing
Explanation: The power supply unit (PSU) is not seated in the enclosure, or no PSU is installed.
1. If no PSU is installed, install a PSU.
2. If a PSU is installed, reseat the PSU in the enclosure.
Attention: To avoid losing state and data from the node, use the satask startservice command to put the node into service state so that it no longer processes I/O. Then, you can remove and replace the top power supply unit (PSU 2). This precaution is due to a limitation in the power-supply configuration. Once the service action is complete, run the satask stopservice command to let the node rejoin the system.
1. No Part (5%)
2. PSU (95%)
Reseat the power supply unit in the enclosure.
Power supply (100%)

1129 The node battery is missing.
Explanation: Install new batteries to enable the node to join a clustered system.
Install a battery in battery slot 1 (on the left from the front) and in battery slot 2 (on the right). Leave the node running as you add the batteries.
Align each battery so that the guide rails in the enclosure engage the guide rail slots on the battery. Push the battery firmly into the battery bay until it stops. The cam on the front of the battery remains closed during this installation.
To verify that the new battery works correctly, check that the node error is fixed. After the node joins a clustered system, use the lsnodebattery command to view information about the battery.
• Battery (100%)

1130 The node battery requires replacing.
Explanation: When a battery must be replaced, you get this message. The proper response is to install new batteries.
Battery 1 is on the left (from the front), and battery 2 is on the right. Remove the old battery by disengaging and pulling down the cam handle to lever out the battery enough to pull the battery from the enclosure.
This service procedure is intended for a failed or offline battery. To prevent losing data from a battery that is online, run the svctask chnodebattery -remove -battery battery_id node_id command. Running the command verifies when it is safe to remove the battery.
Install new batteries in battery slot 1 and in battery slot 2. Leave the node running as you add the batteries.
Align each battery so that the guide rails in the enclosure engage the guide rail slots on the battery. Push the battery firmly into the battery bay until it stops. The cam on the front of the battery remains closed during this installation.
To verify that the new battery works correctly, check that the node error is fixed. After the node joins a clustered system, use the lsnodebattery command to view information about the battery.

Battery conditioning is required but not possible.
Explanation: Battery conditioning is required but not possible.
This error can be corrected on its own. For example, if the partner node comes online, the reconditioning begins.
Wait, or address other errors.
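
The 1130 entry above gives the removal command in the form `svctask chnodebattery -remove -battery battery_id node_id`. The Python sketch below only assembles that command line from its two arguments so that it can be reviewed before it is typed into the CLI. The helper name is hypothetical, nothing is executed, and no system is contacted.

```python
import shlex


def battery_remove_command(battery_id, node_id):
    """Build the battery-removal command line quoted in the 1130 entry.

    Returns a string only; running it is left to the operator.
    """
    args = [
        "svctask", "chnodebattery",
        "-remove",
        "-battery", str(battery_id),
        str(node_id),
    ]
    # Quote each argument so an unusual node name cannot split the line.
    return " ".join(shlex.quote(arg) for arg in args)
```

With a battery ID of 1 and an assumed node name of `node1`, the helper returns `svctask chnodebattery -remove -battery 1 node1`.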

A duplicate WWNN has been detected.
Explanation: The cluster is reporting that a node is not operational because of critical node error 556. See the details of node error 556 for more information.
See node error 556.

UPS ambient temperature threshold exceeded
Explanation: The system UPS has reported an ambient over temperature.
1. Power off the node attached to the UPS.
2. Turn off the UPS, and then unplug the UPS from the main power source.
3. Ensure that the UPS air vents are not obstructed.
4. Ensure that the air flow around the UPS is not restricted.
5. Wait for at least five minutes, and then restart the UPS. If the problem remains, check the ambient temperature. Correct the problem. Otherwise, exchange the FRU for a new FRU.
6. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the uninterruptible power supply.
7. Go to repair verification MAP.
UPS-1U assembly (50%)
Other: The system ambient temperature is outside the specification (50%)

1138 Power supply unit input power failed.
Explanation: Power Supply Unit input power failed.
Check the power cord.
1. Check that the power cord is plugged in.
2. Check that the wall power is good.
3. Replace the power cable.
4. Replace the power supply unit.
Power cord (20%)
PSU (5%)
Other: No FRU (75%)

1140 UPS AC input power fault
Explanation: The UPS has reported that it has a problem with the input AC power.
1. Check the input AC power, whether it is missing or out of specification. Correct if necessary. Otherwise, exchange the FRU for a new FRU.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the uninterruptible power supply.
3. Go to repair verification MAP.
• UPS input power cable (10%)
• Electronics assembly (10%)
Other:
• The input AC power is missing (40%)
• The input AC power is not in specification (40%)

1141 UPS AC input power fault
Explanation: The UPS has reported that it has a problem with the input AC power.
1. Check the input AC power, whether it is missing or out of specification. Correct if necessary. Otherwise, exchange the FRU for a new FRU.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the uninterruptible power supply.
3. Go to repair verification MAP.
• UPS input power cable (10%)
• UPS assembly (10%)
Other:
• The input AC power is missing (40%)
• The input AC power is not in specification (40%)

1145 UPS communications fault
Explanation: The signal connection between the system and its UPS is failing.

1. If other nodes that are using this UPS are reporting this error, exchange the UPS for a new one.
2. If only this node is reporting the problem, check the signal cable and exchange the FRUs for new FRUs, one at a time.
3. Check the node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem.
4. Go to the repair verification MAP.

UPS communications fault
Explanation: The signal connection between a node and its UPS is failing.
1. In the sequence that is shown in the log, replace any failing FRUs with new FRUs.
2. Check node status:
• If all nodes show a status of online, mark the error as fixed.
• If any nodes do not show a status of online, go to the start MAP.
• If you return to this step, contact your support center to resolve the problem with the node.
3. Go to the repair verification MAP.

UPS configuration error
Explanation: Data that the system received from the UPS suggests that the UPS power cable, the signal cable, or both, are not connected correctly.
1. Connect the cables correctly. See your product installation guide.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the uninterruptible power supply.
3. Go to repair verification MAP.
• None
Other:
• Configuration error

1151 UPS configuration error
Explanation: Data that the system received from the UPS suggests that the UPS power cable, the signal cable, or both, are not connected correctly.
1. Connect the cables correctly. See your product's installation guide.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the uninterruptible power supply.
3. Go to repair verification MAP.
• None
Other:
• Configuration error

1155 A power domain error has occurred.
Explanation: Both 2145s of a pair are powered by the same uninterruptible power supply.
1. List the 2145s of the cluster and check that 2145s in the same I/O group are connected to a different uninterruptible power supply.
2. Connect one of the 2145s as identified in step 1 to a different uninterruptible power supply.
3. Mark the error that you have just repaired as fixed.
4. Go to repair verification MAP.
• None
Other:
• Configuration error

1160 UPS output overcurrent
Explanation: The UPS reports that too much power is being drawn from it. The power overload warning LED, which is above the load level indicators on the UPS, will be lit.
1. Determine the UPS that is reporting the error from the error event data. Perform the following steps on just this UPS.
2. Check that the UPS is still reporting the error. If the power overload warning LED is no longer on, go to step 6.

3. Ensure that only appropriate systems are receiving power from the UPS. Ensure that there are no switches or disk controllers that are connected to the UPS.
4. Remove each connected input power in turn until the output overload is removed.
5. Exchange the FRUs for new FRUs in the sequence shown, on the overcurrent system.
6. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem.
7. Go to repair verification MAP.
• Power cable assembly (50%)
• Power supply assembly (40%)
• UPS electronics assembly (10%)

1166 UPS output load high
Explanation: The uninterruptible power supply output is possibly connected to a mismatched device.
1. Ensure that there are no other devices that are connected to the UPS.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the 2145 UPS-1U.
3. Go to repair verification MAP.
• UPS assembly (5%)
Other:
• Configuration error (95%)

1175 A problem has occurred with the uninterruptible power supply frame fault (reported by uninterruptible power supply alarm bits).
Explanation: A problem has occurred with the uninterruptible power supply frame fault (reported by the uninterruptible power supply alarm bits).
1. Replace the uninterruptible power supply assembly.
2. Check node status. If all nodes show a status of online, mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the uninterruptible power supply.
3. Go to repair verification MAP.
Uninterruptible power supply assembly (100%)

1179 Too many drives attached to the system.
Explanation: The cluster only supports a fixed number of drives. A drive has been added that makes the number of drives larger than the total number of supported drives per cluster.
1. Disconnect any excessive unmanaged enclosures from the system.
2. Unmanage any offline drives that are not present in the system.
3. Identify unused drives and remove them from the enclosures.
4. Identify arrays of drives that are no longer required.
5. Remove the arrays and remove the drives from the enclosures if they are present.
6. Once there are fewer than 4096 drives in the system, consider re-engineering system capacity by migrating data from small arrays onto large arrays, then removing the small arrays and the drives that formed them. Consider the need for an additional Storwize system in your SAN solution.

Ambient temperature is too high during system startup.
Explanation: The cluster is reporting that a node is not operational because of critical node error 528. See the details of node error 528 for more information.
See node error 528.

The nodes hardware configuration does not meet the minimum requirements.
Explanation: The cluster is reporting that a node is not operational because of critical node error 562. See the details of node error 562 for more information.
See node error 562.

Node software is inconsistent or damaged
Explanation: The cluster is reporting that a node is not operational because of critical node errors 523, 573, 574. See the details of node errors 523, 573, 574 for more information.
See node errors 523, 573, 574.
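
The last three entries above each redirect the reader to one or more critical node errors. As an illustrative Python sketch, that redirection can be held in a lookup keyed by the message text. The dictionary and function names are hypothetical, and only the three messages shown above are included.

```python
# Message text (lowercase, no trailing period) -> critical node errors.
NODE_ERRORS_BY_MESSAGE = {
    "ambient temperature is too high during system startup": (528,),
    "the nodes hardware configuration does not meet the minimum requirements": (562,),
    "node software is inconsistent or damaged": (523, 573, 574),
}


def node_errors_for(message):
    """Return the node errors to consult for a cluster error message.

    Matching ignores case, surrounding whitespace, and a trailing period.
    An unknown message yields an empty tuple.
    """
    key = message.strip().rstrip(".").lower()
    return NODE_ERRORS_BY_MESSAGE.get(key, ())
```

A caller would look up the logged message and then read the details of each returned node error.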

Too many software crashes have occurred.
Explanation: The cluster is reporting that a node is not operational because of critical node error 564. See the details of node error 564 for more information.
User response: See node error 564.

The node is held in the service state.
Explanation: The cluster is reporting that a node is not operational because of critical node error 690. See the details of node error 690 for more information.
User response: See node error 690.

Unexpected node error
Explanation: A node is missing from the cluster. The error that it is reporting is not recognized by the system.
User response: Find the node that is in service state and use the service assistant to determine why it is not active.

Insufficient uninterruptible power supply charge
Explanation: The cluster is reporting that a node is not operational because of critical node error 587, indicating that an incorrect type of UPS was installed.
User response: Exchange the UPS for one of the correct type.

1194 Automatic recovery of offline node has failed.
Explanation: The cluster has an offline node and has determined that one of the candidate nodes matches the characteristics of the offline node. The cluster has attempted but failed to add the node back into the cluster. The cluster has stopped attempting to automatically add the node back into the cluster.

If a node has incomplete state data, it remains offline after it starts. This occurs if the node has had a loss of power or a hardware failure that prevented it from completing the writing of all of the state data to disk. The node reports a node error 578 when it is in this state.

If three attempts to automatically add a matching candidate node to a cluster have been made, but the node has not returned online for 24 hours, the cluster stops automatic attempts to add the node and logs error code 1194 Automatic recovery of offline node failed.

Two possible scenarios when this error event is logged are:
1. The node has failed without saving all of its state data. The node has restarted, possibly after a repair, and shows node error 578 and is a candidate node for joining the cluster. The cluster attempts to add the node into the cluster but does not succeed. After 15 minutes, the cluster makes a second attempt to add the node into the cluster and again does not succeed. After another 15 minutes, the cluster makes a third attempt to add the node into the cluster and again does not succeed. After another 15 minutes, the cluster logs error code 1194. The node never came online during the attempts to add it to the cluster.
2. The node has failed without saving all of its state data. The node has restarted, possibly after a repair, and shows node error 578 and is a candidate node for joining the cluster. The cluster attempts to add the node into the cluster and succeeds and the node becomes online. Within 24 hours the node fails again without saving its state data. The node restarts and shows node error 578 and is a candidate node for joining the cluster. The cluster again attempts to add the node into the cluster, succeeds, and the node becomes online; however, the node again fails within 24 hours. The cluster attempts a third time to add the node into the cluster, succeeds, and the node becomes online; however, the node again fails within 24 hours. After another 15 minutes, the cluster logs error code 1194.
A combination of these scenarios is also possible.
Note: If the node is manually removed from the cluster, the count of automatic recovery attempts is reset to zero.
User response:
1. If the node has been continuously online in the cluster for more than 24 hours, mark the error as fixed and go to the Repair Verification MAP.
2. Determine the history of events for this node by locating events for this node name in the event log. Note that the node ID will change, so match on the WWNN and node name. Also, check the service records. Specifically, note entries indicating one of three events: 1) the node is missing from the cluster (cluster error 1195 event ), 2) an attempt to automatically recover the offline node is starting (event ), 3) the node has been added to the cluster (event ).
3. If the node has not been added to the cluster since the recovery process started, there is probably a hardware problem. The node's internal disk might be failing in a manner that it is unable to modify its software level to match the software level of the cluster. If you have not yet determined the root cause of the problem, you can attempt to manually remove the node from the cluster and add the node back into the cluster. Continuously monitor the status of the nodes in the cluster while the cluster is

attempting to add the node.
Note: If the node type is not supported by the software version of the cluster, the node will not appear as a candidate node. Therefore, incompatible hardware is not a potential root cause of this error.
4. If the node was added to the cluster but failed again before it has been online for 24 hours, investigate the root cause of the failure. If no events in the event log indicate the reason for the node failure, collect dumps and contact IBM technical support for assistance.
5. When you have fixed the problem with the node, you must use either the cluster console or the command line interface to manually remove the node from the cluster and add the node into the cluster.
6. Mark the error as fixed and go to the verification MAP.
Possible Cause-FRUs or other: None, although investigation might indicate a hardware failure.

Node missing.
Explanation: You can resolve this problem by repairing the failure on the missing node.
User response:
1. If it is not obvious which node in the cluster has failed, check the status of the nodes and find the node with a status of offline.
2. Go to the Start MAP and perform the repair on the failing node.
3. When the repair has been completed, this error is automatically marked as fixed.
4. Check node status. If all nodes show a status of online, but the error in the log has not been marked as fixed, manually mark the error that you have just repaired as fixed. If any nodes do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the node.
5. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None

1198 Detected hardware is not a valid configuration.
Explanation: A hardware change was made to this node that is not supported by its software. Either a hardware component failed, or the node was incorrectly upgraded.
User response: Complete the following steps:
1. If required, power the node off for servicing.
2. If new hardware is correctly installed, but it is listed as an invalid configuration, then update the software to a level that supports the new hardware. Use the management GUI to install this level if necessary.
3. If you upgraded the software to make the hardware work, there is a new event after the upgrade requesting that you enable the new hardware.
Possible Cause-FRUs or other:
- None

1200 The configuration is not valid. Too many devices, MDisks, or targets have been presented to the system.
Explanation: The configuration is not valid. Too many devices, MDisks, or targets have been presented to the system.
User response:
1. Remove unwanted devices from the Fibre Channel network fabric.
2. Start a cluster discovery operation to find devices/disks by rescanning the Fibre Channel network.
3. List all connected managed disks. Check with the customer that the configuration is as expected. Mark the error that you have just repaired as fixed.
4. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other: Fibre Channel network fabric fault (100%)

1201 A flash drive requires recovery.
Explanation: The flash drive that is identified by this error needs to be recovered.
User response: To recover this flash drive, submit the following command:
chdrive -task recover drive_id
where drive_id is the identity of the drive that needs to be recovered.

A flash drive is missing from the configuration.
Explanation: The offline flash drive identified by this error must be repaired.
User response: In the management GUI, click Troubleshooting > Recommended Actions to run the recommended action for this error. Otherwise, use MAP 6000 to replace the drive.

A duplicate Fibre Channel frame has been received.
Explanation: A duplicate Fibre Channel frame should never be detected. Receiving a duplicate Fibre Channel frame indicates that there is a problem with the Fibre Channel fabric. Other errors related to the Fibre Channel fabric might be generated.
User response:
1. Use the transmitting and receiving WWPNs indicated in the error data to determine the section of the Fibre Channel fabric that has generated the duplicate frame. Search for the cause of the problem by using fabric monitoring tools. The duplicate frame might be caused by a design error in the topology of the fabric, by a configuration error, or by a software or hardware fault in one of the components of the Fibre Channel fabric, including inter-switch links.
2. When you are satisfied that the problem has been corrected, mark the error that you have just repaired as fixed.
3. Go to MAP 5700: Repair verification.
Possible Cause-FRUs or other:
- Fibre Channel cable assembly (1%)
- Fibre Channel adapter (1%)
Other:
- Fibre Channel network fabric fault (98%)

1210 A local Fibre Channel port has been excluded.
Explanation: A local Fibre Channel port has been excluded.
User response:
1. Repair faults in the order shown.
2. Check the status of the disk controllers. If all disk controllers show a good status, mark the error that you just repaired as fixed.
3. Go to repair verification MAP.
Possible Cause-FRUs or other:
- Fibre Channel cable assembly (75%)
- Small Form-factor Pluggable (SFP) connector (10%)
- Fibre Channel adapter (5%)
Other:
- Fibre Channel network fabric fault (10%)

1212 Power supply exceeded temperature threshold.
Explanation: Power supply exceeded temperature threshold.
User response: Complete the following steps:
1. Check airflow. Remove the top of the machine case and check for missing baffles or internal blockages.
2. If the problem persists, replace the power supply.
Possible Cause-FRUs or other:
- Power supply

1213 Boot drive missing, out of sync, or failed.
Explanation: Boot drive missing, out of sync, or failed.
User response: Complete the following steps:
1. Look at a boot drive view to determine the missing, failed or out of sync drive.
2. Insert a missing drive.
3. Replace a failed drive.
4. Synchronize an out of sync drive by running the commands svctask chnodebootdrive -sync and/or satask chbootdrive -sync.
Possible Cause-FRUs or other:
- System drive

1214 Boot drive is in the wrong slot.
Explanation: Boot drive is in the wrong slot.
User response: Complete the following steps:
1. Look at a boot drive view to determine which drive is in the wrong slot, which node and slot it belongs in, and which drive must be in this slot.
2. Swap the drive for the correct one but shut down the node first if booted yes is shown for that drive in boot drive view.
3. If you want to use the drive in this node, synchronize the boot drives by running the commands svctask chnodebootdrive -sync and/or satask chbootdrive -sync.
4. The node error clears, or a new node error is displayed for you to work on.
Possible Cause-FRUs or other:
- None

1215 A flash drive is failing.
Explanation: The flash drive has detected faults that indicate that the drive is likely to fail soon. The drive should be replaced. The cluster event log will identify a drive ID for the flash drive that caused the error.

User response: In the management GUI, click Troubleshooting > Recommended Actions to run the recommended action for this error. If this does not resolve the issue, contact your next level of support.

SAS errors have exceeded thresholds.
Explanation: The cluster has experienced a large number of SAS communication errors, which indicates a faulty SAS component that must be replaced.
User response: In the sequence shown, exchange the FRUs for new FRUs. Go to the repair verification MAP.
Possible Cause-FRUs or other:
1. SAS cable (70%)
2. High speed SAS adapter (20%)
3. SAS drive backplane (5%)
4. Flash drive (5%)

1217 A flash drive has exceeded the temperature warning threshold.
Explanation: The flash drive identified by this error has reported that its temperature is higher than the warning threshold.
User response: Take steps to reduce the temperature of the drive.
1. Determine the temperature of the room, and reduce the room temperature if this action is appropriate.
2. Replace any failed fans.
3. Ensure that there are no obstructions to air flow for the node.
4. Mark the error as fixed. If the error recurs, contact hardware support for further investigation.
Possible Cause-FRUs or other:
- Flash drive (10%)
Other:
- System environment or airflow blockage (90%)

1220 A remote Fibre Channel port has been excluded.
Explanation: A remote Fibre Channel port has been excluded.
User response:
1. View the event log. Note the MDisk ID associated with the error code.
2. From the MDisk, determine the failing disk controller ID.
3. Refer to the service documentation for the disk controller and the Fibre Channel network to resolve the reported problem.
4. After the disk drive is repaired, start a cluster discovery operation to recover the excluded Fibre Channel port by rescanning the Fibre Channel network.
5. To restore MDisk online status, include the managed disk that you noted in step 1.
6. Check the status of the disk controller. If all disk controllers show a good status, mark the error that you have just repaired as fixed.
7. If all disk controllers do not show a good status, contact your support center to resolve the problem with the disk controller.
8. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other:
- Enclosure/controller fault (50%)
- Fibre Channel network fabric (50%)

1230 A login has been excluded.
Explanation: A port to port fabric connection, or login, between the cluster node and either a controller or another cluster has had excessive errors. The login has therefore been excluded, and will not be used for I/O operations.
User response: Determine the remote system, which might be either a controller or a cluster. Check the event log for other 1230 errors. Ensure that all higher priority errors are fixed. This error event is usually caused by a fabric problem. If possible, use the fabric switch or other fabric diagnostic tools to determine which link or port is reporting the errors. If there are error events for links from this node to a number of different controllers or clusters, then it is probably the node to switch link that is causing the errors. Unless there are other contrary indications, first replace the cable between the switch and the remote system.
1. From the fabric analysis, determine the FRU that is most likely causing the error. If this FRU has recently been replaced while resolving a 1230 error, choose the next most likely FRU that has not been replaced recently. Exchange the FRU for a new FRU.
2. Mark the error as fixed. If the FRU replacement has not fixed the problem, the error will be logged again; however, depending on the severity of the problem, the error might not be logged again immediately.
3. Start a cluster discovery operation to recover the login by re-scanning the Fibre Channel network.
4. Check the status of the disk controller or remote cluster. If the status is not good, go to the Start MAP.

5. Go to repair verification MAP.
Possible Cause-FRUs or other:
- Fibre Channel cable, switch to remote port, (30%)
- Switch or remote device SFP connector or adapter, (30%)
- Fibre Channel cable, local port to switch, (30%)
- Cluster SFP connector, (9%)
- Cluster Fibre Channel adapter, (1%)
Note: The first two FRUs are not cluster FRUs.

SAS cable fault type 2.
Explanation: The associated alert event contains more information about the error:
SAS cable excluded due to internal errors
    The cable was excluded because one or more phys (lanes of communication) are missing.
SAS cable excluded due to causing too many change events
    The connector port caused too many change events.
SAS cable is operating at a reduced speed
    If the cable is not the last path to data, a reduced speed causes it to be excluded.
SAS cable excluded due to dropped frames
    Frame errors occurred.
SAS cable excluded due to enclosure discovery timing out
    Enclosure discovery timed out before the cable could be identified.
SAS cable excluded due to Single Port Active drives
    The connector or the attached canister might be the cause of multiple single-ported drives.
Attempts to exclude connector have failed
    Multiple attempts to exclude the failing connector did not change the connector state.
SAS cable is not working at full capacity
    Some of the physical data paths in the cable are not working properly. This error is logged only if no other events are logged on the cable.
In all cases, the user response is the same.
User response: Complete the following steps:
Note: After each action, check to see whether the canister ports at both ends of the cable are excluded. If the ports are excluded, then enable them by issuing the following command:
chenclosurecanister -excludesasport no -port X
1. Reset this canister and the upstream canister. The upstream canister is identified in sense data as enclosureid2, faultobjectlocation.
2. Reseat the cable between the two ports that are identified in the sense data.
3. Replace the cable between the two ports that are identified in the sense data.
4. Replace this canister.
5. Replace the other canister (enclosureid2).
Possible Cause-FRUs or other:
- SAS cable
- Canister

1266 SEM Fault Type 1
Explanation: An unrecoverable error occurred involving a secondary expander module (SEM). The SEM must be replaced.
User response: Complete the following steps:
1. Enable maintenance mode for the I/O group.
2. Slide the enclosure out of the rack sufficiently to open the access lid.
3. Remove the failed SEM.
4. Insert the replacement SEM.
5. Close the access lid.
6. Slide the enclosure back into the rack.
7. Maintenance mode will disable automatically after 30 minutes, or you can disable it manually.
8. If the error does not autofix, contact your service support representative.

Enclosure secondary expander module is missing
Explanation: An error occurred involving a secondary expander module (SEM). You might be able to resolve the problem by reseating the SEM. The alert event gives more information about the error:
Enclosure secondary expander module has failed
    A SEM is offline and might have failed.
Enclosure secondary expander module temperature sensor cannot be read
    A SEM temperature sensor could not be read.
Enclosure secondary expander module connector excluded due to too many change events
    A SEM is in a degraded state due to too many transient errors.
Enclosure secondary expander module is missing
    A SEM was removed from the disk drawer for an enclosure.
Enclosure secondary expander module connector excluded due to dropped frames
    An internal SAS connector in the enclosure is in a degraded state due to too many Virtual LUN Manager login errors.

Enclosure secondary expander module connector is excluded and cannot be unexcluded
    An internal SAS connector in the enclosure was excluded and cannot be included.
Enclosure secondary expander module connectors excluded as the cause of single ported drives
    SEM connectors were excluded because slot ports under them were unreachable.
Enclosure secondary expander module leaf expander connector excluded as the cause of single ported drives
    An SEM leaf expander connector was excluded because slot ports under it were unreachable.
User response: Complete the following steps:
1. Reseat the SEM:
   a. Enable maintenance mode for the I/O group.
   b. Slide the enclosure out of the rack sufficiently to open the access lid.
   c. Remove the designated SEM.
   d. Reinsert the designated SEM.
   e. Maintenance mode will disable automatically after 30 minutes, or you can disable it manually.
2. If the error autofixes, close up the enclosure:
   a. Close the access lid.
   b. Slide the enclosure back into the rack.
3. If the error does not autofix, replace the SEM:
   a. Enable maintenance mode for the I/O group.
   b. Slide the enclosure out of the rack sufficiently to open the access lid.
   c. Remove the failed SEM.
   d. Insert the replacement SEM.
   e. Close the access lid.
   f. Slide the enclosure back into the rack.
   g. Maintenance mode will disable automatically after 30 minutes, or you can disable it manually.

Enclosure Display Panel Fault Type 2
Explanation: A problem was found with the display panel for the enclosure. The alert event gives more information about the error:
Enclosure display panel is not installed
    The display panel is offline and might be missing.
Enclosure display panel temperature sensor cannot be read
    The temperature sensor for the display panel could not be read.
Enclosure display panel VPD cannot be read
    The Vital Product Data (VPD) for the display panel could not be read.
User response: Complete the following steps:
1. Reseat the display panel:
   a. Put the system into maintenance mode.
   b. Slide the enclosure out of the rack sufficiently to remove the top cover and remove the top cover.
   c. Locate the display panel access handle.
   d. Pinch the sides of the display panel handle and remove the display panel module.
   e. Reinsert the display panel module.
   f. Replace the cover and slide the enclosure back into the rack.
   g. Turn off maintenance mode.
2. If the error does not clear, replace the display panel:
   a. Turn on maintenance mode.
   b. Slide the enclosure out of the rack sufficiently to remove the top cover and remove the top cover.
   c. Locate the display panel access handle.
   d. Pinch the sides of the display panel handle and remove the display panel module.
   e. Insert the replacement display panel module.
   f. Replace the cover and slide the enclosure back into the rack.
   g. Turn off maintenance mode.
3. If the error does not clear, the enclosure might need to be replaced. Contact your service support representative.

A node has encountered an error updating.
Explanation: One or more nodes has failed the update.
User response: Check lsupdate for the node that failed and continue troubleshooting with the error code it provides.

IO port configuration issue
Explanation: A port that was configured for N_Port ID virtualization (NPIV) is off line.
User response: Complete the following procedures:
1. Check the switch configuration to ensure that NPIV is enabled and that resource limits are sufficient.
2. Run the detectmdisk command and wait 30 seconds after the discovery completes to see if the event fixes itself.
3. If the event does not fix itself, contact IBM Support.

A managed disk is reporting excessive errors.
Explanation: A managed disk is reporting excessive errors.

User response:
1. Repair the enclosure/controller fault.
2. Check the managed disk status. If all managed disks show a status of online, mark the error that you have just repaired as fixed. If any managed disks show a status of excluded, include the excluded managed disks and then mark the error as fixed.
3. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other: Enclosure/controller fault (100%)

1311 A flash drive is offline due to excessive errors.
Explanation: The drive that is reporting excessive errors has been taken offline.
User response: In the management GUI, click Troubleshooting > Recommended Actions to run the recommended action for this error. If this does not resolve the issue, contact your next level of support.

A disk I/O medium error has occurred.
Explanation: A disk I/O medium error has occurred.
User response:
1. Check whether the volume the error is reported against is mirrored. If it is, check if there is a 1870 Mirrored volume offline because a hardware read error has occurred error relating to this volume in the event log. Also check if one of the mirror copies is synchronizing. If all these tests are true then you must delete the volume copy that is not synchronized from the volume. Check that the volume is online before continuing with the following actions. Wait until the medium error is corrected before trying to re-create the volume mirror.
2. If the medium error was detected by a read from a host, ask the customer to rewrite the incorrect data to the block logical block address (LBA) that is reported in the host system's SCSI sense data. If an individual block cannot be recovered it will be necessary to restore the volume from backup. (If this error has occurred during a migration, the host system does not notice the error until the target device is accessed.)
3. If the medium error was detected during a mirrored volume synchronization, the block might not be being used for host data. The medium error must still be corrected before the mirror can be established. It may be possible to fix the block that is in error using the disk controller or host tools. Otherwise, it will be necessary to use the host tools to copy the volume content that is being used to a new volume. Depending on the circumstances, this new volume can be kept and mirrored, or the original volume can be repaired and the data copied back again.
4. Check managed disk status. If all managed disks show a status of online, mark the error that you have just repaired as fixed. If any managed disks do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the disk controller.
5. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other: Enclosure/controller fault (100%)

1322 Data protection information mismatch.
Explanation: This error occurs when something has broken the protection information in read or write commands.
User response:
1. Determine if there is a single or multiple drives logging the error. Because the SAS transport layer can cause multiple drive errors, it is necessary to fix other hardware errors first.
2. Check related higher priority hardware errors. Fix higher priority errors before continuing.
3. Use lseventlog to determine if more than one drive with this error has been logged in the last 24 hours. If so, contact IBM support.
4. If only a single drive with this error has been logged, the system is monitoring the drive for health and will fail if RAID is used to correct too many errors of this kind.

Encryption key required.
Explanation: It is necessary to provide an encryption key before the system can become fully operational. This error occurs when a system with encryption enabled is restarted without an encryption key available.
User response: Connect a USB flash drive or key server that contains the current key for this system to one or more of the nodes.

A suitable managed disk (MDisk) or drive for use as a quorum disk was not found.
Explanation: A quorum disk is needed to enable a tie-break when some cluster members are missing. Three quorum disks are usually defined. By default, the

cluster automatically allocates quorum disks when managed disks are created; however, the option exists to manually assign quorum disks. This error is reported when there are managed disks or image mode disks but no quorum disks.
To become a quorum disk:
- The MDisk must be accessible by all nodes in the cluster.
- The MDisk must be managed; that is, it must be a member of a storage pool.
- The MDisk must have free extents.
- The MDisk must be associated with a controller that is enabled for quorum support. If the controller has multiple WWNNs, all of the controller components must be enabled for quorum support.
A quorum disk might not be available because of a Fibre Channel network failure or because of a Fibre Channel switch zoning problem.
User response:
1. Resolve any known Fibre Channel network problems.
2. Ask the customer to confirm that MDisks have been added to storage pools and that those MDisks have free extents and are on a controller that is enabled for use as a provider of quorum disks. Ensure that any controller with multiple WWNNs has all of its components enabled to provide quorum disks. Either create a suitable MDisk or if possible enable quorum support on controllers with which existing MDisks are associated. If at least one managed disk shows a mode of managed and has a non-zero quorum index, mark the error that you have just repaired as fixed.
3. If the customer is unable to make the appropriate changes, ask your software support center for assistance.
4. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other: Configuration error (100%)

1335 Quorum disk not available.
Explanation: Quorum disk not available.
User response:
1. View the event log entry to identify the managed disk (MDisk) that was being used as a quorum disk and is no longer available.
2. Perform the disk controller problem determination and repair procedures for the MDisk identified in step 1.
3. Include the MDisks into the cluster.
4. Check the managed disk status. If the managed disk identified in step 1 shows a status of online, mark the error that you have just repaired as fixed. If the managed disk does not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the disk controller.
5. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other: Enclosure/controller fault (100%)

1340 A managed disk has timed out.
Explanation: This error was reported because a large number of disk timeout conditions have been detected. The problem is probably caused by a failure of some other component on the SAN.
User response:
1. Repair problems on all enclosures or controllers and switches on the same SAN as this 2145 cluster.
2. If problems are found, mark this error as fixed.
3. If no switch or disk controller failures can be found, take an event log dump and call your hardware support center.
4. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other:
- Enclosure/controller fault
- Fibre Channel (FC) switch

1350 IB ports are not operational.
Explanation: IB ports are not operational.
User response: An offline port can have many causes and so it is necessary to check them all. Start with the easiest and least intrusive possibility.
1. Reset the IB port with CLI command.
2. If the IB port is connected to a switch, double-check the switch configuration for issues.
3. Reseat the IB cable on both the IB side and the HBA/switch side.
4. Run a temporary second IB cable to replace the current one to check for a cable fault.
5. If the system is in production, schedule a maintenance downtime before continuing to the next step. Other ports will be affected.

6. Reset the IB interface adapter; reset the node; reboot the system.
Possible Cause-FRUs or other:
- External (cable, HCA, switch, and so on) (85%)
- Interface (10%)
- Node (5%)

1360 A SAN transport error occurred.
Explanation: This error has been reported because the 2145 performed error recovery procedures in response to SAN component associated transport errors. The problem is probably caused by a failure of some other component on the SAN.
User response:
1. View the event log entry to determine the node that logged the problem. Determine the 2145 node or controller that the problem was logged against.
2. Perform Fibre Channel (FC) switch problem determination and repair procedures for the switches connected to the 2145 node or controller.
3. Perform FC cabling problem determination and repair procedures for the cables connected to the 2145 node or controller.
4. If any problems are found and resolved in steps 2 and 3, mark this error as fixed.
5. If no switch or cable failures were found in steps 2 and 3, take an event log dump. Call your hardware support center.
6. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other:
- FC switch
- FC cabling

1370 A managed disk error recovery procedure (ERP) has occurred.
Explanation: This error was reported because a large number of disk error recovery procedures have been performed by the disk controller. The problem is probably caused by a failure of some other component on the SAN.
User response:
1. View the event log entry and determine the managed disk that was being accessed when the problem was detected.
2. Perform the disk controller problem determination and repair procedures for the MDisk determined in step 1.
3. Perform problem determination and repair procedures for the Fibre Channel (FC) switches connected to the 2145 and any other FC network components.
4. If any problems are found and resolved in steps 2 and 3, mark this error as fixed.
5. If no switch or disk controller failures were found in steps 2 and 3, take an event log dump. Call your hardware support center.
6. Go to repair verification MAP.
Possible Cause-FRUs or other:
- None
Other:
- Enclosure/controller fault
- Fibre Channel (FC) switch

1400 Ethernet port failure
Explanation: The system cannot detect an Ethernet connection.
User response:
1. Go to the Ethernet MAP.
2. Go to the repair verification MAP.

External port not operational.
Explanation: If this error occurs when a port was initially online and subsequently went offline, it indicates:
- the server, HBA, CNA or switch has been turned off.
- there is a physical issue.
If this error occurs during an initial setup or during a setup change, it is most likely a configuration issue rather than a physical issue.
User response:
1. Reset the port via the CLI command Maintenance. If the port is now online, the DMP is complete.
2. If the port is connected to a switch, check the switch to make sure the port is not disabled. Check the switch vendor troubleshooting documentation for other possibilities. If the port is now online, the DMP is complete.
3. Reseat the cable. This includes plugging in the cable and SFP if not already done. If the port is now online, the DMP is complete.
4. Reseat the hot swap SFPs (optics modules). If the port is now online, the DMP is complete.
5. Try using a new cable.
6. Try using a new SFP.

7. Try using a new port on the switch.
Note: Continuing from here will affect other ports connected on the adapter.
8. Reset the adapter.
9. Reset the node.

Cloud gateway service restarted too often
Explanation: The system reported a persistent error with the cloud gateway service. Cloud storage functions are not available.
User response: Try the following actions:
1. Check the IP network. For example, ensure that all network switches report good status.
2. Update the system to the latest code.
3. If the problem persists, contact your service support representative.

Fewer Fibre Channel I/O ports operational.
Explanation: One or more Fibre Channel I/O ports that have previously been active are now inactive. This situation has continued for one minute. A Fibre Channel I/O port might be established on either a Fibre Channel platform port or an Ethernet platform port using FCoE. This error is expected if the associated Fibre Channel or Ethernet port is not operational.
Data: Three numeric values are listed:
- The ID of the first unexpected inactive port. This ID is a decimal number.
- The ports that are expected to be active, which is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is expected to be active.
- The ports that are actually active, which is a hexadecimal number. Each bit position represents a port, with the least significant bit representing port 1. The bit is 1 if the port is active.
User response:
1. If possible, use the management GUI to run the recommended actions for the associated service error code.
2. Follow the procedure for mapping I/O ports to platform ports to determine which platform port is providing this I/O port.
3. Check for any 704 (Fibre channel platform port not operational) or 724 (Ethernet platform port not operational) node errors reported for the platform port.
4. Possibilities:
   - If the port has been intentionally disconnected, use the management GUI recommended action for the service error code and acknowledge the intended change.
   - Resolve the 704 or 724 error.
   - If this is an FCoE connection, use the information the view gives about the Fibre Channel forwarder (FCF) to troubleshoot the connection between the port and the FCF.
Possible Cause-FRUs or other cause:
- None

1471 Interface card is unsupported.
Explanation: Interface adapter is unsupported.
User response: Replace the wrong interface adapter with the correct type.
Possible Cause-FRUs or other: Interface adapter (100%)

1472 Boot drive is in an unsupported slot.
Explanation: Boot drive is in an unsupported slot.
User response: Complete the following steps:
1. Look at a boot drive view to determine which drive is in an unsupported slot.
2. Move the drive back to its correct node and slot, but shut down the node first if booted yes is shown for that drive in boot drive view.
3. The node error clears, or a new node error is displayed for you to work on.
Possible Cause-FRUs or other:
- None

1473 The installed battery is at a hardware revision level that is not supported by the current code level.
Explanation: The installed battery is at a hardware revision level that is not supported by the current code level.
User response: To replace the battery with one that is supported by the current code level, follow the service action for 1130 on page 191. To update the code level to one that supports the currently installed battery, perform a service mode code update. Always install the latest level of the system software to avoid problems with upgrades and component compatibility.
Possible Cause-FRUs or other:
- Battery (50%)
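
As an illustration of the Data values listed earlier for the Fewer Fibre Channel I/O ports operational event, the following Python sketch decodes the two hexadecimal port masks and lists the ports that are expected to be active but are not. Only the bit layout (least significant bit represents port 1) comes from the text. The function names and the sample values 0F and 0B are hypothetical and are not part of the product or its CLI.

```python
def decode_port_mask(mask_hex):
    """Return the set of 1-based port numbers whose bits are set in a hex mask."""
    mask = int(mask_hex, 16)
    ports = set()
    bit = 0
    # Walk the mask from the least significant bit (port 1) upward.
    while mask >> bit:
        if (mask >> bit) & 1:
            ports.add(bit + 1)
        bit += 1
    return ports


def missing_ports(expected_hex, actual_hex):
    """Return the sorted ports that are expected to be active but are not."""
    return sorted(decode_port_mask(expected_hex) - decode_port_mask(actual_hex))


# Hypothetical sample values: ports 1-4 expected (0F), ports 1, 2 and 4 active (0B).
print(missing_ports("0F", "0B"))  # prints [3]
```

The lowest port number in the result should correspond to the first data value, the ID of the first unexpected inactive port.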

Battery is nearing end of life.
Explanation: When a battery nears the end of its life, you must replace it if you intend to preserve the capacity to failover power to batteries.
Replace the battery by following this procedure as soon as you can.
If the node is in a clustered system, ensure that the battery is not being relied upon to provide data protection before you remove it. Issue the chnodebattery -remove -battery battery_id node_id command to establish the lack of reliance on the battery.
If the command returns with a "The command has failed because the specified battery is offline" (BATTERY_OFFLINE) error, replace the battery immediately.
If the command returns with a "The command has failed because the specified battery is not redundant" (BATTERY_NOT_REDUNDANT) error, do not remove the relied-on battery. Removing the battery compromises data protection. In this case, without other battery-related errors, use the chnodebattery -remove -battery battery_id node_id command periodically to force the system to remove reliance on the battery. The system often removes reliance within one hour (TBC). Alternatively, remove the node from the clustered system. Once the node is independent, you can replace its battery immediately.
If the node is not part of a cluster, or the battery is offline, or the chnodebattery command returns without error, conduct the service action for 1130 on page 191.
Possible Cause-FRUs or other cause:
• Battery (100%)

1475 Battery is too hot.
Explanation: Battery is too hot.
The battery might be slow to cool if the ambient temperature is high. You must wait for the battery to cool down before it can resume its normal operation. If node error 768 is reported, service that as well.

Battery is too cold.
Explanation: You must wait for the battery to warm before it can resume its normal operation.
The battery might be slow to warm if the ambient temperature is low. If node error 768 is reported, service that as well. Otherwise, wait for the battery to warm.

A cluster path has failed.
Explanation: One of the Fibre Channel ports is unable to communicate with all of the other ports in the cluster.
1. Check for incorrect switch zoning.
2. Repair the fault in the Fibre Channel network fabric.
3. Check the status of the node ports that are not excluded via the system's local port mask. If the status of the node ports shows as active, mark the error that you have repaired as fixed. If any node ports do not show a status of active, go to start MAP. If you return to this step, contact your support center to resolve the problem.
4. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other: Fibre Channel network fabric fault (100%)

1570 Quorum disk configured on controller that has quorum disabled
Explanation: This error can occur with a storage controller that can be accessed through multiple WWNNs and have a default setting of not allowing quorum disks. When these controllers are detected by a cluster, although multiple component controller definitions are created, the cluster recognizes that all of the component controllers belong to the same storage system. To enable the creation of a quorum disk on this storage system, all of the controller components must be configured to allow quorum.
A configuration change to the SAN, or to a storage system with multiple WWNNs, might result in the cluster discovering new component controllers for the storage system. These components will take the default setting for allowing quorum. This error is reported if there is a quorum disk associated with the controller and the default setting is not to allow quorum.
• Determine if there should be a quorum disk on this storage system. Ensure that the controller supports quorum before you allow quorum disks on any disk controller. You can check for more information.
• If a quorum disk is required on this storage system, allow quorum on the controller component that is reported in the error. If the quorum disk should not be on this storage system, move it elsewhere.
• Mark the error as fixed.
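The quorum rule described for error 1570 (every component controller of a storage system must allow quorum before a quorum disk can be created on it) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the ComponentController record, its field names, and the sample controller and system names are hypothetical and are not taken from the product's CLI output.

```python
from dataclasses import dataclass


@dataclass
class ComponentController:
    # All three fields are hypothetical names used only for this sketch.
    name: str            # label of one component controller definition (one WWNN)
    storage_system: str  # identifier shared by all components of one storage system
    allow_quorum: bool   # whether this component is configured to allow quorum


def components_blocking_quorum(components, storage_system):
    """Return the names of components of storage_system that do not allow quorum."""
    return [
        c.name
        for c in components
        if c.storage_system == storage_system and not c.allow_quorum
    ]


# Sample data: controller1 stands for a newly discovered component that took
# a default setting of not allowing quorum.
components = [
    ComponentController("controller0", "sysA", True),
    ComponentController("controller1", "sysA", False),
    ComponentController("controller2", "sysB", True),
]

print(components_blocking_quorum(components, "sysA"))
```

An empty result for a storage system means that, in this simplified model, no component controller stands in the way of a quorum disk on that system.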

Possible Cause-FRUs or other cause:
• None
Other: Fibre Channel network fabric fault (100%)

1580 Hostname cannot be resolved
Explanation: The system cannot determine the IP address to connect to.
Try the following actions to determine the source of the problem:
1. Verify that the configured DNS server settings are correct.
   a. Check the output from the lsdnserver command and verify that the configured IP addresses are correct.
   b. Try to ping the configured DNS servers by entering svctask ping -srcip4 source_ip_address target_ip_address.
   c. If the ping command fails, enter sainfo traceroute dns_server and save the output. Contact your service support representative.
2. Verify that DNS is working by entering sainfo host
3. Verify the host name by entering sainfo host host_name where host_name is the name of the host for which the error was raised. If the system is able to resolve this host name, the issue is now resolved. Manually mark the alert as fixed.
4. If the system cannot resolve the host name, contact your service support representative.

Could not connect to DNS server
Explanation: An invalid DNS server IP was provided, or the DNS server was unresponsive.
Try the following actions:
1. Check the output from the lsdnserver command and verify that the configured IP addresses are correct.
2. Try to ping the configured DNS servers by entering svctask ping dns_server.
3. If the ping command fails, enter sainfo traceroute dns_server and save the output. Contact your service support representative.

Invalid hostname specified
Explanation: An invalid host name was specified, or the DNS server was not able to resolve the host name in its database.
Try the following actions:
1. Check that the host name looks correct.
2. Try to ping the host by entering svctask ping host_name.
3. Verify that DNS is working by entering sainfo host
4. Verify the host name by entering sainfo host host_name. If the system is able to resolve this host name, the issue is now resolved. Manually mark the alert as fixed.
5. If the system cannot resolve the host name, contact your service support representative.

Mirrored disk repair halted because of difference.
Explanation: During the repair of a mirrored volume two copy disks were found to contain different data for the same logical block address (LBA). The validate option was used, so the repair process has halted.
Read operations to the LBAs that differ might return the data of either volume copy. Therefore it is important not to use the volume unless you are sure that the host applications will not read the LBAs that differ or can manage the different data that potentially can be returned.
Perform one of the following actions:
• Continue the repair starting with the next LBA after the difference to see how many differences there are for the whole mirrored volume. This can help you decide which of the following actions to take.
• Choose a primary disk and run repair resynchronizing differences.
• Run a repair and create medium errors for differences.
• Restore all or part of the volume from a backup.
• Decide which disk has correct data, then delete the copy that is different and re-create it allowing it to be synchronized.
Then mark the error as fixed.
Possible Cause-FRUs or other cause:
• None

1610 There are too many copied media errors on a managed disk.
Explanation: The cluster maintains a virtual medium error table for each MDisk. This table is a list of logical block addresses on the managed disk that contain data that is not valid and cannot be read. The virtual medium error table has a fixed length. This error event indicates that the system has attempted to add an entry to the table, but the attempt has failed because the table is already full.
There are two circumstances that will cause an entry to be added to the virtual medium error table:

1. FlashCopy, data migration and mirrored volume synchronization operations copy data from one managed disk extent to another. If the source extent contains either a virtual medium error or the RAID controller reports a real medium error, the system creates a matching virtual medium error on the target extent.
2. The mirrored volume validate and repair process has the option to create virtual medium errors on sectors that do not match on all volume copies. Normally zero, or very few, differences are expected; however, if the copies have been marked as synchronized inappropriately, then a large number of virtual medium errors could be created.
Ensure that all higher priority errors are fixed before you attempt to resolve this error.
Determine whether the excessive number of virtual medium errors occurred because of a mirrored disk validate and repair operation that created errors for differences, or whether the errors were created because of a copy operation. Follow the corresponding option shown below.
1. If the virtual medium errors occurred because of a mirrored disk validate and repair operation that created medium errors for differences, then also ensure that the volume copies had been fully synchronized prior to starting the operation. If the copies had been synchronized, there should be only a few virtual medium errors created by the validate and repair operation. In this case, it might be possible to rewrite only the data that was not consistent on the copies using the local data recovery process. If the copies had not been synchronized, it is likely that there are now a large number of medium errors on all of the volume copies. Even if the virtual medium errors are expected to be only for blocks that have never been written, it is important to clear the virtual medium errors to avoid inhibition of other operations. To recover the data for all of these virtual medium errors it is likely that the volume will have to be recovered from a backup using a process that rewrites all sectors of the volume.
2. If the virtual medium errors have been created by a copy operation, it is best practice to correct any medium errors on the source volume and to not propagate the medium errors to copies of the volume. Fixing higher priority errors in the event log would have corrected the medium error on the source volume. Once the medium errors have been fixed, you must run the copy operation again to clear the virtual medium errors from the target volume. It might be necessary to repeat a sequence of copy operations if copies have been made of already copied medium errors.
An alternative that does not address the root cause is to delete volumes on the target managed disk that have the virtual medium errors. This volume deletion reduces the number of virtual medium error entries in the MDisk table. Migrating the volume to a different managed disk will also delete entries in the MDisk table, but will create more entries on the MDisk table of the MDisk to which the volume is migrated.
Possible Cause-FRUs or other cause:
• None

1620 A storage pool is offline.
Explanation: A storage pool is offline.
1. Repair the faults in the order shown.
2. Start a cluster discovery operation by rescanning the Fibre Channel network.
3. Check managed disk (MDisk) status. If all MDisks show a status of online, mark the error that you have just repaired as fixed. If any MDisks do not show a status of online, go to start MAP. If you return to this step, contact your support center to resolve the problem with the disk controller.
4. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other:
• Fibre Channel network fabric fault (50%)
• Enclosure/controller fault (50%)

1623 One or more MDisks on a controller are degraded.
Explanation: At least one MDisk on a controller is degraded because the MDisk is not available through one or more nodes. The MDisk is available through at least one node. Access to data might be lost if another failure occurs.
In a correctly configured system, each node accesses all of the MDisks on a controller through all of the controller's ports.
This error is only logged once per controller.
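The degraded condition described for error 1623 (an MDisk that is available through at least one node but not through all of them) can be expressed as a small classification function. This is a minimal sketch under stated assumptions: the function name, the node names, and the labels "fully connected" and "inaccessible" are illustrative and do not correspond to product status values.

```python
def classify_mdisk(visible_nodes, all_nodes):
    """Classify one MDisk by the set of nodes that can currently reach it.

    visible_nodes: names of the nodes that report a path to the MDisk
    all_nodes:     names of every node in the system
    """
    all_nodes = set(all_nodes)
    visible = set(visible_nodes) & all_nodes
    if not visible:
        return "inaccessible"     # illustrative label: no node can reach it
    if visible == all_nodes:
        return "fully connected"  # illustrative label: every node can reach it
    return "degraded"             # reachable from some nodes, but not all


nodes = {"node1", "node2"}
print(classify_mdisk({"node1"}, nodes))
```

In this model a degraded MDisk still serves data, which matches the warning that access to data might be lost only if another failure occurs.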
There might be more than one MDisk on this controller that has been configured incorrectly, but the error is only logged for one MDisk.
To prevent this error from being logged because of short-term fabric maintenance activities, this error condition must have existed for one hour before the error is logged.
1. Determine which MDisks are degraded. Look for MDisks with a path count lower than the number of nodes. Do not use only the MDisk status, since other errors can also cause degraded MDisks.

2. Ensure that the controller is zoned correctly with all of the nodes.
3. Ensure that the logical unit is mapped to all of the nodes.
4. Ensure that the logical unit is mapped to all of the nodes using the same LUN.
5. Run the console or CLI command to discover MDisks and ensure that the command completes.
6. Mark the error that you have just repaired as fixed. When you mark the error as fixed, the controller's MDisk availability is tested and the error will be logged again immediately if the error persists for any MDisks. It is possible that the new error will report a different MDisk.
7. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other:
• Fibre Channel network fabric fault (50%)
• Enclosure/controller fault (50%)

1624 Controller configuration has unsupported RDAC mode.
Explanation: The cluster has detected that an IBM DS series disk controller's configuration is not supported by the cluster. The disk controller is operating in RDAC mode. The disk controller might appear to be operating with the cluster; however, the configuration is unsupported because it is known to not work with the cluster.
1. Using the IBM DS series console, ensure that the host type is set to 'IBM TS SAN VCE' and that the AVT option is enabled. (The AVT and RDAC options are mutually exclusive.)
2. Mark the error that you have just repaired as fixed. If the problem has not been fixed it will be logged again; this could take a few minutes.
3. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other:
• Enclosure/controller fault

1625 Incorrect disk controller configuration.
Explanation: While running an MDisk discovery, the cluster has detected that a disk controller's configuration is not supported by the cluster. The disk controller might appear to be operating with the cluster; however, the configuration detected can potentially cause issues and should not be used. The unsupported configuration is shown in the event data.
1. Use the event data to determine changes required on the disk controller and reconfigure the disk controller to use a supported configuration.
2. Mark the error that you have just repaired as fixed. If the problem has not been fixed it will be logged again by the managed disk discovery that automatically runs at this time; this could take a few minutes.
3. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other:
• Enclosure/controller fault

1627 The cluster has insufficient redundancy in its controller connectivity.
Explanation: The cluster has detected that it does not have sufficient redundancy in its connections to the disk controllers. This means that another failure in the SAN could result in loss of access to the application data. The cluster SAN environment should have redundant connections to every disk controller. This redundancy allows for continued operation when there is a failure in one of the SAN components.
To provide recommended redundancy, a cluster should be configured so that:
• each node can access each disk controller through two or more different initiator ports on the node.
• each node can access each disk controller through two or more different controller target ports. Note: Some disk controllers only provide a single target port.
• each node can access each disk controller target port through at least one initiator port on the node.
If there are no higher-priority errors being reported, this error usually indicates a problem with the SAN design, a problem with the SAN zoning or a problem with the disk controller.
If there are unfixed higher-priority errors that relate to the SAN or to disk controllers, those errors should be fixed before resolving this error because they might indicate the reason for the lack of redundancy. Error codes that must be fixed first are:
• 1210 Local FC port excluded
• 1230 Login has been excluded
Note: This error can be reported if the required action, to rescan the Fibre Channel network for new MDisks,

has not been performed after a deliberate reconfiguration of a disk controller or after SAN rezoning.
The 1627 error code is reported for a number of different error IDs. The error ID indicates the area where there is a lack of redundancy. The data reported in an event log entry indicates where the condition was found.
The meaning of the error IDs is shown below. For each error ID the most likely reason for the condition is given. If the problem is not found in the suggested areas, check the configuration and state of all of the SAN components (switches, controllers, disks, cables and cluster) to determine where there is a single point of failure.

A disk controller is only accessible from a single node port.
• A node has detected that it only has a connection to the disk controller through exactly one initiator port, and more than one initiator port is operational.
• The error data indicates the device WWNN and the WWPN of the connected port.
• A zoning issue or a Fibre Channel connection hardware fault might cause this condition.

A disk controller is only accessible from a single port on the controller.
• A node has detected that it is only connected to exactly one target port on a disk controller, and more than one target port connection is expected.
• The error data indicates the WWPN of the disk controller port that is connected.
• A zoning issue or a Fibre Channel connection hardware fault might cause this condition.

Only a single port on a disk controller is accessible from every node in the cluster.
• Only a single port on a disk controller is accessible to every node when there are multiple ports on the controller that could be connected.
• The error data indicates the WWPN of the disk controller port that is connected.
• A zoning issue or a Fibre Channel connection hardware fault might cause this condition.

A disk controller is accessible through only half, or less, of the previously configured controller ports.
• Although there might still be multiple ports that are accessible on the disk controller, a hardware component of the controller might have failed or one of the SAN fabrics has failed such that the operational system configuration has been reduced to a single point of failure.
• The error data indicates a port on the disk controller that is still connected, and also lists controller ports that are expected but that are not connected.
• A disk controller issue, switch hardware issue, zoning issue or cable fault might cause this condition.

A disk controller is not accessible from a node.
• A node has detected that it has no access to a disk controller. The controller is still accessible from the partner node in the I/O group, so its data is still accessible to the host applications.
• The error data indicates the WWPN of the missing disk controller.
• A zoning issue or a cabling error might cause this condition.

A disk controller is not accessible from a node allowed to access the device by site policy
• A disk controller is not accessible from a node that is allowed to access the device by site policy. If a disk controller has multiple WWNNs, the disk controller may still be accessible to the node through one of the other WWNNs.
• The error data indicates the WWNN of the inaccessible disk controller.
• A zoning issue or a Fibre Channel connection hardware fault might cause this condition.

1. Check the error ID and data for a more detailed description of the error.
2. Determine if there has been an intentional change to the SAN zoning or to a disk controller configuration that reduces the cluster's access to the indicated disk controller. If either action has occurred, continue with step 8.
3. Use the GUI or the CLI command lsfabric to ensure that all disk controller WWPNs are reported as expected.
4. Ensure that all disk controller WWPNs are zoned appropriately for use by the cluster.
5. Check for any unfixed errors on the disk controllers.
6. Ensure that all of the Fibre Channel cables are connected to the correct ports at each end.
7. Check for failures in the Fibre Channel cables and connectors.
8. When you have resolved the issues, use the GUI or the CLI command detectmdisk to rescan the Fibre Channel network for changes to the MDisks.
   Note: Do not attempt to detect MDisks unless you are sure that all problems have been fixed. Detecting MDisks prematurely might mask an issue.

9. Mark the error that you have just repaired as fixed. The cluster will revalidate the redundancy and will report another error if there is still not sufficient redundancy.
10. Go to MAP 5700: Repair verification.
Possible Cause-FRUs or other cause:
• None

1630 The number of device logins was reduced.
Explanation: The number of port to port fabric connections, or logins, between the node and a storage controller has decreased. This situation might be caused by a problem on the SAN or by a deliberate reconfiguration of the SAN.
The 1630 error code is reported for a number of different error IDs. The error ID indicates more specifics about the problem. The data reported in an event log entry indicates where the condition was found.

Number of Device paths from the controller site allowed accessible nodes has reduced
• The controller now has fewer logins from the controller site that allocated accessible nodes to the storage controller.
• The error data indicates the WWNN or IP address of the disk controller, and the current path count from each node.
• A controller fault or a Fibre Channel network fabric fault might cause this condition.

1. Check the error in the cluster event log to identify the object ID associated with the error.
2. Check the availability of the failing device using the following command line: lscontroller object_id. If the command fails with the message "CMMVC6014E The command failed because the requested object is either unavailable or does not exist", ask the customer if this device was removed from the system.
   • If yes, mark the error as fixed in the cluster event log and continue with the repair verification MAP.
   • If no or if the command lists details of the failing controller, continue with the next step.
3. Check whether the device has regained connectivity. If it has not, check the cable connection to the remote-device port.
4. If all attempts to log in to a remote-device port have failed and you cannot solve the problem by changing cables, check the condition of the remote-device port and the condition of the remote device.
5. Start a cluster discovery operation by rescanning the Fibre Channel network.
6. Check the status of the disk controller. If all disk controllers show a good status, mark the error that you have just repaired as fixed. If any disk controllers do not show a good status, go to the start MAP. If you return to this step, contact the support center to resolve the problem with the disk controller.
7. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other:
• Fibre Channel network fabric fault (50%)
• Enclosure/controller fault (50%)

1656 Cloud account not available, encryption setting mismatch
Explanation: The system encountered a mismatch between cloud object storage and cluster encryption state. Cloud backup services remain unavailable until this alert is fixed. The associated alert code gives more information.

Cloud account not available, cloud object storage encrypted
The cloud object data is encrypted and the cluster cloud account is not configured with encryption enabled.

Cloud account not available, cloud object storage not encrypted
The cloud data is not encrypted and the cluster cloud account is configured with encryption enabled.

Ensure that you specified the correct cloud account. If not, retry the command with the correct account. You cannot change the encryption setting for the cloud account. If the specified cloud account is correct, you must delete the account by using the rmcloudaccount command and re-create the account by using the mkcloudaccount command, this time with an encryption setting that matches the setting for the cloud data.

Cloud account not available, cloud object storage encrypted with the wrong key
Explanation: The master key that is associated with the cloud data does not match the cluster master key that was used when the cluster cloud account was created. Cloud backup services remain unavailable until this alert is fixed.
The error code is associated with the following alert event:

Cloud account not available, cloud object storage encrypted with the wrong key
Complete the following steps:
1. Make the correct master key available in one of the following ways:
   • Insert a USB drive that contains the key.
   • Ensure that the system is attached to a Network Key Server that contains the key.
2. Run the testcloudaccount command. If the command completes with good status, mark the error as fixed.
3. If the command does not complete with good status, contact your service support representative.

The initialization of the managed disk has failed.
Explanation: The initialization of the managed disk has failed.
1. View the event log entry to identify the managed disk (MDisk) that was being accessed when the problem was detected.
2. Perform the disk controller problem determination and repair procedures for the MDisk identified in step 1.
3. Include the MDisk into the cluster.
4. Check the managed disk status. If all managed disks show a status of online, mark the error that you have just repaired as fixed. If any managed disks do not show a status of online, go to the start MAP. If you return to this step, contact your support center to resolve the problem with the disk controller.
5. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other: Enclosure/controller fault (100%)

1670 The CMOS battery on the system board failed.
Explanation: The CMOS battery on the system board failed.
Replace the node until the FRU is available.
Possible Cause-FRUs or other cause:
CMOS battery (100%)

1680 Drive fault type 1
Explanation: Drive fault type 1
Replace the drive.
Possible Cause-FRUs or other cause:
Drive (95%)
Canister (3%)
Midplane (2%)

1684 Drive is missing.
Explanation: Drive is missing.
Install the missing drive. The drive is typically a data drive that was previously part of the array.
Possible Cause-FRUs or other cause:
Drive (100%)

1686 Drive fault type 3.
Explanation: Drive fault type 3.
Complete the following steps to resolve this problem.
1. Reset the drive.
2. Replace the drive.
3. Replace the canister as identified in the sense data.
4. Replace the enclosure.
Note: The removal of the exclusion on the drive slot will happen automatically, but only after this error has been marked as fixed.
Possible Cause-FRUs or other cause:
• Drive (46%)
• Canister (46%)
• Enclosure (8%)

1689 Array MDisk has lost redundancy.
Explanation: Array MDisk has lost redundancy. The RAID 5 system is missing a data drive.
Replace the missing or failed drive.
Possible Cause-FRUs or other cause:
Drives removed or failed (100%)

No spare protection exists for one or more array MDisks.
Explanation: The system spare pool cannot immediately provide a spare of any suitability to one or more arrays.

1. Configure an array but no spares.
2. Configure many arrays and a single spare. Cause that spare to be consumed or change its use.
For a distributed array, unused or candidate drives are converted into array members.
1. Decode/explain the number of rebuild areas available and the threshold set.
2. Check for unfixed higher priority errors.
3. Check for unused and candidate drives that are suitable for the distributed array. Run the lsarraymembergoals command to determine drive suitability by using tech_type, capacity, and rpm information.
   • Offer to add the drives into the array. Allow up to the number of missing array members to be added.
   • Recheck after array members are added.
4. If no drives are available, explain that drives need to be added to restore the wanted number of rebuild areas.
   • If the threshold is greater than the number of rebuild areas available, and the threshold is greater than 1, offer to reduce the threshold to the number of drives that are available.

A background scrub process has found an inconsistency between data and parity on the array.
Explanation: The array has at least one stride where the data and parity do not match. RAID has found an inconsistency between the data stored on the drives and the parity information. This could either mean that the data has been corrupted, or that the parity information has been corrupted.
Follow the directed maintenance procedure for inconsistent arrays.

Array MDisk has taken a spare member that does not match array goals.
Explanation:
1. A member of the array MDisk either has technology or capability that does not match exactly with the established goals of the array.
2. The array is configured to want location matches, and the drive location does not match all the location goals.
The error will fix itself automatically as soon as the rebuild or exchange is queued up. It does not wait until the array is showing balanced = exact (which indicates that all populated members have exact capability match and exact location match).

Drive exchange required.
Explanation: Drive exchange required.
Complete the following steps to resolve this problem.
1. Exchange the failed drive.
Possible Cause-FRUs or other cause:
• Drive (100%)

1695 Persistent unsupported disk controller configuration.
Explanation: A disk controller configuration that might prevent failover for the cluster has persisted for more than four hours. The problem was originally logged through an earlier event and service error code.
1. Fix any higher priority error. In particular, follow the service actions to fix the 1625 error indicated by this error's root event. This error will be marked as fixed when the root event is marked as fixed.
2. If the root event cannot be found, or is marked as fixed, perform an MDisk discovery and mark this error as fixed.
3. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other:
• Enclosure/controller fault

1700 Unrecovered remote copy relationship
Explanation: This error might be reported after the recovery action for a clustered system failure or a complete I/O group failure. The error is reported because some remote copy relationships, whose control data is stored by the I/O group, could not be recovered.
To fix this error it is necessary to delete all of the relationships that might not be recovered, and then re-create the relationships.
1. Note the I/O group index against which the error is logged.
2. List all of the relationships that have either a master or an auxiliary volume in this I/O group. Use the volume view to determine which volumes in the I/O group you noted have a relationship that is defined.
3. Note the details of the relationships that are listed so that they can be re-created. If the affected I/O group has active-active relationships that are in a consistency group, run the command chrcrelationship -noconsistgrp

rc_rel_name for each active-active relationship that was not recovered. Then, use the command lsrcrelationship in case volume labels are changed and to see the value of the primary attributes.
4. Delete all of the relationships that are listed in step 2, except any active-active relationship that has host applications that use the auxiliary volume via the master volume unique ID (that is, the primary attribute value is auxiliary in the output from lsrcrelationship). For the active-active relationships that have the primary attribute value of auxiliary, use the rmvolumecopy CLI command (which also deletes the relationship). For example, rmvolumecopy master_volume_id/name.
   Note: The error is automatically marked as fixed once the last relationship on the I/O group is deleted. New relationships must not be created until the error is fixed.
5. Re-create all the relationships that you deleted by using the details noted in step 3.
   Note: For Metro Mirror and Global Mirror relationships, you are able to delete a relationship from either the master or auxiliary system; however, you must re-create the relationship on the master system. Therefore, it might be necessary to go to another system to complete this service action.
Possible Cause-FRUs or other cause:
• None

1710 There are too many cluster partnerships. The number of cluster partnerships has been reduced.
Explanation: A cluster can have a Metro Mirror and Global Mirror cluster partnership with one or more other clusters. Partnership sets consist of clusters that are either in direct partnership with each other or are in indirect partnership by having a partnership with the same intermediate cluster. The topology of the partnership set is not fixed; the topology might be a star, a loop, a chain or a mesh. The maximum supported number of clusters in a partnership set is four. A cluster is a member of a partnership set if it has a partnership with another cluster in the set, regardless of whether that partnership has any defined consistency groups or relationships.
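The partnership-set rule can be checked mechanically by treating clusters as graph nodes and partnerships as edges: each connected group of clusters is one partnership set, and no set may contain more than four clusters. The following is a minimal sketch of that idea, not the system's own algorithm; the function names and the single-letter cluster names are placeholders chosen for illustration.

```python
from collections import defaultdict

MAX_CLUSTERS_PER_SET = 4  # maximum supported clusters in one partnership set


def partnership_sets(partnerships):
    """Group clusters into partnership sets (connected components).

    partnerships: iterable of (cluster, cluster) pairs, one per partnership.
    Clusters with no partnership never appear and so belong to no set.
    """
    neighbours = defaultdict(set)
    for a, b in partnerships:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    sets = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk over direct and indirect partners of 'start'.
        component = set()
        stack = [start]
        while stack:
            cluster = stack.pop()
            if cluster in component:
                continue
            component.add(cluster)
            stack.extend(neighbours[cluster] - component)
        seen |= component
        sets.append(component)
    return sets


def oversized_sets(partnerships):
    """Return the partnership sets that exceed the supported maximum."""
    return [s for s in partnership_sets(partnerships)
            if len(s) > MAX_CLUSTERS_PER_SET]


chain = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
print(oversized_sets(chain))
```

Because membership depends only on connectivity, a star, loop, chain, or mesh of the same clusters all produce the same set, which is why topology does not matter to the limit.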
These are examples of valid partnership sets for five unique clusters labelled A, B, C, D, and E where a partnership is indicated by a dash between two cluster names:
• A-B, A-C, A-D. E has no partnerships defined and therefore is not a member of the set.
• A-B, A-D, B-C, C-D. E has no partnerships defined and therefore is not a member of the set.
• A-B, B-C, C-D. E has no partnerships defined and therefore is not a member of the set.
• A-B, A-C, A-D, B-C, B-D, C-D. E has no partnerships defined and therefore is not a member of the set.
• A-B, A-C, B-C. D-E. There are two partnership sets. One contains clusters A, B, and C. The other contains clusters D and E.
These are examples of unsupported configurations because the number of clusters in the set is five, which exceeds the supported maximum of four clusters:
• A-B, A-C, A-D, A-E.
• A-B, A-D, B-C, C-D, C-E.
• A-B, B-C, C-D, D-E.
The cluster prevents you from creating a new Metro Mirror and Global Mirror cluster partnership if a resulting partnership set would exceed the maximum of four clusters. However, if you restore a broken link between two clusters that have a partnership, the number of clusters in the set might exceed four. If this occurs, Metro Mirror and Global Mirror cluster partnerships are excluded from the set until only four clusters remain in the set. A cluster partnership that is excluded from a set has all of its Metro Mirror and Global Mirror cluster partnerships excluded.
Event ID 0x is reported if the cluster is retained in the partnership set. Event ID 0x is reported if the cluster is excluded from the partnership set. All clusters that were in the partnership set report the error.
All inter-cluster Metro Mirror or Global Mirror relationships that involve an excluded cluster will lose connectivity. If any of these relationships are in the consistent_synchronized state and they receive a write I/O, they will stop with an error code.
To fix this error it is necessary to delete all of the relationships that could not be recovered and then re-create the relationships.
1. Determine which clusters are still connected and members of the partnership set, and which clusters have been excluded.
2. Determine the Metro Mirror and Global Mirror relationships that exist on those clusters.
3. Determine which of the Metro Mirror and Global Mirror relationships you want to maintain, which determines which cluster partnerships you want to maintain. Ensure that the partnership set or sets that would result from configuring the cluster partnerships that you want contain no more than four clusters in each set.
   NOTE: The reduced partnership set created by the cluster might not contain the clusters that you want in the set.
4. Remove all of the Metro Mirror and Global Mirror relationships that you do not want to retain.

5. Remove all of the Metro Mirror and Global Mirror cluster partnerships that you do not want to retain.
6. Restart all relationships and consistency groups that were stopped.
7. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None

1720 Metro Mirror (remote copy) - Relationship has stopped and lost synchronization, for a reason other than a persistent I/O error (LSYNC)
Explanation: A remote copy relationship or consistency group needs to be restarted. In a Metro Mirror (remote copy) or Global Mirror operation, the relationship has stopped and lost synchronization, for a reason other than a persistent I/O error.
The administrator must examine the state of the system to validate that everything is online to allow a restart to work. Examining the state of the system also requires checking the partner Fibre Channel (FC) port masks on both clusters.
1. If the partner FC port mask was changed recently, check that the correct mask was selected.
2. Perform whatever steps are needed to maintain a consistent secondary volume, if desired.
3. The administrator must issue a start command.
Possible Cause-FRUs or other cause:
• None

1740 Recovery encryption key not available.
Explanation: Recovery encryption key is not available.
Make the recovery encryption key available.
1. If the key is not available:
   • Install a USB drive with the encryption key.
   • Ensure the correct file is on the USB drive.
2. If the key is not valid:
   • Get a USB drive with a valid key for this MTMS. The key does not have a valid CRC.
Possible Cause-FRUs or other cause:
• No FRU

1741 Flash module is predicted to fail.
Explanation: The Flash module is predicted to fail due to low health (event ID ) or due to an encryption issue (event ID ). In either case, the drive should be replaced.
A replacement drive of the same size is needed to correct this error. If any higher array events exist, correct those first. If no other array events exist, replace the drive.
If the array is RAID5, replace and format the drive.
If the array is RAID0, correcting this issue will result in loss of all data. If the data is needed, do the following:
1. Back up all array data.
2. Replace the drives using the recoverarray format.
3. Restore array data.
If the array data is not needed, replace the drive(s) using the recoverarray format.

Array response time too high.
Explanation: A number of causes can lead to a higher-than-usual array response time.
1. Fix higher priority errors first.
2. Fix any other known errors.
3. Change the array into redundancy mode by using the charray interface.
Environment or configuration issues:
• Volume config 30%
• Slow drive 30%
• Enclosure 20%
• SAS port 20%

1780 Encryption key changes are not committed.
Explanation: Changes were made to the encryption key, but the pending changes were not committed.
A directed maintenance procedure (DMP) was launched to cancel the changes. Press Next to cancel the pending key changes. Launch the GUI to restart the operation.

A problem occurred with the Key Server
Explanation: The meaning of the error code depends on the associated event code. All of these errors involve the key server validation process, which can be triggered by the mkkeyserver, chkeyserver, or testkeyserver commands, or by the regular validation timer.

Key Server reported KMIP error

1785 While key server validation was running, the server reported a nonzero KMIP error code. Because the key server can report a wide range of KMIP error codes, the sense data includes the following additional information about the error:
• KMIP Error Code
• KMIP Result Status
• KMIP Result Reason
• An error string that contains the KMIP Result Message

Key Server reported vendor information error
While key server validation was running, the server reported one of the following conditions:
• Unsupported type of key server
• Unsupported code level on the key server

Failed to connect to Key Server
While key server validation was running, the node was unable to connect to the key server.

Key Server reported misconfigured primary
An SKLM key server reported a server type that conflicted with the value defined on the system. The key server reported it is not the primary, but the server is defined to be the primary on the system.

For the event "Key Server reported KMIP error":
1. The key server reported a server-side problem. The sense data of this event includes more details to help pinpoint the problem on the key server. Run the testkeyserver command to determine whether the problem is fixed. The testkeyserver command either automatically fixes the error, or raises the event again.
2. Check that the cluster certificate was accepted on the key server. For more information, search your product documentation for "Certificates that are used for key servers".
3. Ensure that ISKLM has been configured to use TLS v1.2. Failure to do so can cause an SSL connection error.

For the event "Key Server reported vendor information error":
1. The key server reported that it is running an unsupported software version. Verify that you are using the correct key server and that the IP address, port address, and other characteristics are all correct. If not, use the chkeyserver command to change this information. The chkeyserver command automatically starts the validation process to confirm that the error is fixed, and either auto-fixes this event or raises it again.
2. Verify that you are using a supported key server type and version. A list of supported key servers is provided in the documentation. The sense data of this event includes the version information reported by the key server.
   • The minimum supported version of Key Management Interoperability Protocol (KMIP) is 1.3.
   • The supported key server type is ISKLM only.
   • The supported versions of ISKLM are and later.

For the event "Failed to connect to Key Server":
1. Check that a service IP address is configured for all nodes in the cluster (IPv4 if you use IPv4 key servers, IPv6 if you use IPv6 key servers). If not, configure these IP addresses and run the testkeyserver command. If the testkeyserver command is successful, the event is automatically fixed.
2. Confirm that all nodes in the cluster have their Ethernet cable plugged in correctly. If not, plug them in and run the testkeyserver command. If the testkeyserver command is successful, the event is automatically fixed.
3. Confirm that the IP address and IP port of the key server object are correct. If not, change the key server details by using the chkeyserver command. The chkeyserver command automatically starts the validation process to confirm that the error is fixed, and either auto-fixes this event or raises it again.
4. Confirm that any SSL certificates for the key server are valid. Certificates must have correct start and end dates and must be in the PEM format.

For the event "Key Server reported misconfigured primary":
1. Run the lskeyserver command to show the current status of the key servers. One of these servers has the primary field incorrectly set to yes.
2. Determine which server should correctly be designated as primary. Do this on the server side by identifying the IP address and port that points to the real primary server. The primary server has the role of "MASTER" in the replication relationship in SKLM. For more information about this process, refer to your SKLM documentation. If the primary server in the lskeyserver command appears to be correct, contact your service support representative.
3. Otherwise, run the following command:
   chkeyserver -primary server_id
   where server_id is the ID of the correct primary server.
4. The chkeyserver command automatically validates the new primary key server. To fix the event, complete one of the following actions:
   • Manually mark the event as fixed by using the cheventlog -fix command

   • Wait for the periodic validation of the old primary key server
   • Manually validate the old server by using the testkeyserver command
If the problem persists, contact your service support representative.

The SAN has been zoned incorrectly.
Explanation: This has resulted in more than 512 other ports on the SAN logging into one port of a 2145 node.
1. Ask the user to reconfigure the SAN.
2. Mark the error as fixed.
3. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other:
• Fibre Channel (FC) switch configuration error
• FC switch

1801 A node has received too many Fibre Channel logins from another node.
Explanation: This event was logged because the node has received more than sixteen Fibre Channel logins originating from another node. This indicates that the Fibre Channel storage area network that connects the two nodes is not correctly configured.
Data:
• None
Change the zoning and/or Fibre Channel port masking so that no more than 16 logins are possible between a pair of nodes. See Non-critical node error 888 on page 176 for details. Use the lsfabric command to view the current number of logins between nodes.
Possible Cause-FRUs or other cause:
• None

1802 Fibre Channel network settings
Explanation: Fibre Channel network settings
Follow these troubleshooting steps to reduce the number of hosts that are logged in to the port:
1. Increase the granularity of the switch zoning to reduce unnecessary host port logins.
2. Change switch zoning to spread out host ports across other available ports.
3. Use interfaces with more ports, if not already at the maximum.
4. Scale out by using another FlashSystem enclosure.
Possible Cause-FRUs or other cause:
• No FRU

1804 IB network settings
Explanation: IB network settings
Follow these troubleshooting steps to reduce the number of hosts that are logged in to the port:
1. Increase the granularity of the switch zoning to reduce unnecessary host port logins.
2. Change switch zoning to spread out host ports across other available ports.
3. Use interfaces with more ports, if not already at the maximum.
4. Scale out by using another FlashSystem enclosure.
Possible Cause-FRUs or other cause:
• No FRU

1810 The bare metal server which runs SV_Cloud lost 1 power supply
Explanation: One of the two power supplies for the bare metal server that runs the IBM Spectrum Virtualize for Public Cloud software is not functioning.
If the other power supply fails, you might lose the contents of the volume cache. To prevent this problem, complete one of the following actions:
• Turn off the IBM Spectrum Virtualize for Public Cloud software on the bare metal server. This forces the volumes in that I/O group to run in write-through mode, so no customer data is cached on the server. When the software stops, the cache is flushed to backend storage.
• Use the chvdisk command to disable the cache for each volume in the I/O group. No customer data will be cached, so no data is lost if the second power supply fails.

Node IP missing
Explanation: No IP addresses were found for a node in the system.
Complete the following steps:
1. Run the sainfo lsnodeip command to determine the port that has no IP addresses.
2. Run the satask chnodeip command to set node IP addresses. Configure at least two node IP addresses.
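The login limit quoted for event 1801 above (no more than 16 logins between a pair of nodes) can be checked offline once the lsfabric output has been captured and parsed. The following is a minimal sketch only: it assumes the output was already converted into rows, and the dictionary keys local_node and remote_node are hypothetical names chosen for illustration, not the column names that lsfabric prints.

```python
from collections import Counter

# Limit quoted in the text for event 1801.
MAX_NODE_TO_NODE_LOGINS = 16


def count_logins(rows):
    """Count logins for each (local node, remote node) pair.

    rows is an iterable of dicts; the keys 'local_node' and 'remote_node'
    are hypothetical and must be mapped from the real command output.
    """
    return Counter((row["local_node"], row["remote_node"]) for row in rows)


def pairs_over_limit(rows, limit=MAX_NODE_TO_NODE_LOGINS):
    """Return only the node pairs whose login count exceeds the limit."""
    return {pair: n for pair, n in count_logins(rows).items() if n > limit}
```

Any pair that this helper returns is a candidate for the zoning or port-masking change that the event describes.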

The connection between one pair of nodes is disconnected.
Explanation: A node is disconnected.
Complete the following steps:
1. Run the lseventlog sequence_number command and note the values for the following attributes:
   reporting_node_id
      The ID for the node that reported the error.
   sense
      Among the other sense data, locate the destination_ip, which is the IP address of the disconnected node.
   object_id
      The port ID for the connection.
2. Run the following command:
   sainfo lsnodeip
   Note the node IP address, which is in the same row with the port ID from the previous step.
3. As superuser, ping the disconnected node from the reporting node:
   ping -srcip4 --reporting_ip destination_ip
4. If the ping is successful, contact your support representative. If the ping fails, look for an issue with the network or with the IP configuration.

Node identity changed
Explanation: The ID of the node was changed.
Consult logs and the history of operations for the system to see if a valid reason exists for the change. If not, investigate the possibility of a security breach. You might want to change the backend storage passwords.

The managed disk has bad blocks.
Explanation: These are "virtual" medium errors which are created when copying a volume where the source has medium errors. During data moves or duplication, such as during a flash copy, an attempt is made to move medium errors; to achieve this, virtual medium errors called bad blocks are created. Once a bad block has been created, no attempt will be made to read the underlying data, as there is no guarantee that the old data still exists once the bad block is created. Therefore, it is possible to have bad blocks, and thus medium errors, reported on a target volume, without medium errors actually existing on the underlying storage. The bad block records are removed when the data is overwritten by a host.
Note: On an external controller, this error can only result from a copied medium error.
1. The support center will direct the user to restore the data on the affected volumes.
2. When the volume data has been restored, or the user has chosen not to restore the data, mark the error as fixed.
3. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None

1850 Compressed volume copy has bad blocks
Explanation: A system recovery operation was performed, but data on one or more volumes was not recovered; this is normally caused by a combination of hardware faults. If data containing a medium error is copied or migrated to another volume, bad blocks will be recorded. If a host attempts to read the data in any of the bad block regions, the read will fail with a medium error.
1. The support center will direct the user to restore the data on the affected volumes.
2. When the volume data has been restored, or the user has chosen not to restore the data, mark the error as fixed.
3. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None

1860 Thin-provisioned volume copy offline because of failed repair.
Explanation: The attempt to repair the metadata of a thin-provisioned volume that describes the disk contents has failed because of problems with the automatically maintained backup copy of this data. The error event data describes the problem.
Delete the thin-provisioned volume and reconstruct a new one from a backup or mirror copy. Mark the error as fixed. Also mark the original 1862 error as fixed.
Possible Cause-FRUs or other cause:
• None

1862 Thin-provisioned volume copy offline because of corrupt metadata.
Explanation: A thin-provisioned volume has been taken offline because there is an inconsistency in the cluster metadata that describes the disk contents. This might occur because of corruption of data on the physical disk (e.g., medium error or data miscompare), the loss of cached metadata (because of a cluster recovery) or because of a software error. The event data

gives information on the reason.
The cluster maintains backup copies of the metadata and it might be possible to repair the thin-provisioned volume using this data.
The cluster is able to repair the inconsistency in some circumstances. Run the repair volume option to start the repair process. This repair process, however, can take some time. In some situations it might be more appropriate to delete the thin-provisioned volume and reconstruct a new one from a backup or mirror copy.
If you run the repair procedure and it completes, this error is automatically marked as fixed; otherwise, another error event (error code 1860) is logged to indicate that the repair action has failed.
Possible Cause-FRUs or other cause:
• None

1864 Compressed volume size limitation breached, diagnosis required
Explanation: The system indicates that the virtual or real capacity of at least one compressed volume exceeds the system limits.
For information about how to deal with this issue, see docview.wss?uid=ssg1s

Thin-provisioned volume copy offline because of insufficient space.
Explanation: A thin-provisioned volume has been taken offline because there is insufficient allocated real capacity available on the volume for the used space to increase further. If the thin-provisioned volume is auto-expand enabled, then the storage pool it is in also has no free space.
The service action differs depending on whether the thin-provisioned volume copy is auto-expand enabled or not. Whether the disk is auto-expand enabled or not is indicated in the error event data.
If the volume copy is auto-expand enabled, perform one or more of the following actions. When you have performed all of the actions that you intend to perform, mark the error as fixed; the volume copy will then return online.
• Determine why the storage pool free space has been depleted. Any of the thin-provisioned volume copies, with auto-expand enabled, in this storage pool might have expanded at an unexpected rate; this could indicate an application error. New volume copies might have been created in, or migrated to, the storage pool.
• Increase the capacity of the storage pool that is associated with the thin-provisioned volume copy by adding more MDisks to the storage pool.
• Provide some free capacity in the storage pool by reducing the used space. Volume copies that are no longer required can be deleted, the size of volume copies can be reduced or volume copies can be migrated to a different storage pool.
• Migrate the thin-provisioned volume copy to a storage pool that has sufficient unused capacity.
• Consider reducing the value of the storage pool warning threshold to give more time to allocate extra space.
If the volume copy is not auto-expand enabled, perform one or more of the following actions. In this case the error will automatically be marked as fixed, and the volume copy will return online when space is available.
• Determine why the thin-provisioned volume copy used space has grown at the rate that it has. There might be an application error.
• Increase the real capacity of the volume copy.
• Enable auto-expand for the thin-provisioned volume copy.
• Consider reducing the value of the thin-provisioned volume copy warning threshold to give more time to allocate more real space.
Possible Cause-FRUs or other cause:
• None

1870 Mirrored volume offline because a hardware read error has occurred.
Explanation: While attempting to maintain the volume mirror, a hardware read error occurred on all of the synchronized volume copies.
The volume copies might be inconsistent, so the volume is now offline.
• Fix all higher priority errors. In particular, fix any read errors that are listed in the sense data. This error event will automatically be fixed when the root event is marked as fixed.
• If you cannot fix the root error, but the read errors on some of the volume copies have been fixed, mark this error as fixed to run without the mirror. You can then delete the volume copy that cannot read data and re-create it on different MDisks.
Possible Cause-FRUs or other cause:
• None
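For the thin-provisioned out-of-space event described above, the advice to lower a warning threshold is a trade of earlier warnings for more reaction time. The arithmetic can be sketched as follows. This is an illustration only: the function names, the GiB units, and the constant growth rate are assumptions, and real workloads rarely grow at a steady rate.

```python
def hours_until_full(capacity_gib, used_gib, growth_gib_per_hour):
    """Hours until used space reaches capacity at a constant growth rate."""
    if growth_gib_per_hour <= 0:
        raise ValueError("growth rate must be positive")
    remaining = max(capacity_gib - used_gib, 0)
    return remaining / growth_gib_per_hour


def warning_lead_time(capacity_gib, threshold_pct, growth_gib_per_hour):
    """Hours between the warning firing and the space running out.

    The warning is assumed to fire when used space equals
    threshold_pct percent of the capacity.
    """
    used_at_warning = capacity_gib * threshold_pct / 100
    return hours_until_full(capacity_gib, used_at_warning, growth_gib_per_hour)
```

With a hypothetical 1000 GiB of capacity growing by 10 GiB per hour, lowering the threshold from 80% to 60% doubles the time that is available to add capacity.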

Unrecovered FlashCopy mappings
Explanation: This error might be reported after the recovery action for a cluster failure or a complete I/O group failure. The error is reported because some FlashCopies, whose control data is stored by the I/O group, were active at the time of the failure and the current state of the mapping could not be recovered.
To fix this error it is necessary to delete all of the FlashCopy mappings on the I/O group that failed.
1. Note the I/O group index against which the error is logged.
2. List all of the FlashCopy mappings that are using this I/O group for their bitmaps. You should get the detailed view of every possible FlashCopy ID. Note the IDs of the mappings whose IO_group_id matches the ID of the I/O group against which this error is logged.
3. Note the details of the FlashCopy mappings that are listed so that they can be re-created.
4. Delete all of the FlashCopy mappings that are listed.
   Note: The error will automatically be marked as fixed once the last mapping on the I/O group is deleted. New mappings cannot be created until the error is fixed.
5. Using the details noted in step 3, re-create all of the FlashCopy mappings that you just deleted.
Possible Cause-FRUs or other cause:
• None

1900 A FlashCopy, Trigger Prepare command has failed because a cache flush has failed.
Explanation: A FlashCopy, Trigger Prepare command has failed because a cache flush has failed.
1. Correct higher priority errors, and then try the Trigger Prepare command again.
2. Mark the error that you have just repaired as fixed.
3. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
Other: Cache flush error (100%)

1910 A FlashCopy mapping task was stopped because of the error that is indicated in the sense data.
Explanation: A stopped FlashCopy might affect the status of other volumes in the same I/O group. Preparing the stopped FlashCopy operations as soon as possible is advised.
1. Correct higher priority errors, and then prepare and start the FlashCopy task again.
2. Mark the error that you have just repaired as fixed.
3. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None

1920 Global and Metro Mirror persistent error.
Explanation: This error might be caused by a problem on the primary system, a problem on the secondary system, or a problem on the intersystem link. The problem might be a failure of a component, a component becoming unavailable or having reduced performance because of a service action, or it might be that the performance of a component dropped to a level where the Metro Mirror or Global Mirror relationship cannot be maintained. Alternatively the error might be caused by a change in the performance requirements of the applications that are using Metro Mirror or Global Mirror.
This error is reported on the primary system when the copy relationship has not progressed sufficiently over a period. Therefore, if the relationship is restarted before all of the problems are fixed, the error might be reported again when the time period next expires (the default period is 5 minutes).
This error might also be reported because the primary system encountered read errors.
You might need to refer to the Copy Services features information in the software installation and configuration documentation while you diagnose this error.
1. If the 1920 error occurred previously on Metro Mirror or Global Mirror between the same systems and all the following actions were attempted, contact your product support center to resolve the problem.
2. On both systems, check the partner Fibre Channel port mask to ensure that sufficient connectivity is available. If the partner Fibre Channel port mask was changed recently, ensure that the mask is correct.

3. On the primary system that is reporting the error, correct any higher priority errors.
4. On the secondary system, review the maintenance logs to determine whether the system was operating with reduced capability at the time the error was reported. The reduced capability might be because of a software upgrade, hardware maintenance to a node, maintenance to a backend disk system or maintenance to the SAN.
5. On the secondary system, correct any errors that are not fixed.
6. On the intersystem link, review the logs of each link component for any incidents that would cause reduced capability at the time of the error. Ensure that the problems are fixed.
7. If a reason for the error was found and corrected, go to Action 11.
8. On the primary system that is reporting the error, examine the statistics by using a SAN productivity monitoring tool and confirm that all the Metro Mirror and Global Mirror requirements that are described in the planning documentation are met. Ensure that any changes to the applications that use Metro Mirror or Global Mirror are accounted for. Resolve any issues.
9. On the secondary system, examine the statistics by using a SAN productivity monitoring tool and confirm that all the Metro Mirror and Global Mirror requirements that are described in the software installation and configuration documentation are met. Resolve any issues.
10. On the intersystem link, examine the performance of each component by using an appropriate SAN productivity monitoring tool to ensure that they are operating as expected. Resolve any issues.
11. Mark the error as fixed and restart the Metro Mirror or Global Mirror relationship.
When you restart the Metro Mirror or Global Mirror relationship, there is an initial period during which Metro Mirror or Global Mirror performs a background copy to resynchronize the volume data on the primary and secondary systems. During this period, the data on the Metro Mirror or Global Mirror auxiliary volumes on the secondary system is inconsistent and the volumes cannot be used as backup disks by your applications.
Note: To ensure that the system has the capacity to handle the background copy load, you might want to delay restarting the Metro Mirror or Global Mirror relationship until there is a quiet period when the secondary system and the SAN fabric (including the intersystem link) have the required capacity. If the required capacity is not available, you might experience another 1920 error and the Metro Mirror or Global Mirror relationship stops in an inconsistent state.
Note: If the Metro Mirror or Global Mirror relationship stopped in a consistent state ("consistent-stopped"), it is possible to use the data on the Metro Mirror or Global Mirror auxiliary volumes on the secondary system as backup disks by your applications. Therefore, you might want to start a FlashCopy of your Metro Mirror or Global Mirror auxiliary disks on the secondary system before you restart the Metro Mirror or Global Mirror relationship. This means that you maintain the current, consistent, image until the time when the Metro Mirror or Global Mirror relationship is again synchronized and in a consistent state.
Possible Cause-FRUs or other cause:
• None
Other:
• Primary system or SAN fabric problem (10%)
• Primary system or SAN fabric configuration (10%)
• Secondary system or SAN fabric problem (15%)
• Secondary system or SAN fabric configuration (25%)
• Intersystem link problem (15%)
• Intersystem link configuration (25%)

1925 Cached data cannot be destaged.
Explanation: Problem diagnosis is required.
1. Run the directed maintenance procedure to fix all errors of a higher priority. This will allow the cached data to be destaged and the originating event to be marked fixed.
Possible Cause-FRUs or other cause:
• None

1930 Migration suspended.
Explanation: Migration suspended.
1. Ensure that all error codes of a higher priority have already been fixed.
2. Ask the customer to ensure that all storage pools that are the destination of suspended migrate operations have available free extents.
3. Mark this error as fixed. This causes the migrate operation to be restarted. If the restart fails, a new error is logged.
4. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• None
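Before Metro Mirror or Global Mirror partnerships are re-created, the four-cluster limit on a partnership set that is described earlier in this chapter can be checked offline. The sketch below treats clusters as graph nodes and partnerships as edges, so that each partnership set is a connected component. It is an illustration under those assumptions, not the check that the system performs, and the function names are hypothetical.

```python
from collections import defaultdict

# Supported maximum quoted in the text.
MAX_CLUSTERS_PER_SET = 4


def partnership_sets(partnerships):
    """Group clusters into partnership sets (connected components).

    partnerships is an iterable of (cluster, cluster) pairs. Clusters
    that have no partnerships do not appear in any set.
    """
    neighbours = defaultdict(set)
    for a, b in partnerships:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, result = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # iterative depth-first walk of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(component)
    return result


def is_supported(partnerships, limit=MAX_CLUSTERS_PER_SET):
    """True if no partnership set contains more than `limit` clusters."""
    return all(len(s) <= limit for s in partnership_sets(partnerships))
```

For example, the pairs A-B, A-C, B-C, and D-E yield two sets and are supported, while the chain A-B, B-C, C-D, D-E forms a single set of five clusters and is not.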

HyperSwap volume or consistency group has lost synchronization between sites.
Explanation: HyperSwap volume or consistency group has lost synchronization between sites.
Complete the following steps to resolve this problem.
1. Check the event log for any higher priority unfixed errors.
2. HyperSwap volumes will automatically resynchronize when the underlying problem has been resolved.
Possible Cause-FRUs or other cause:
• N/A

2008 A software downgrade has failed.
Explanation: Cluster configuration changes are restricted until the downgrade is completed. The cluster downgrade process waits for user intervention when this error is logged.
The action required to recover from a stalled downgrade depends on the current state of the cluster being downgraded. Call IBM Support for an action plan to resolve this problem.
Possible Cause-FRUs or other cause:
• None
Other: System software (100%)

1950 Unable to mirror medium error.
Explanation: During the synchronization of a mirrored volume copy it was necessary to duplicate the record of a medium error onto the volume copy, creating a virtual medium error. Each managed disk has a table of virtual medium errors. The virtual medium error could not be created because the table is full. The volume copy is in an inconsistent state and has been taken offline.
Three different approaches can be taken to resolving this problem: 1) the source volume copy can be fixed so that it does not contain medium errors, 2) the number of virtual medium errors on the target managed disk can be reduced or 3) the target volume copy can be moved to a managed disk with more free virtual medium error entries.
The managed disk with a full medium error table can be determined from the data of the root event.
Approach 1) - This is the preferred procedure because it restores the source volume copy to a state where all of the data can be read. Use the normal service procedures for fixing a medium error (rewrite block or volume from backup or regenerate the data using local procedures).
Approach 2) - This method can be used if the majority of the virtual medium errors on the target managed disk do not relate to the volume copy. Determine where the virtual medium errors are using the event log events and re-write the block or volume from backup.
Approach 3) - Delete the offline volume copy and create a new one either forcing the use of different MDisks in the storage pool or using a completely different storage pool.
Follow your selection option(s) and then mark the error as fixed.
Possible Cause-FRUs or other cause:
• None

2010 A software update has failed.
Explanation: Cluster configuration changes are restricted until the update is completed or rolled back. The cluster update process waits for user intervention when this error is logged.
The action required to recover from a stalled update depends on the current state of the cluster being updated. Call IBM technical support for an action plan to resolve this problem.
Possible Cause-FRUs or other cause:
• None
Other: System software (100%)

2016 A host port has more than four logins to a node
Explanation: More than 4 logins have been made to at least one host port or WWPN on at least one node. The network might not be zoned correctly.
Complete the following steps. If at any point you need additional assistance, contact your service support representative.
1. Create a list of the problem hosts, WWPNs, and nodes:
   a. Run the svcinfo lsfabric -host command and parse the output into a human-readable format.
   b. Sort by WWPN, then by node.
   c. For any WWPN and node combination that shows more than 4 logins:
      1) Get the host port mask from the mask field of the lshost detailed view.
      2) Ignore any row where the local_port field does not match the appropriate bit in the host port mask.

      3) Make a note of any hosts that still show more than 4 logins after the host port mask is applied.
2. Fix the issue either by changing the zoning or by changing the host port mask.
3. The event will auto-fix when all of the host ports have login counts of 4 or less on every node.

IP Remote Copy link unavailable.
Explanation: IP Remote Copy link is unavailable.
Fix the remote IP link so that traffic can flow correctly. Once the connection is made, the error will auto-correct.

Partner cluster IP address unreachable.
Explanation: Partner cluster IP address unreachable.
1. Verify the system IP address of the remote system forming the partnership.
2. Check if the remote cluster IP address is reachable from the local cluster. The following can be done to verify accessibility:
   a. Use svctask to ping the remote cluster IP address. If the ping works, there may be a block on the specific port traffic that needs to be opened in the network. If the ping does not work, there may be no route between the systems. Check the IP gateway configuration on the system nodes and the IP network configuration.
   b. Check the configuration of the routers and firewall to ensure that TCP/IP port 3620 used for IP partnership is not blocked.
   c. Use the ssh command from another system to attempt to establish a session with the problematic remote cluster IP address to confirm that the remote cluster is operational.

Cannot authenticate with partner cluster.
Explanation: Cannot authenticate with partner cluster.
Verify that the CHAP secret that is set for the partnership by using the mkippartnership or chpartnership CLI matches the remote system CHAP secret that is set by using the chsystem CLI. If they do not match, use the appropriate commands to set the right CHAP secrets.

Unexpected cluster ID for partner cluster.
Explanation: Unexpected cluster ID for partner cluster.
After deleting all relationships and consistency groups, remove the partnership.
This is an unrecoverable error when one of the sites has undergone a T3 recovery and lost all partnership information. Contact IBM support.

Software error.
Explanation: The software has restarted because of a problem in the cluster, on a disk system or on the Fibre Channel fabric.
1. Collect the software dump file(s) generated at the time the error was logged on the cluster.
2. Contact your product support center to investigate and resolve the problem.
3. Ensure that the software is at the latest level on the cluster and on the disk systems.
4. Use the available SAN monitoring tools to check for any problems on the fabric.
5. Mark the error that you have just repaired as fixed.
6. Go to repair verification MAP.
Possible Cause-FRUs or other cause:
• Your support center might indicate a FRU based on their problem analysis (2%)
Other:
• Software (48%)
• Enclosure/controller software (25%)
• Fibre Channel switch or switch configuration (25%)

2031 Cloud gateway service restarted
Explanation: The system detected that an error occurred with the cloud gateway service and the service was restarted.
Try the following actions:
1. Check the IP network. For example, ensure that all network switches report good status.
2. Update the system to the latest code.
3. If the problem persists, contact your service support representative.

Drive has disabled protection information support.
Explanation: An array has been interrupted in the process of establishing data integrity protection information on one or more of its members by initial writes or rebuild writes.
In order to ensure the array is usable, the system has turned off hardware data protection for the member drive.
If many or all the member drives in an array have logged this error, and sufficient storage exists in the pool to migrate the allocated extents, then

the simplest strategy is to delete the array and recreate it once the drive service action has been accomplished. If a small number of drives are affected then it is simplest to remove these drives from the array and service them individually. This option is not possible if the array is currently syncing post recovery.

A software update is required.
Explanation: The software cannot determine the VPD for a FRU. Probably, a new FRU has been installed and the software does not recognize that FRU.
1. If a FRU has been replaced, ensure that the correct replacement part was used. The node VPD indicates which part is not recognized.
2. Ensure that the cluster software is at the latest level.
3. Save dump data with configuration dump and logged data dump.
4. Contact your product support center to resolve the problem.
5. Mark the error that you have just repaired as fixed.
6. Go to repair verification MAP.
• None
Other: System software (100%)

2055 System reboot required.
Explanation: A system restart is required. The software update is not complete.
Restart the system. The system is not available for I/O or systems management during the system reset.

Manual discharge of batteries required.
Explanation: Manual discharge of batteries required.
Use chenclosureslot -battery -slot 1 -recondition on to cause battery calibration.
Drive (100%)

2100 A software error has occurred.
Explanation: One of the V3700 server software components (sshd, crond, or httpd) has failed and reported an error.
1. Ensure that the software is at the latest level on the cluster.
2. Save dump data with configuration dump and logged data dump.
3. Contact your product support center to resolve the problem.
4. Mark the error that you have just repaired as fixed.
5. Go to repair verification MAP.
• None
Other: V3700 software (100%)

2105 Cloud account not available, cannot access cloud object storage
Explanation: The system encountered a problem in trying to read, write, or search for data in the cloud object storage.
Try the following actions:
1. Mark the error as fixed to retry the operation.
2.
Check the cloud provider console for errors, if available.
3. Report the problem to the cloud provider. Include the following information:
• Check the sense data to determine whether the system was attempting to read, write, or search.
• Reconstruct the container name from the container prefix in the cloud account object, and the container suffix in the sense data.
• Check the sense data to learn the BLOB name that the system was working with.

A drive has been detected in an enclosure that does not support that drive.
Explanation: A drive has been detected in an enclosure that does not support that drive.
Remove the drive. If the result is an invalid number of drives, replace the drive with a valid drive.

Performance of external MDisk has changed
Explanation: The system identified a change in the performance category of an external MDisk. A storage device in the external system might have been replaced with a device that has different performance characteristics to the original. The ID of the MDisk is logged in the event (Bytes 5-8 of the sense data). It might be necessary to re-configure the tier of the

MDisk so that EasyTier makes best use of the storage.
Run the fix procedure for this event, which assists you with the following tasks:
1. Run the Detect MDisks task, so that the system determines the current performance category of each MDisk. When the detection task is complete, if performance has reverted, the event is automatically marked as fixed.
2. If the event is not automatically fixed, you can change the tier of the MDisk to the recommended tier shown in the event properties. The recommended tier is logged in the event (Bytes 9-13 of the sense data. A value of 10 hex indicates flash tier, a value of 20 hex indicates enterprise tier).
3. If you choose not to change the tier configuration, mark the event as fixed.

Internal IO error occurred while doing cloud operation.
Explanation: An internal error occurred while the system was trying to create a cloud snapshot or complete a restore operation. More information is provided by the associated alert event:
• Internal read error during cloud snapshot operation
• Internal write error during cloud snapshot operation
Complete the following steps:
1. Fix any unfixed errors on the volume where the error was reported or on the volume that was being restored. To determine the name of the volume that was being restored, use the lsvolumerestoreprogress command.
2. Mark the error as fixed to have the system retry the operation.
3. If the error persists, contact your service support representative.

Cloud account out of space
Explanation: The operation during which the cloud account ran out of space is indicated by the associated event code:
• Cloud account out of space during cloud storage snapshot operation
• Cloud account out of space during cloud snapshot restore commit operation
• Cloud account out of space during cloud snapshot delete operation
The user response is the same in all cases. Contact your cloud service provider to add more cloud storage space.

System SSL certificate has expired.
Explanation: System SSL certificate has expired.
Connections to the GUI, service assistant, and CIMOM are likely to generate security exceptions.
Complete the following steps to resolve this problem.
1. Access the CLI by using ssh.
2. Check that the system time and date are correct. If they are incorrect, the certificate can be incorrectly marked as expired.
3. Create a new self-signed system certificate, or create a certificate request. Get it signed by your certificate authority and install the signed request.
Note: If it takes some time to get a certificate signed, you can also create a self-signed certificate to use while you wait for your request to be signed.
• N/A

2259 Storwize V7000 Gen1 compatibility mode can now be disabled on this system.
Explanation: No more Storwize V7000 Gen1 canisters are attached to the system.
Complete one of the following actions:
• If you want to disable Storwize V7000 Gen1 compatibility mode, enter the following command: chsystem -gen1compatibilitymode no
• If you want to maintain Storwize V7000 Gen1 compatibility mode, you can reattach Storwize V7000 Gen1 canisters to the cluster.

Cloud account not available, SSL certificate problem
Explanation: The cloud account is using SSL ( URL or Amazon) and a problem was found with the certificate. The most likely outcome is that a new certificate must be installed. The exact meaning of the error code depends on the associated event code.

Cloud account not available, no matching CA certificate
The cloud account provider that is associated with the account presented an SSL certificate. The system cannot access a matching root CA (certificate authority) certificate.

Cloud account not available, expired SSL certificate
The SSL certificate that is installed on the system that is associated with the cloud account is expired or is not

yet active. Cloud backup services remain paused until the alert is fixed.
For the "Cloud account not available, no matching CA certificate" event:
• For a private cloud, contact the administrator of the cloud. Request the CA certificate and install it.
• For a public cloud, it is likely that you need to upgrade the software on your node.
For the "Cloud account not available, expired SSL certificate" event:
1. Check the valid_not_before and valid_not_after dates from the alert sense data.
2. Verify that the system time is correct.
3. Complete one of the following actions:
• For a private cloud, contact the administrator of the cloud. Request a new certificate and install it.
• For a public cloud, you might need to update your software license. If your license is correct, contact the administrator of the cloud, request a new certificate, and install it.

No authorization to perform cloud operation
Explanation: The cloud account was configured with credentials (for Amazon, AWS access key; for Swift, user/tenant/password) that are not sufficient to use the cloud storage. The system can log in, but the specified user does not have permission to complete one or more of the following operations:
• Upload data. Required to create a cloud snapshot.
• Create a container in cloud storage. Required to create a cloud snapshot.
• Download data. Required to complete a restore operation.
• Delete data. Required to delete a cloud snapshot.
The error code is associated with the following alert event: Cloud account not available, cannot obtain permission to use cloud storage
Complete the following steps:
1. Use the lscloudaccount command to display cloud account information and verify that everything is correct.
2. Verify that the system time is correct. Some cloud providers are sensitive to time differences.
3. Check the cloud service provider console or contact the cloud administrator to confirm that the correct permissions are in place for the user.
4. Fix the alert to retry the cloud operation.

Cloud account not available, cannot contact cloud provider
Explanation: The system cannot make an IP connection over the management network from the config node to the cloud.
Try the following actions:
1.
Check for higher-priority unfixed errors. The system might be reporting network errors. Fix these errors first, and this alert might then auto-fix.
2. For a SWIFT cloud account, check the endpoint URL. If this URL is changed to one that is working, the event auto-fixes.
3. Use ping or traceroute with the cloud endpoint IP address to try to locate where the connection is being lost. For Amazon Web Services, use s3.amazonaws.com as the endpoint address.

Cloud account not available, cannot communicate with cloud provider
Explanation: The local system can make an IP connection to the server, but the server is not replying properly to cloud storage protocol commands. The most likely problem is a configuration error on the local system, such as an IP address that needs updating after the server changed its IP address. The remaining problems are on the server side. This error is most likely to occur with private cloud installations.
Try the following actions:
1. Check your configuration settings. If you change a setting that results in a valid configuration, the event auto-fixes.
2. Contact the cloud service provider administrator.

Cloud account not available, cloud provider login error
Explanation: A problem was reported with the credentials that were submitted to the cloud account object. For Amazon, the credential is an AWS access key. For SWIFT, the credentials consist of user name, tenant, and password. The meaning of the error code depends on the associated event code.

Cloud account not available, cannot authenticate with cloud provider
The cloud service provider rejected the credentials that are associated with the cloud account. Cloud backup services remain paused until the alert is fixed. For some public cloud providers, including AWS S3, this alert can occur if the system time deviates more than 15 minutes from standard time. This alert can also occur after a full system (T4) recovery if your credentials are lost.

Cloud account not available, cannot obtain permission to use cloud storage
The cloud service provider accepted the credentials that

are associated with the cloud account, but the system is not allowed to run cloud storage operations. Cloud backup services remain paused until the alert is fixed.
For the "Cloud account not available, cannot authenticate with cloud provider" event:
1. Verify that you are using the correct credentials.
2. Verify that the system time is correct.
3. Contact the cloud service provider to see whether your password was changed on the cloud side.
4. Fix the alert to retry the login.
For the "Cloud account not available, cannot obtain permission to use cloud storage" event:
1. Verify that you are using the correct credentials.
2. Contact the cloud service provider to provide sufficient permission for your account.
3. Fix the alert to retry the login.

A secure shell (SSH) session limit for the cluster has been reached.
Explanation: Secure Shell (SSH) sessions are used by applications that manage the cluster. An example of such an application is the command-line interface (CLI). An application must initially log in to the cluster to create an SSH session. The cluster imposes a limit on the number of SSH sessions that can be open at one time. This error indicates that the limit on the number of SSH sessions has been reached and that no more logins can be accepted until a current session logs out.
The limit on the number of SSH sessions is usually reached because multiple users have opened an SSH session but have forgotten to close the SSH session when they are no longer using the application.
• Because this error indicates a problem with the number of sessions that are attempting external access to the cluster, determine the reason that so many SSH sessions have been opened.
• Run the Fix Procedure for this error on the panel at Management GUI Troubleshooting > Recommended Actions to view and manage the open SSH sessions.

Encryption key on USB flash drive removed
Explanation: The USB flash drive in a particular node or port has been removed. This USB flash drive contained a valid encryption key for the system. Unauthorized removal can compromise data security.
If your data has been compromised, perform a rekey operation immediately.

Encryption key error on USB flash drive.
Explanation: It is necessary to provide an encryption key before the system can become fully operational. This error occurs when the encryption key identified is invalid. A file with the correct name was found but the key in the file is corrupt.
Remove the USB flash drive from the port.

2560 Drive write endurance usage rate high
Explanation: Flash drives have a limited write endurance. A high usage rate is leading a drive to failure earlier than expected.
Complete the following steps:
1. Check the event log for the ID of the drive with the high usage rate.
2. Run the lsdrive command and note the date in the Predicted Failure Date field.
3. If the predicted failure date is approaching, consider replacing the drive.
4. Mark the event as fixed.

Node IP is missing
Explanation: At least two IP addresses are required for each node.
Use the satask chnodeip command to add the required IP addresses.

The cluster was unable to send an email.
Explanation: The cluster has attempted to send an email in response to an event, but there was no acknowledgement that it was successfully received by the SMTP mail server. It might have failed because the cluster was unable to connect to the configured SMTP server, the email might have been rejected by the server, or a timeout might have occurred. The SMTP server might not be running or might not be correctly configured, or the cluster might not be correctly configured. This error is not logged by the test email function because it responds immediately with a result code.
• Ensure that the SMTP email server is active.
• Ensure that the SMTP server TCP/IP address and port are correctly configured in the cluster email configuration.
• Send a test email and validate that the change has corrected the issue.
• Mark the error that you have just repaired as fixed.
• Go to MAP 5700: Repair verification.
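The first two SMTP checks above can also be run from an administrator workstation. The following is a minimal Python sketch, not part of the product: it assumes the workstation can reach the same SMTP server that the cluster is configured to use, and the function name, the default port of 25, and the `smtp_factory` parameter are illustrative choices. It opens a connection and sends an SMTP NOOP, treating a 250 reply as "server is active".

```python
import smtplib


def smtp_server_reachable(host, port=25, timeout=10.0, smtp_factory=smtplib.SMTP):
    """Return True if an SMTP server at host:port accepts a connection and a NOOP.

    smtp_factory is injectable (an assumption made for this sketch) so that the
    check can be exercised without a real mail server.
    """
    try:
        # Connecting fails with OSError if the address or port is wrong,
        # the server is down, or a firewall blocks the connection.
        with smtp_factory(host, port, timeout=timeout) as connection:
            code, _message = connection.noop()
            return code == 250
    except (OSError, smtplib.SMTPException):
        return False
```

A False result only narrows the problem to "server down, wrong address or port, or network block"; it does not replace the test email function, which exercises the cluster's own email configuration.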

• None

2601 Error detected while sending an email.
Explanation: An error has occurred while the cluster was attempting to send an email in response to an event. The cluster is unable to determine if the email has been sent and will attempt to resend it. The problem might be with the SMTP server or with the cluster email configuration. The problem might also be caused by a failover of the configuration node. This error is not logged by the test email function because it responds immediately with a result code.
• If there are higher-priority unfixed errors in the log, fix those errors first.
• Ensure that the SMTP email server is active.
• Ensure that the SMTP server TCP/IP address and port are correctly configured in the cluster email configuration.
• Send a test email and validate that the change has corrected the issue.
• Mark the error that you have just repaired as fixed.
• Go to MAP 5700: Repair verification.
• None

2700 Unable to access NTP network time server
Explanation: Cluster time cannot be synchronized with the NTP network time server that is configured.
There are three main causes to examine:
• The cluster NTP network time server configuration is incorrect. Ensure that the configured IP address matches that of the NTP network time server.
• The NTP network time server is not operational. Check the status of the NTP network time server.
• The TCP/IP network is not configured correctly. Check the configuration of the routers, gateways and firewalls. Ensure that the cluster can access the NTP network time server and that the NTP protocol is permitted.
The error will automatically fix when the cluster is able to synchronize its time with the NTP network time server.
• None

2702 Check configuration settings of the NTP server on the CMM
Explanation: The node is configured to automatically set the time using an NTP server within the CMM. It is not possible to connect to the NTP server during authentication. The NTP server configuration cannot be changed within S-ITE. Within the CMM, there are changeable NTP settings.
However, these settings configure how the CMM gets the time and date; the internal CMM NTP server that is used by the S-ITE cannot be changed or configured. This event is only raised when an attempt is made to use the server, which happens once every half hour.
Note: The NTP configuration settings are re-read from the CMM before each connection.
The reason for a connection error can be due to the following:
• all suitable Ethernet ports are offline
• the CMM hardware is not operational
• the CMM is active but the CMM NTP server is offline.
The reason for an authentication issue can be due to the following:
• the authentication values provided were invalid
• the NTP server rejected the authentication key provided to the node by the CMM.
If the NTP port is an unsupported value, a port error can display. Currently, only port 123 is supported.
Only the current configuration node attempts to resync with the server.
1. Make sure that the CMM is operational by logging in and confirming its time.
2. Check that the IP address in the event log can be pinged from the node.
3. If there is an error, try rebooting the CMM.

Internal uninterruptible power supply software error detected.
Explanation: Some of the tests that are performed during node startup did not complete because some of the data reported by the uninterruptible power supply during node startup is inconsistent because of a software error in the uninterruptible power supply. The node has determined that the uninterruptible power supply is functioning sufficiently for the node to continue operations. The operation of the cluster is not affected by this error. This error is usually resolved by power cycling the uninterruptible power supply.
1. Power cycle the uninterruptible power supply at a convenient time. The one or two nodes attached to

the uninterruptible power supply should be powered off before powering off the uninterruptible power supply. Once the nodes have powered off, wait 5 minutes for the uninterruptible power supply to go into standby mode (flashing green AC LED). If this does not happen automatically then check the cabling to confirm that all nodes powered by this uninterruptible power supply have been powered off. Remove the power input cable from the uninterruptible power supply and wait at least 2 minutes for the uninterruptible power supply to clear its internal state. Reconnect the uninterruptible power supply power input cable. Press the uninterruptible power supply ON button. Power on the nodes connected to this uninterruptible power supply.
2. If the error is reported again after the nodes are restarted, replace the 2145 UPS electronics assembly.
• 2145 UPS electronics assembly (5%)
Other:
• Transient 2145 UPS error (95%)

3024 Technician port connection invalid
Explanation: The code has detected more than one MAC address through the connection, or the DHCP has given out more than one address. The code thus believes there is a switch attached.
1. Remove the cable from the technician port.
2. (Optional) Disable additional network adapters on the laptop to which it is to be connected.
3. Ensure DHCP is enabled on the network adapter.
4. If this was not possible, set the IP address manually.
5. Connect a standard Ethernet cable between the network adapter and the technician port.
6. If this still does not work, reboot the node and repeat the above steps.
7. This event will auto-fix once either no connection or a valid connection has been detected.

A virtualization feature license is required.
Explanation: The cluster has no virtualization feature license registered. You should have either an Entry Edition Physical Disk virtualization feature license or a Capacity virtualization feature license that covers the cluster. The cluster will continue to operate, but it might be violating the license conditions.
• If you do not have a virtualization feature license that is valid and sufficient for this cluster, contact your IBM sales representative, arrange a license and change the license settings for the cluster to register the license.
• The error will automatically fix when the situation is resolved.
• None

3029 Virtualization feature capacity is not valid.
Explanation: The setting for the amount of space that can be virtualized is not valid. The value must be an integer number of terabytes. This error event is created when a cluster is upgraded from a version in which the virtualization feature capacity value was in gigabytes, and therefore could be set to a fraction of a terabyte, to a version in which the licensed capacity for the virtualization feature must be an integer number of terabytes.
• Review the license conditions for the virtualization feature. If you have one cluster, change the license settings for the cluster to match the capacity that is licensed. If your license covers more than one cluster, apportion an integer number of terabytes to each cluster. You might have to change the virtualization capacity that is set on the other clusters to ensure that the sum of the capacities for all of the clusters does not exceed the licensed capacity.
• You can view the event data or the feature log to ensure that the licensed capacity is sufficient for the space that is actually being used. Contact your IBM sales representative if you want to change the capacity of the license.
• This error will automatically be fixed when a valid configuration is entered.
• None

3030 Global and Metro Mirror feature capacity not set.
Explanation: The Global and Metro Mirror feature is set to On for the system, but the capacity has not been set.
Perform one of the following actions:
• Change the Global and Metro Mirror license settings for the system either to the licensed Global and Metro Mirror capacity, or if the license applies to more than one system, to the portion of the license

allocated to this system. Set the licensed Global and Metro Mirror capacity to zero if it is no longer being used.
• View the event data or the feature log to ensure that the licensed Global and Metro Mirror capacity is sufficient for the space actually being used. Contact your IBM sales representative if you want to change the licensed Global and Metro Mirror capacity.
• The error will automatically be fixed when a valid configuration is entered.
• None

3031 FlashCopy feature capacity not set.
Explanation: The FlashCopy feature is set to On for the system, but the capacity has not been set.
Perform one of the following actions:
• Change the FlashCopy license settings for the system either to the licensed FlashCopy capacity, or if the license applies to more than one system, to the portion of the license allocated to this system. Set the licensed FlashCopy capacity to zero if it is no longer being used.
• View the event data or the feature log to ensure that the licensed FlashCopy capacity is sufficient for the space actually being used. Contact your IBM sales representative if you want to change the licensed FlashCopy capacity.
• The error will automatically be fixed when a valid configuration is entered.
• None

3032 Feature license limit exceeded.
Explanation: The amount of space that is licensed for a cluster feature is being exceeded. The feature that is being exceeded might be:
• Virtualization (event identifier )
• FlashCopy (event identifier )
• Global and Metro Mirror (event identifier )
• Transparent cloud tiering (event identifier )
The cluster will continue to operate, but it might be violating the license conditions.
• Determine which feature license limit has been exceeded. This might be:
  Virtualization (event identifier )
  FlashCopy (event identifier )
  Global and Metro Mirror (event identifier )
  Transparent cloud tiering (event identifier )
• Use the lslicense command to view the current license settings.
• Ensure that the feature capacity that is reported by the cluster has been set to match either the licensed size, or if the license applies to more than one cluster, to the portion of the license that is allocated to this cluster.
• Decide whether to increase the feature capacity or to reduce the space that is being used by this feature.
• To increase the feature capacity, contact your IBM sales representative and arrange an increased license capacity. Change the license settings for the cluster to set the new licensed capacity. Alternatively, if the license applies to more than one cluster, modify how the licensed capacity is apportioned between the clusters. Update every cluster so that the sum of the license capacity for all of the clusters does not exceed the licensed capacity for the location.
• To reduce the amount of disk space that is virtualized, delete some of the managed disks or image mode volumes. The used virtualization size is the sum of the capacities of all of the managed disks and image mode disks.
• To reduce the FlashCopy capacity, delete some FlashCopy mappings. The used FlashCopy size is the sum of all of the volumes that are the source volume of a FlashCopy mapping.
• To reduce Global and Metro Mirror capacity, delete some Global Mirror or Metro Mirror relationships. The used Global and Metro Mirror size is the sum of the capacities of all of the volumes that are in a Metro Mirror or Global Mirror relationship; both master and auxiliary volumes are counted.
• To reduce the number of I/O groups that use Transparent cloud tiering, disable cloud snapshots for all cloud snapshot-enabled volumes from individual I/O groups until the total number of I/O groups using transparent cloud tiering is below the license limit.
• The error will automatically be fixed when the licensed capacity is greater than the capacity that is being used.
• None

3035 Physical Disk FlashCopy feature license required
Explanation: The Entry Edition cluster has some FlashCopy mappings defined. There is, however, no Physical Disk FlashCopy license registered on the cluster.
The cluster will continue to operate, but it might be violating the license conditions.
• Check whether you have an Entry Edition Physical Disk FlashCopy license for this cluster that you have

not registered on the cluster. Update the cluster license configuration if you have a license.
• Decide whether you want to continue to use the FlashCopy feature or not.
• If you want to use the FlashCopy feature, contact your IBM sales representative, arrange a license and change the license settings for the cluster to register the license.
• If you do not want to use the FlashCopy feature, you must delete all of the FlashCopy mappings.
• The error will automatically fix when the situation is resolved.
• None

3036 Physical Disk Global and Metro Mirror feature license required
Explanation: The Entry Edition cluster has some Global Mirror or Metro Mirror relationships defined. There is, however, no Physical Disk Global and Metro Mirror license registered on the cluster. The cluster will continue to operate, but it might be violating the license conditions.
• Check if you have an Entry Edition Physical Disk Global and Metro Mirror license for this cluster that you have not registered on the cluster. Update the cluster license configuration if you have a license.
• Decide whether you want to continue to use the Global Mirror or Metro Mirror features or not.
• If you want to use either the Global Mirror or Metro Mirror feature, contact your IBM sales representative, arrange a license and change the license settings for the cluster to register the license.
• If you do not want to use either the Global Mirror or the Metro Mirror feature, you must delete all of the Global Mirror and Metro Mirror relationships.
• The error will automatically fix when the situation is resolved.
• None

3060 Array write endurance limited
Explanation: A RAID MDisk is affected by member flash drives that have limited remaining write endurance.
Complete the following steps:
1. Check the event log for the ID of the MDisk with limited remaining write endurance.
2. Run the lsmdisk and lsdrive commands to display information about the array and the individual drives. Note the date in the Replacement Date field for each drive in the lsdrive results.
3.
If the replacement date or dates are approaching, consider replacing individual drives, or replacing the entire array.
4. Mark the event as fixed.

Global or Metro Mirror relationship or consistency group with deleted partnership
Explanation: A Global Mirror or Metro Mirror relationship or consistency group exists with a cluster whose partnership is deleted. This configuration is not supported and the problem should be resolved.
The issue can be resolved either by deleting all of the Global Mirror or Metro Mirror relationships or consistency groups that exist with a cluster whose partnership is deleted, or by recreating all of the partnerships that they were using. The error will automatically fix when the situation is resolved.
1. List all of the Global Mirror and Metro Mirror relationships and note those where the master cluster name or the auxiliary cluster name is blank. For each of these relationships, also note the cluster ID of the remote cluster.
2. List all of the Global Mirror and Metro Mirror consistency groups and note those where the master cluster name or the auxiliary cluster name is blank. For each of these consistency groups, also note the cluster ID of the remote cluster.
3. Determine how many unique remote cluster IDs there are among all of the Global Mirror and Metro Mirror relationships and consistency groups that you have identified in the first two steps. For each of these remote clusters, decide if you want to re-establish the partnership with that cluster. Ensure that the total number of partnerships that you want to have with remote clusters does not exceed the cluster limit. If you re-establish a partnership, you will not have to delete the Global Mirror and Metro Mirror relationships and consistency groups that use the partnership.
4. Re-establish any selected partnerships.
5. Delete all of the Global Mirror and Metro Mirror relationships and consistency groups that you listed in either of the first two steps whose remote cluster partnership has not been re-established.
6. Check that the error has been marked as fixed by the system.
If it has not, return to the first step and determine which Global Mirror or Metro Mirror relationships or consistency groups are still causing the issue.
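Steps 1 to 3 of the deleted-partnership procedure amount to a small set computation. The following Python sketch illustrates it under stated assumptions: each relationship or consistency group has already been read into a dictionary, and the key names (`master_cluster_id`, `master_cluster_name`, `aux_cluster_id`, `aux_cluster_name`) are assumed field names chosen for this example. The partnership limit is passed in as a parameter because the guide does not state its value here.

```python
from typing import Dict, Iterable, Set


def remote_ids_without_partnership(records: Iterable[Dict[str, str]]) -> Set[str]:
    """Collect the cluster IDs of records whose master or auxiliary cluster name is blank.

    A blank name is the marker, per steps 1 and 2, that the partnership
    with that cluster has been deleted.
    """
    remote_ids: Set[str] = set()
    for record in records:
        for role in ("master", "aux"):
            if not record.get(f"{role}_cluster_name", "").strip():
                remote_ids.add(record[f"{role}_cluster_id"])
    return remote_ids


def partnerships_within_limit(existing: int, to_reestablish: Set[str], limit: int) -> bool:
    """Step 3 check: existing partnerships plus re-established ones must not exceed the limit."""
    return existing + len(to_reestablish) <= limit
```

Passing the relationships and the consistency groups together yields the unique remote cluster IDs that step 3 asks for; any ID that you decide not to re-establish marks the records to delete in step 5.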

• None

3081 Unable to send email to any of the configured email servers.
Explanation: Either the system was not able to connect to any of the SMTP email servers, or the email transmission has failed. A maximum of six email servers can be configured. Error event 2600 or 2601 is raised when an individual email server is found to be not working. This error indicates that all of the email servers were found to be not working.
• Check the event log for all unresolved 2600 and 2601 errors and fix those problems.
• If this error has not already been automatically marked fixed, mark this error as fixed.
• Perform the check email function to test that an email server is operating properly.
• None

3090 Drive firmware download is cancelled by user or system, problem diagnosis required.
Explanation: The drive firmware download has been cancelled by the user or the system and problem diagnosis is required.
If you cancelled the download using applydrivesoftware -cancel then this error is to be expected. If you changed the state of any drive while the download was ongoing, this error is to be expected; however, you will have to rerun the applydrivesoftware command to ensure all your drive firmware has been updated. Otherwise:
1. Check the drive states using lsdrive, in particular look at drives which are status=degraded, offline or use=failed.
2. Check node states using lsnode or lsnodecanister, and confirm all nodes are online.
3. Use lsdependentvdisks -drive <drive_id> to check for vdisks that are dependent on specific drives.
4. If the drive is a member of a RAID0 array, consider whether to introduce additional redundancy to protect the data on that drive.
5. If the drive is not a member of a RAID0 array, fix any errors in the event log that relate to the array.
6. Consider using the -force option. With any drive software upgrade there is a risk that the drive might become unusable. Only use the -force option if you accept this risk.
7. Reissue the applydrivesoftware command.
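The filter in step 1 of the 3090 procedure can be expressed as a short Python sketch. It assumes that the lsdrive results have already been read into dictionaries with `id`, `status`, and `use` keys; those key names and the function name are assumptions made for this illustration, and the sketch does not run any CLI command itself.

```python
from typing import Dict, Iterable, List

# Drive states that step 1 asks you to look at before reissuing the download.
ATTENTION_STATUSES = {"degraded", "offline"}


def drives_needing_attention(drives: Iterable[Dict[str, str]]) -> List[str]:
    """Return the IDs of drives with status degraded or offline, or with use=failed."""
    flagged = []
    for drive in drives:
        if drive.get("status") in ATTENTION_STATUSES or drive.get("use") == "failed":
            flagged.append(drive["id"])
    return flagged
```

Each returned ID is a candidate for the dependency check in step 3 before the firmware download is reissued.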
Note: The lsdriveupgradeprogress command can be used to check the progress of the applydrivesoftware command as it updates each drive.

Cloud account not available, unexpected error
Explanation: The meaning of the error code depends on the associated event code.

Cloud account not available, cannot establish secure connection with cloud provider
The network connection between the system and the cloud service provider is configured to use SSL. The SSL connection cannot be established. Cloud backup services remain paused until the alert is fixed. The issue is not that the system cannot locate the CA certificate for the cloud service provider, or that the CA certificate is expired.

Cloud account not available, cannot complete cloud storage operation
An unexpected error occurred when the system attempted to complete a cloud storage operation.

Try the following actions for either event code:
1. Mark the error as fixed so that the system retries the operation.
2. If the errors repeat, check the cloud provider console or contact the cloud service provider. Look for errors and for changes since the last successful connection. The SSL connection worked at the time that the cloud account object was created.
3. Contact your service support representative. If possible, provide your representative with debug data from livedump and snap.

Unexpected error occurred while doing cloud operation
Explanation: The associated event codes provide more information about a specific error:

A cloud object could not be found during cloud snapshot operation.
The system encountered a problem when it tried to read a particular object from cloud storage. The object is missing in the cloud.

A cloud object was found to be corrupt during cloud snapshot operation.
The system encountered a problem when it tried to read a particular object from cloud storage. The object format is wrong or the object longitudinal redundancy check (LRC) failed.

A cloud object was found to be corrupt during cloud snapshot decompression operation.
The system encountered a checksum failure while it was decompressing a particular object from cloud storage.

Etag integrity error during cloud snapshot operation
While the system was creating a snapshot in cloud storage, it encountered an HTTP entity tag (ETag) integrity error.

Unexpected error occurred, cannot complete cloud snapshot operation
An unanticipated error occurred during a snapshot operation.

A cloud object could not be found during cloud snapshot restore operation
The system encountered a problem when it tried to read a particular object from cloud storage during a restore operation. The object is missing in the cloud.

A cloud object was found to be corrupt during cloud snapshot restore operation
The system encountered a problem when it tried to read a particular object from cloud storage during a restore operation. The object format is wrong or the object longitudinal redundancy check (LRC) failed.

A cloud object was found to be corrupt during cloud snapshot restore decompression operation
The system encountered a checksum failure while it was decompressing a particular object from cloud storage during a restore operation.

Etag integrity error during cloud snapshot restore operation
During a restore operation, the system encountered an HTTP entity tag (ETag) integrity error.

Cannot create bad blocks on managed disk during cloud snapshot restore operation.
The system cannot work around medium errors on the cloud volume during a restore operation.

Unexpected error occurred, cannot complete cloud snapshot restore operation
An unanticipated error occurred during a restore operation.

A cloud object could not be found during cloud snapshot delete operation
The system encountered a problem when it tried to read a particular object from cloud storage during a delete operation. The object is missing in the cloud.

A cloud object was found to be corrupt during cloud snapshot delete operation
The system encountered a problem when it tried to read a particular object from cloud storage during a delete operation. The object format is wrong or the object longitudinal redundancy check (LRC) failed.

A cloud object was found to be corrupt during cloud snapshot delete decompression operation
The system encountered a checksum failure while it was decompressing a particular object from cloud storage during a delete operation.

Unexpected error occurred, cannot complete cloud snapshot delete operation
An unanticipated error occurred during a delete operation.

In all cases, the job remains paused until the alert is fixed. Contact your support service representative.

3123 The quorum application needs to be redeployed.

Explanation: A setting specific to the quorum application changed, which means that the quorum application might not be able to function as the active quorum device. Any of the following problems might be involved:
- A service IP was changed.
- A change in the IP network prevented the quorum application from reaching all the nodes.
- One or more nodes were permanently added to or removed from the cluster.
- The certificate was changed.

Complete the following steps:
1. Make sure that all Ethernet cables are connected correctly.
2. Make sure that the service IP addresses are set for all nodes.
3. Make sure that you can ping all nodes from the quorum application host.
4. Regenerate the JAR file that contains the new configuration by using the management GUI or the command line.
5. Transfer the new application to the deployment locations or the host or hosts.
6. Stop the old application.
7. Start the new application.
8. Verify that the cluster is using the quorum application as the active quorum device by using the lsquorum command.

No active quorum device found.

Explanation: A quorum device must be active to avoid an I/O outage if the node fails.

Use the lsquorum command to verify that a quorum device is active. The active field should have a value of yes. If no quorum devices are active, complete one of the following actions:
- On HyperSwap or stretched systems, deploy a new IP quorum application or create a third Fibre Channel quorum site.
- On regular systems, create some managed storage or deploy a new IP quorum application.

System SSL certificate expires within the next 30 days.

Explanation: The system SSL certificate that is used to authenticate connections to the GUI, service assistant, and the CIMOM is about to expire.

Complete the following steps to resolve this problem.
1. If you are using a self-signed certificate, then generate a new self-signed certificate.
2. If you are using a certificate that is signed by a certificate authority, generate a new certificate request and get this certificate signed by your certificate authority. The existing certificate can continue to be used until the expiry date to provide time to get the new certificate request signed and installed.

- N/A

3135 Cloud account not available, incompatible object data format

Explanation: The cloud account is in import mode, accessing data from another system. The code on that system was updated to a level higher than the level on your current system. The other system made updates to the cloud storage that your current system cannot interpret.

Try the following actions:
1. Contact the administrator of the other system to determine its code level and the changes that are planned. Use lscloudaccount to get the ID and name for the other system.
2. Update your current system to a compatible level of code.
3. Alternatively, change the cloud account back to normal mode.

Cloud account SSL certificate will expire within the next 30 days

Explanation: A cloud account SSL certificate was presented that is due to expire.

Try the following actions:
1. Verify the certificate validity start and end times from the alert event sense data.
2. Verify that the system time is correct.
3. Contact the cloud service provider for a new certificate.

Note: The alert does not auto-fix until the certificate becomes valid or the account is switched out of SSL mode.

Equivalent ports may be on different fabrics

Explanation: Mismatched fabric World Wide Names (WWNs) were detected.

Complete the following steps:
1. Run the lsportfc command to get the fabric World Wide Name (WWN) of each port.
2. List all partnered ports (that is, all ports for which the platform port ID is the same, and the node is in the same I/O group) that have mismatched fabric WWNs.
3. Verify that the listed ports are on the same fabric.
4. Rewire if needed. For information about wiring requirements, see "Zoning considerations for N_Port ID Virtualization" in your product documentation. After all ports are on the same fabric, the event corrects itself.
5. This error might be displayed by mistake. If you confirm that all remaining ports are on the same fabric, despite apparent mismatches that remain, mark the event as fixed.

Performance not optimised for configuration.

Explanation: A V9000 cluster can operate with the fibre queue switch set to ON or OFF. The optimum setting is determined automatically by the system based on whether you are managing any AE2 enclosures. If so, the switch must be ON for optimal performance. If the cluster detects that it is not in the correct performance mode, the 3300 error is displayed. This situation typically occurs when the fibre queue switch is manually changed by using the management GUI or the chenclosure command.

Enter the following command for each node in the system, in turn, to restart the I/O process:

   satask stopnode -warmstart

This command clears the error.
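
For the "Equivalent ports may be on different fabrics" event, step 2 (listing partnered ports with mismatched fabric WWNs) can be sketched in Python. This is a minimal sketch rather than an IBM-supplied tool. It assumes that the lsportfc output has already been parsed into dictionaries. The key names io_group, platform_port_id, node, and fabric_wwn are hypothetical labels chosen for illustration and are not the actual lsportfc column names, and the WWN values are made up.

```python
from collections import defaultdict


def find_mismatched_partner_ports(ports):
    """Return partnered-port groups whose members report different fabric WWNs.

    Ports are treated as partnered when they share the same platform port ID
    and their nodes are in the same I/O group. All dictionary keys used here
    are hypothetical, not real lsportfc field names.
    """
    groups = defaultdict(list)
    for port in ports:
        groups[(port["io_group"], port["platform_port_id"])].append(port)

    mismatched = {}
    for key, members in groups.items():
        fabrics = {member["fabric_wwn"] for member in members}
        if len(fabrics) > 1:
            # Record which nodes own the ports in the mismatched group.
            mismatched[key] = sorted(member["node"] for member in members)
    return mismatched


# Illustrative data only; these WWNs are invented for the example.
sample_ports = [
    {"node": "node1", "io_group": "io_grp0", "platform_port_id": 1,
     "fabric_wwn": "100000051E0A0001"},
    {"node": "node2", "io_group": "io_grp0", "platform_port_id": 1,
     "fabric_wwn": "100000051E0B0002"},
    {"node": "node1", "io_group": "io_grp0", "platform_port_id": 2,
     "fabric_wwn": "100000051E0B0002"},
    {"node": "node2", "io_group": "io_grp0", "platform_port_id": 2,
     "fabric_wwn": "100000051E0B0002"},
]

print(find_mismatched_partner_ports(sample_ports))
```

Each group that the function returns is a candidate for the physical check in step 3. An empty result suggests only that the parsed data shows no mismatch.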

Procedure: SAN problem determination

You can solve problems on the system and its connection to the storage area network (SAN).

About this task

SAN failures might cause system volumes to be inaccessible to host systems. Failures can be caused by SAN configuration changes or by hardware failures in SAN components.

The following list identifies some of the hardware that might cause failures:
- Power, fan, or cooling
- Application-specific integrated circuits
- Installed small form-factor pluggable (SFP) transceiver
- Fiber-optic cables

If either the maintenance analysis procedures or the error codes sent you here, complete the following steps:

Procedure
1. If the SAN configuration was changed by changing the Fibre Channel cable connections or switch zoning, verify that the changes are correct and, if necessary, reverse those changes.
2. Verify that the power is turned on to all switches and storage controllers that the system uses, and that they are not reporting any hardware failures. If problems are found, resolve those problems before you proceed further.
3. Verify that the Fibre Channel cables that connect the systems to the switches are securely connected.
4. If you have a SAN management tool, use that tool to view the SAN topology and isolate the failing component.

Resolving a problem with SSL/TLS clients

Changing the security level of the system might cause the web interface, CIM clients, and other SSL/TLS clients to stop working. If any clients stop working, complete the following procedure.

Procedure
1. Wait 5 minutes and try again. The clients might still need to wait for the services to restart.
2. Confirm that the SSL/TLS implementation of the client (for example, the web browser or CIM management tool) is up to date and supports the level of security that is being enforced. If necessary, revert to a weaker SSL/TLS security level in the system and see whether this action resolves the issue.
3. If the problem is a browser problem, check the exact error message that is reported by the browser.
   If the error message is a cipher error, SSL error, TLS error, or handshake error, then the error implies that there is a problem with the secure connection. In this case, confirm that the browser is up to date. All of the supported browsers (Internet Explorer, Firefox, Firefox ESR, and Chrome) support TLS 1.2 at the latest version.

   If there is only a blank screen, it is likely that either the web service needs to restart, or there is a problem unrelated to the security level.

Procedure: Making drives support protection information

You can use this procedure to migrate drives and arrays to pick up support for protection information.

About this task

Drives cannot start by using protection information for I/O requests on demand. They must be validated as having the correct format and general support for the function within the code. The system can validate the format and general support when the drive object is first discovered by the system. The requirement for system validation means that no drive that exists can use protection information on an update from version 730 regardless of use in the configuration. The system can reject a request to make a drive a candidate if the media is not formatted correctly for use with protection information.

The process to use protection information on an existing drive is to use the system interface (GUI/CLI) and involves unmanaging and rediscovering the drive to allow the software to reacquire the drive characteristics. The lsdrive view contains the protection_enabled field that shows whether a drive is using protection information.

Drives and arrays that exist on an update to version 740 do not automatically pick up support for protection information. All newly discovered drives at this code level support protection information. If the system has spare capacity, then migration can proceed an MDisk at a time. Otherwise, the migration to using protection information on drives proceeds drive by drive.

To migrate an MDisk that is using spare storage capacity, complete the following procedure.

Procedure
1. Migrate data off the MDisk. The data migration can be accomplished by MDisk migration as part of MDisk delete (rmmdisk, lsmigrate) within a storage pool. You can also use volume mirroring to create an in-sync mirrored copy of each volume in another pool (addvdiskcopy). When it is copied (lsvdisksyncprogress), delete the original volume copies (rmvdiskcopy), and then delete the MDisk (rmmdisk) that has no data.
2. Follow the instructions in step 5 for all the drives that are now candidates when the MDisk is deleted (see lsmigrate).
3. Re-create the array by using the system interface when all old members adopt protection information.
4. If the drive is a member, complete the following steps to adopt protection information on an individual drive.
   a. Run the charraymember command to eject the drive from the array (either immediately with redundancy loss or after an exchange).
   b. When the drive is no longer a member, follow the instructions in step 5 for candidates or spares.
   c. Repeat for the next member.
5. If the drive is a spare or candidate, complete the following steps:
   a. Use the management GUI to take the drive offline.

   b. When the drive is offline, use the system interfaces to change the drive's use to unused.
   c. The system reacquires the drive and brings it back online, possibly changing the drive ID.
   d. Attempt to make the drive a candidate. Depending on the drive, this step might generate error CMMVC6624E, The command cannot be initiated because the drive is not in an appropriate state to perform the task. This step is necessary to run the format command in the next step.
   e. Run the following format command.
      svctask chdrive -task format drive_id
   f. Wait approximately 3 minutes until the drive is online again. Use lsdrive drive_id to see the drive's online/offline status.
   g. Use the system interface to change the drive's use to candidate. If required, use the system interface to change the drive's use to spare.
   h. Enter lsdrive drive_id and check that the protection_enabled field is yes. This drive can now be used in an array.

Resolving a problem with new expansion enclosures

Determine why a newly installed expansion enclosure was not detected by the system.

When you install a new expansion enclosure, follow the management GUI Add Enclosure wizard. Select Monitoring > System. From the Actions menu, select Add Enclosures.

If the expansion enclosure is not detected, complete the following verifications:
- Verify the status of the LEDs at the back of the expansion enclosure. At least one power supply unit must be on with no faults shown. At least one canister must be active, with no fault LED on. SAN Volume Controller F and F enclosures have two LEDs per Serial-attached SCSI (SAS) port: one green link-status LED and one amber fault LED. The link-status LED of the ports that are in use is on while the fault LED is off. For details about LED status, see SAN Volume Controller F expansion canister SAS ports and indicators and SAN Volume Controller F expansion enclosure LEDs.
- Verify that the SAS cabling to the expansion enclosure is correctly installed. To review the requirements, see Connecting the optional 2U SAS expansion enclosures to the 2145-DH8, Connecting the optional 2U SAS expansion enclosures to the 2145-SV1, and Connecting the optional F SAS expansion enclosures.

Fibre Channel and 10G Ethernet link failures

You might need to replace the small form-factor pluggable (SFP) transceiver when a failure occurs on a single Fibre Channel or 10G Ethernet link (applicable to a Fibre Channel over Ethernet personality enabled 10G Ethernet link).

Before you begin

The following items can indicate that a single Fibre Channel or 10G Ethernet link failed:
- The Fibre Channel port status on the front panel of the node

- The Fibre Channel status light-emitting diodes (LEDs) at the rear of the node
- An error that indicates that a single port failed (703, 723).

About this task

Use only IBM supported 10 Gb SFP transceivers with the SAN Volume Controller 2145-DH8. Using any other SFP transceivers can lead to unexpected system behavior. Copper DAC is not supported by these 10 Gb ports.

The SFP transceiver replacement in a 10 Gbps Ethernet adapter port is governed by the following rules:
- An existing 10 Gb SFP transceiver is replaced with a new 10 Gb SFP transceiver: The 10 Gbps Ethernet adapter port detects the new SFP transceiver and becomes operational immediately.
- The port has had an incorrect SFP transceiver since the last reboot, and the SFP transceiver is then replaced with the correct 10 Gb SFP transceiver: The 10 Gbps Ethernet adapter port detects the new SFP transceiver and becomes operational immediately. This situation can occur with an incompatible SFP transceiver (8 Gb SFP or 4 Gb SFP) that is inserted in the 10 Gbps Ethernet adapter port.
- The 10 Gbps Ethernet adapter port has contained no SFP transceiver since the last reboot, and the correct 10 Gb SFP transceiver is installed: The node requires a reboot to detect the new SFP transceiver. The new SFP transceiver is operational only after the reboot (no DMP is produced).

Procedure

Attempt each of these actions, in the following order, until the failure is fixed.
1. Ensure that the Fibre Channel or 10G Ethernet cable is securely connected at each end.
2. Replace the Fibre Channel or 10G Ethernet cable.
3. Replace the SFP transceiver for the failing port on the node.
   Note: The system is supported by both longwave SFP transceivers and shortwave SFP transceivers. You must replace an SFP transceiver with the same type of SFP transceiver. If the SFP transceiver to replace is a longwave SFP transceiver, for example, you must provide a suitable replacement. Removing the wrong SFP transceiver might result in loss of data access.
4. Replace the Fibre Channel adapter or Fibre Channel over Ethernet adapter on the node.

Ethernet iSCSI host-link problems

If you are having problems attaching to the Ethernet hosts, your problem might be related to the network, the system, or the host.

Note: The system and host IP should be on the same VLAN. Host and system nodes should not have the same subnet on different VLANs.

For network problems, you can attempt any of the following actions:
- Test your connectivity between the host and system ports.
- Try to ping the system from the host.
- Ask the Ethernet network administrator to check the firewall and router settings.

- Check that the subnet mask and gateway are correct for the system host configuration.

Using the management GUI for system problems, you can attempt any of the following actions:
- View the configured node port IP addresses.
- View the list of volumes that are mapped to a host to ensure that the volume host mappings are correct.
- Verify that the volume is online.

For host problems, you can attempt any of the following actions:
- Verify that the host iSCSI qualified name (IQN) is correctly configured.
- Use operating system utilities (such as Windows device manager) to verify that the device driver is installed, loaded, and operating correctly.
- If you configured the VLAN, check that its settings are correct. Ensure that the host Ethernet port, the system Ethernet port IP addresses, and the switch port are on the same VLAN ID. Ensure that a different subnet is used on each VLAN. Configuring the same subnet on different VLAN IDs can cause network connectivity problems.

Fibre Channel over Ethernet host-link problems

Problems attaching to the Fibre Channel over Ethernet (FCoE) hosts might be related to the network, the system, or the host.

Before you begin

If error code 705 on a node is displayed, this code means the Fibre Channel (FC) I/O port is inactive. Fibre Channel over Ethernet (FCoE) uses Fibre Channel (FC) as a protocol and Ethernet as an interconnect.

Note: Concerning a Fibre Channel over Ethernet (FCoE) enabled port: either the Fibre Channel forwarder (FCF) is not seen, or the Fibre Channel over Ethernet (FCoE) feature is not configured on the switch.
- Verify that the Fibre Channel over Ethernet (FCoE) feature is enabled on the Fibre Channel forwarder (FCF).
- Verify the remote port (switch port) properties on the Fibre Channel forwarder (FCF).

If you connect the host through a Converged Enhanced Ethernet (CEE) switch:
- Test your connectivity between the host and the Converged Enhanced Ethernet (CEE) switch.
- Ask the Ethernet network administrator to check the firewall and router settings.

Run lsfabric, and verify that the host is seen as a remote port in the output. If the host is not seen, complete the following checks in order:
- Verify that the system and host get a Fibre Channel ID (FCID) on the Fibre Channel forwarder (FCF). If unable to verify, check the VLAN configuration.
- Verify that the system and host port are part of a zone and that the zone is in force.
- Verify that the volumes are mapped to the host and are online. For more information, see the descriptions of lshostvdiskmap and lsvdisk in the IBM Knowledge Center.
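
The VLAN guidance for Ethernet iSCSI host links (use a different subnet on each VLAN, because the same subnet on different VLAN IDs can cause connectivity problems) can be checked offline with a short script. The following is a minimal sketch under stated assumptions: the port list is typed in by hand or exported by some other means, and the keys name, ip, prefix, and vlan are hypothetical, not the output format of any system command. The addresses are examples only.

```python
import ipaddress
from collections import defaultdict


def subnets_on_multiple_vlans(ports):
    """Return subnets that appear on more than one VLAN ID.

    Each port is a dict with hypothetical keys: 'ip', 'prefix', and 'vlan'.
    """
    vlans_by_subnet = defaultdict(set)
    for port in ports:
        # Derive the network (for example 192.168.1.0/24) from address and prefix.
        interface = ipaddress.ip_interface(f"{port['ip']}/{port['prefix']}")
        vlans_by_subnet[str(interface.network)].add(port["vlan"])
    return {
        subnet: sorted(vlans)
        for subnet, vlans in vlans_by_subnet.items()
        if len(vlans) > 1
    }


# Example addresses only.
sample_ports = [
    {"name": "host-eth0", "ip": "192.168.1.10", "prefix": 24, "vlan": 100},
    {"name": "node1-port1", "ip": "192.168.1.20", "prefix": 24, "vlan": 200},
    {"name": "node2-port1", "ip": "10.0.0.5", "prefix": 24, "vlan": 300},
]

print(subnets_on_multiple_vlans(sample_ports))
```

Any subnet that the function reports is configured on more than one VLAN ID and is worth reviewing with the network administrator.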

What to do next

If the problem is not resolved, verify the state of the host adapter.
- Unload and load the device driver.
- Use the operating system utilities (for example, Windows Device Manager) to verify that the device driver is installed, loaded, and operating correctly.

Servicing storage systems

Storage systems that are supported for attachment to the system are designed with redundant components and access paths to enable concurrent maintenance. Hosts have continuous access to their data during component failure and replacement.

The following categories represent the types of service actions for storage systems:
- Controller code update
- Field replaceable unit (FRU) replacement

Controller code update

Ensure that you are familiar with the following guidelines for updating controller code:
- Check to see if the system supports concurrent maintenance for your storage system.
- Allow the storage system to coordinate the entire update process.
- If it is not possible to allow the storage system to coordinate the entire update process, complete the following steps:
  1. Reduce the storage system workload by 50%.
  2. Use the configuration tools for the storage system to manually fail over all logical units (LUs) from the controller that you want to update.
  3. Update the controller code.
  4. Restart the controller.
  5. Manually fail back the LUs to their original controller.
  6. Repeat for all controllers.

FRU replacement

Ensure that you are familiar with the following guidelines for replacing FRUs:
- If the component that you want to replace is directly in the host-side data path (for example, cable, Fibre Channel port, or controller), disable the external data paths to prepare for the update. To disable external data paths, disconnect or disable the appropriate ports on the fabric switch. The system ERPs reroute access over the alternate path.
- If the component that you want to replace is in the internal data path (for example, cache, or drive) and did not completely fail, ensure that the data is backed up before you attempt to replace the component.
- If the component that you want to replace is not in the data path (for example, uninterruptible power supply units, fans, or batteries), the component is generally dual-redundant and can be replaced without more steps.


Chapter 7. Disaster recovery

Use these disaster recovery solutions for HyperSwap, Metro Mirror, Global Mirror, and Stretched System, where access to storage is still possible after the failure of a site.

HyperSwap

Active-active volume access is always provided while there is an up-to-date consistent copy. If there is an out-of-date consistent copy, there is no automatic failover to it, nor is read-only access given to it. Use the stoprcrelationship -access or stoprcconsistgrp -access command to make it accessible. The relationship is then in the Idling state.

After you enable access with the stoprcrelationship -access or stoprcconsistgrp -access command, use the startrcrelationship -primary <master/aux> or startrcconsistgrp -primary <master/aux> command to make the relationship leave the Idling state and resume HyperSwap replication. If you previously ran overridequorum, the startrcrelationship or startrcconsistgrp command fails.

When you resume HyperSwap replication, consider whether you want to continue by using the out-of-date consistent copy or revert to the up-to-date copy. To identify whether the master or auxiliary volume has access, look at the primary field that is shown by the lsrcrelationship or lsrcconsistgrp command. To continue to use the out-of-date copy, provide that value as the argument to the -primary parameter of the startrcrelationship or startrcconsistgrp command. To revert to the up-to-date copy, specify the opposite value as the argument to the -primary parameter. For example, if master is shown in the primary field of lsrcconsistgrp for an active-active consistency group in the Idling state, to revert to the up-to-date copy, use startrcconsistgrp -primary aux.

Metro Mirror and Global Mirror

Note: Inappropriate use of these procedures can allow host systems to make independent modifications to both the primary and secondary copies of data. You are responsible for ensuring that no host systems are continuing to use the primary copy of the data before you enable access to the secondary copy.

In a Metro Mirror or Global Mirror configuration, a system is configured at each site. Relationships are configured between the systems to mirror data from storage at the primary site to storage at the secondary site. If an outage occurs at the secondary site, the primary site continues operation without any intervention. If an outage occurs at the primary site, then it is necessary to enable access to storage at the secondary site. Use the stoprcrelationship -access or stoprcconsistgrp -access command to enable access to the storage at the secondary site.

Stretched System

In a stretched system (formerly split-site) configuration, a system is configured with half the nodes at each site and a quorum device at a third location. If an outage occurs at either site, then the nodes at the other site access the quorum device and continue operation without any intervention. If connectivity between the two sites is lost, then whichever nodes access the quorum device first continue operation. For disaster recovery purposes you might want to enable access to the storage at the site that lost the race to access the quorum device.

Use the satask overridequorum command to enable access to the storage at the secondary site. This feature is only available if the system was configured by assigning sites to nodes and storage controllers, and changing the system topology to stretched.

Important: If you run disaster recovery on one site and then power up the remaining, failed site (which contained the configuration node at the time of the disaster), the cluster asserts itself as designed. This procedure would start a second, identical cluster in parallel, which can cause data corruption. You must follow these steps:

Example
1. Remove the connectivity of the nodes from the site that is experiencing the outage.
2. Power up or recover those nodes.
3. Run the satask leavecluster -force or svctask rmnode command for all the nodes in the cluster.
4. Bring the nodes into candidate state.
5. Connect them to the site on which the site disaster recovery feature was run.

Other configurations

To recover access to the storage in other configurations, use "Recover system procedure".
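
The rule in the HyperSwap section for choosing the -primary argument can be written out as a small Python function. This is only a sketch of the decision as described in that section. The function and parameter names are hypothetical, and the sketch does not run any system command.

```python
def primary_argument(current_primary, keep_out_of_date_copy):
    """Choose the value to pass to -primary when resuming HyperSwap replication.

    current_primary: value of the primary field shown by lsrcrelationship or
        lsrcconsistgrp while the relationship is Idling ('master' or 'aux').
    keep_out_of_date_copy: True to continue with the copy that currently has
        access, False to revert to the up-to-date copy.
    """
    if current_primary not in ("master", "aux"):
        raise ValueError("primary field must be 'master' or 'aux'")
    if keep_out_of_date_copy:
        # Continue with the copy that was made accessible.
        return current_primary
    # Revert to the up-to-date copy by specifying the opposite value.
    return "aux" if current_primary == "master" else "master"


# Mirrors the example in the text: master shown, revert to the up-to-date copy.
print("startrcconsistgrp -primary", primary_argument("master", False))
```

The printed string matches the example in the HyperSwap section. A consistency group name or ID would still need to be supplied when the real command is run.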

Chapter 8. Recovery procedures

This topic describes these recovery procedures: recover system and back up and restore system configuration. This topic also contains information about performing the node rescue.

Recover system procedure

The recover system procedure recovers the entire system if the system state is lost from all nodes. The procedure re-creates the system by using saved configuration data. The saved configuration data is in the active quorum disk and the latest XML configuration backup file. The recovery might not be able to restore all volume data. This procedure is also known as Tier 3 (T3) recovery.

CAUTION: If the system encounters a state where:
- No nodes are active
Do not attempt to initiate a node rescue (which the user can initiate by using the SAN Volume Controller front panel, the service assistant GUI, or the satask rescuenode service CLI command). STOP and contact IBM Remote Technical Support. Initiating this T3 recover system procedure while in this specific state can result in loss of the XML configuration backup files.

Attention:
- Run service actions only when directed by the fix procedures. If used inappropriately, service actions can cause loss of access to data or even data loss. Before you attempt to recover a system, investigate the cause of the failure and attempt to resolve those issues by using other fix procedures. Read and understand all of the instructions before you complete any action.
- The recovery procedure can take several hours if the system uses large-capacity devices as quorum devices.

Do not attempt the recover system procedure unless the following conditions are met:
- All of the conditions are met in "When to run the recover system procedure" on page 244.
- All hardware errors are fixed. See "Fix hardware errors" on page 245.
- All nodes have candidate status. Otherwise, see step 1.
- All nodes must be at the same level of code that the system had before the failure. If any nodes were modified or replaced, use the service assistant to verify the levels of code, and where necessary, to reinstall the level of code so that it matches the level that is running on the other nodes in the system. For more information, see "Removing system information for nodes with error code 550 or error code 578 using the service assistant" on page 246.

The system recovery procedure is one of several tasks that must be completed. The following list is an overview of the tasks and the order in which they must be completed:
1. Preparing for system recovery
   a. Review the information about when to run the recover system procedure.

   b. Fix your hardware errors and make sure that all nodes in the system are shown in the service assistant or in the output from sainfo lsservicenodes.
   c. Remove the system information for nodes with error code 550 or error code 578 by using the service assistant, but only if the recommended user responses for these node errors are followed.
   d. For Virtual Volumes (VVols), shut down the services for any instances of Spectrum Control Base that are connecting to the system. Use the Spectrum Control Base command service ibm_spectrum_control stop.
   e. Remove hot spare nodes from the system and set them into candidate mode before starting the recovery process. Run the following CLI command to remove the node from the system.
      satask leavecluster -force spare-node-panel-name
      Once the node returns in service mode, run the following CLI command to set it into candidate mode.
      satask stopservice spare-node-panel-name
2. Running the system recovery. After you prepared the system for recovery and met all the pre-conditions, run the system recovery.
   Note: Run the procedure on one system in a fabric at a time. Do not run the procedure on different nodes in the same system. This restriction also applies to remote systems.
3. Completing actions to get your environment operational.
   - Recovering from offline volumes by using the CLI.
   - Checking your system, for example, to ensure that all mapped volumes can access the host.

When to run the recover system procedure

Attempt a recover procedure only after a complete and thorough investigation of the cause of the system failure. Attempt to resolve those issues by using other service procedures.

Attention: If you experience failures at any time while running the recover system procedure, call IBM remote technical support. Do not attempt to do further recovery actions, because these actions might prevent support from restoring the system to an operational status.

Certain conditions must be met before you run the recovery procedure. Use the following items to help you determine when to run the recovery procedure:
1. All enclosures and external storage systems are powered up and can communicate with each other.
2. Check that all nodes in the system are shown in the service assistant tool or by using the service command: sainfo lsservicenodes. Investigate any missing nodes.
3. Check that no node in the system is active and that the management IP is not accessible. If any node has active status, it is not necessary to recover the system.
4. Resolve all hardware errors in nodes so that only node errors 578 or 550 are present. If this is not the case, go to "Fix hardware errors".
5. Ensure that all backend storage that is administered by the system is present before you run the recover system procedure.

259 6. If ny nodes hve been replced, ensure tht the WWNN of the replcement node mtches tht of the replced node, nd tht no prior system dt remins on this node. Fix hrdwre errors Before running system recovery procedure, it is importnt to identify nd fix the root cuse of the hrdwre issues. Identifying nd fixing the root cuse cn help recover system, if these re the fults tht re cusing the system to fil. The following re common issues tht cn be esily resolved: v The node is powered off or the power cords were unplugged. v Check the node sttus of every node tht is member of the system. Resolve ll errors. All nodes must be reporting either node error 578, or no cluster nme is shown on the Cluster: disply. These error codes indicte tht the system lost its configurtion dt. If ny nodes report nything other thn these error codes, do not perform recovery. You cn encounter situtions where non-configurtion nodes report other node errors, such s node error 550. The 550 error cn lso indicte tht node is not ble to join system. Note: If ny of the buttons on the front pnel re pressed fter these two error codes re reported, the report for the node returns to the 578 node error. The chnge in the report hppens fter pproximtely 60 seconds. Also, if the node ws rebooted or if hrdwre service ctions were tken, the node might show no cluster nme on the Cluster: disply. If ny nodes show Node Error: 550, record the dt from the second line of the disply. If the lst chrcter on the second line of the disply is >, use the right button to scroll the disply to the right. - In ddition to the Node Error: 550, the second line of the disply cn show list of node front pnel IDs (7 digits) tht re seprted by spces. The list cn lso show the WWPN/LUN ID (16 hexdeciml digits followed by forwrd slsh nd deciml number). - If the error dt contins ny front pnel IDs, ensure tht the node referred to by tht front pnel ID is showing Node Error 578:. 
    If it is not reporting node error 578, ensure that the two nodes can communicate with each other. Verify the SAN connectivity and restart one of the two nodes by pressing the front panel power button twice.
  - If the error data contains a WWPN/LUN ID, verify the SAN connectivity between this node and that WWPN. Check the storage system to ensure that the LUN referred to is online. After verifying, restart the node by pressing the front panel power button twice.
  Note: If (after you resolve all these scenarios) half or more of the nodes are reporting Node Error: 578, it is appropriate to run the recovery procedure.
  For any nodes that are reporting node error 550, ensure that all the missing hardware that is identified by these errors is powered on and connected without faults.

If you are not able to restart the system, and if any node other than the current node is reporting node error 550 or 578, you must remove system data from those nodes. This action acknowledges the data loss and puts the nodes into the required candidate state.
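The decision rule in the note above (half or more of the nodes reporting node error 578, with any remaining nodes reporting 550) can be sketched as a small check. This is an illustrative helper only, not part of the product: the function name and the input mapping are hypothetical, and combining the two conditions into one test is an interpretation of the text above. The node error codes themselves would still be read manually from the service assistant or the front panel.

```python
def recovery_is_appropriate(node_errors):
    """Return True when the reported node errors match the rule of thumb above.

    node_errors is a hypothetical mapping of node name -> node error code
    (an int), collected by hand from each node in the system.
    """
    if not node_errors:
        return False
    codes = list(node_errors.values())
    # Any code other than 578 or 550 points at an unresolved hardware problem.
    if any(code not in (578, 550) for code in codes):
        return False
    # Half or more of the nodes must be reporting node error 578.
    return 2 * codes.count(578) >= len(codes)


if __name__ == "__main__":
    print(recovery_is_appropriate({"node1": 578, "node2": 550}))
```

A result of False means that the remaining node errors should be investigated first; it does not replace the checks described in this section.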

Removing system information for nodes with error code 550 or error code 578 using the service assistant

The system recovery procedure works only when all nodes in the system of nodes to be recovered are in candidate status. If there are any nodes that display error code 550 or error code 578, you must remove their system data.

About this task

Before performing this task, ensure that you have read the introductory information in the overall recover system procedure.

Having used the service assistant to identify the system status and specific error, you will continue to use the service assistant to complete this procedure. Selecting Change Node in the service assistant tool lists all of the Spectrum Virtualize nodes that have logged in to the node that is running the tool.

Follow these guidelines when performing the recovery procedure:
• The system column of the node table identifies any nodes that are not in the system of nodes that must be recovered. Do not remove the system data for these nodes.
• Do not remove system information from any node that has online status, unless directed to do so by remote technical support.
• Do not remove the system data from the first node until you ensure that the following conditions are met:
  - All nodes in the system of nodes are listed in the Change Node part of the service assistant and are in service status with error 550 or 578.
  - You have checked the extra node error data for each node to ensure that no other communication or hardware problem is causing the node error.

Procedure

1. In the Change Node part of the service assistant tool, select the radio button of the node with a status of service and error 550 or 578.
2. Select Manage System.
3. Click Remove System Data.
   Note: Spare nodes do not go into the 878/578 state that active nodes do. As such, the Manage System screen does not have the Remove System Data button for spare nodes. To remove system data on spare nodes, ssh onto any spare node and run the following commands.
   satask leavecluster -force
   satask stopservice
   Failure to remove the cluster state from the spare nodes results in the T3 failing, as the new cluster is unable to find the spare nodes as available candidates.
4. Confirm that you want to remove the system data when prompted.
5. Remove the system data for the other nodes that display a 550 or a 578 error. All nodes previously in this system must have a node status of Candidate and have no errors listed against them.

6. Resolve any hardware errors until the error condition for all nodes in the system is None.
7. Ensure that all nodes in the system of nodes to be recovered display a status of candidate.

Results

When all nodes display a status of candidate and all error conditions are None, you can run the system recovery procedure.

Running system recovery by using the service assistant

You can use the service assistant to start recovery when all nodes that were members of the system are online and are in candidate status. If any nodes display error code 550 or 578, remove system information to place them into candidate status. Do not run the recovery procedure on different nodes in the same system; this restriction includes remote systems.

Before you begin

Note: Ensure that the web browser is not blocking pop-up windows. If it does, progress windows cannot open.

Before you begin this procedure, read the recover system procedure introductory information; see "Recover system procedure".

About this task

Attention: This service action has serious implications if not completed properly. If at any time an error is encountered that is not covered by this procedure, stop and call the support center.

Run the recovery from any node in the system; the nodes must not participate in any other system.

If the system has USB encryption, run the recovery from any node in the system that has a USB flash drive inserted that contains the encryption key.

If the system contains an encrypted cloud account that uses USB encryption, a USB flash drive with the system master key must be present in the configuration node before the cloud account can move to the online state. This requirement is necessary when the system is powered down, and then restarted.

If the system has key server encryption, note the following items before you proceed with the T3 recovery.
• Run the recovery on a node that is attached to the key server. The keys are fetched remotely from the key server.
• Run the recovery procedure on a node that is not hardware that is replaced or a node that is rescued.
  All of the information that is required for a node to successfully fetch the key from the key server resides on the node's file system. If the contents of the node's original file system are damaged or no longer exist (rescue node, hardware replacement, file system that is corrupted, and so on), then the recovery fails from this node.

If the system uses both USB and key server encryption, providing either a USB flash drive or a connection to the key server unlocks the system (only one is needed, but both also work).

If you use USB flash drives to manage encryption keys, the T3 recovery causes the connection to a cloud service provider to go offline if the USB flash drive is not inserted into the system. To fix this issue, insert the USB flash drive with the current keys into the system.

If you use key servers to manage encryption keys, the T3 recovery causes the connection to a cloud service provider to go offline if the key server is offline. To fix this issue, ensure that the key server is online and available during T3 recovery.

If you use both key servers and USB flash drives to manage encryption keys, the T3 recovery causes the connection to a cloud service provider to go offline if none of the key providers are available. To fix this issue, ensure that either the key server is online or a USB flash drive is inserted into the system during T3 recovery (only one is needed, but both also work).

Note: Each individual stage of the recovery procedure can take significant time to complete, depending on the specific configuration.

Procedure

1. Point your browser to the service IP address of one of the nodes. If you do not know the IP address or if it was not configured, configure the service address in the following way:
   • Use the technician port to connect to the service assistant and configure a service address on the node.
2. Log on to the service assistant.
3. Select Recover System from the navigation.
4. Follow the online instructions to complete the recovery procedure.
   a. Click Prepare for Recovery. The system searches for the most recent backup file and scans the quorum disk. If this step is successful, Preparation Status: Prepare complete is displayed on the bottom of the page.
   b. Verify the date and time of the last quorum time. The time stamp must be less than 30 minutes before the failure.
      The time stamp format is YYYYMMDD hh:mm, where YYYY is the year, MM is the month, DD is the day, hh is the hour, and mm is the minute.
      Attention: If the time stamp is not less than 30 minutes before the failure, call the support center.
   c. Verify the date and time of the last backup date. The time stamp must be less than 24 hours before the failure. The time stamp format is YYYYMMDD hh:mm, where YYYY is the year, MM is the month, DD is the day, hh is the hour, and mm is the minute.
      Attention: If the time stamp is not less than 24 hours before the failure, call the support center. Changes that are made after the time of this backup date might not be restored.
   d. If the quorum time and backup date are correct, click Recover to re-create the system.
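The two time stamp checks in steps b and c can be sketched as follows. This is a minimal illustration, not a product tool: the function names are hypothetical, the time stamps are assumed to be typed in by hand in the YYYYMMDD hh:mm format shown by the service assistant, and a time stamp later than the failure time is treated as failing the check (an assumption; the procedure does not address that case).

```python
from datetime import datetime, timedelta

# The "YYYYMMDD hh:mm" format described in steps b and c.
TIMESTAMP_FORMAT = "%Y%m%d %H:%M"


def within_window(stamp, failure, limit):
    """True if stamp falls less than `limit` before the failure time."""
    age = datetime.strptime(failure, TIMESTAMP_FORMAT) - datetime.strptime(
        stamp, TIMESTAMP_FORMAT
    )
    # Assumption: a stamp that is later than the failure does not pass.
    return timedelta(0) <= age < limit


def check_recovery_timestamps(quorum_time, backup_date, failure_time):
    """Apply the 30-minute quorum rule and the 24-hour backup rule."""
    return {
        "quorum_ok": within_window(quorum_time, failure_time, timedelta(minutes=30)),
        "backup_ok": within_window(backup_date, failure_time, timedelta(hours=24)),
    }


if __name__ == "__main__":
    result = check_recovery_timestamps(
        "20240115 09:45", "20240114 01:00", "20240115 10:00"
    )
    print(result)
```

If either value is False, the procedure above says to call the support center rather than click Recover.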

Results

Any one of the following categories of messages might be displayed:
• T3 successful
  The volumes are back online. Use the final checks to get your environment operational again.
• T3 recovery completed with errors
  One or more of the volumes are offline because fast write data was in the cache. To bring the volumes online, see "Recovering from offline volumes using the CLI" for details.
• T3 failed
  Call the support center. Do not attempt any further action.

Verify that the environment is operational by completing the checks that are provided in "What to check after running the system recovery".

If any errors are logged in the error log after the system recovery procedure completes, use the fix procedures to resolve these errors, especially the errors that are related to offline arrays.

If the recovery completes with offline volumes, go to "Recovering from offline volumes using the CLI".

Recovering from offline volumes using the CLI

If a Tier 3 recovery procedure completes with offline volumes, then it is likely that the data that was in the write cache of the node canisters was lost during the failure that caused all of the node canisters to lose the block storage system cluster state. You can use the command-line interface (CLI) to acknowledge that data was lost from the write cache and bring the volume back online to attempt to deal with the data loss.

About this task

If you run the recovery procedure but there are offline volumes, you can complete the following steps to bring the volumes back online. Some volumes might be offline because of write-cache data loss or metadata loss during the event that led all node canisters to lose cluster state. Any data that is lost from the write cache cannot be recovered. These volumes might need extra recovery steps after the volume is brought back online.
Note: If you encounter errors in the event log after you run the recovery procedure that are related to offline arrays, use the fix procedures to resolve the offline array errors before you fix the offline volume errors.

Example

Complete the following steps to recover an offline volume after the recovery procedure is completed:
1. Delete all IBM FlashCopy function mappings and Metro Mirror or Global Mirror relationships that use the offline volumes.

2. If the volume is a thin-provisioned volume, run the repairsevdiskcopy vdisk_name | vdisk_id command. This command brings the volume back online so that you can attempt to deal with the data loss.
   Note: If running the repairsevdiskcopy command does not start the repair operation, then use the recovervdisk command.
3. If the volume is not a thin-provisioned (SE) volume, run the recovervdiskbysystem command. This brings the volume back online so that you can attempt to deal with the data loss.
4. Refer to "What to check after running the system recovery" for what to do with volumes that are corrupted by the loss of data from the write cache.
5. Re-create all FlashCopy mappings and Metro Mirror or Global Mirror relationships that use the volumes.

What to check after running the system recovery

Several tasks must be completed before you use the system.

The recovery procedure re-creates the old system from the quorum data. However, some things cannot be restored, such as cached data or system data managing in-flight I/O. This latter loss of state affects RAID arrays that manage internal storage. The detailed map about where data is out of synchronization has been lost, meaning that all parity information must be restored, and mirrored pairs must be brought back into synchronization. Normally this action results in either old or stale data being used, so only writes in flight are affected. However, if the array lost redundancy (such as syncing, degraded, or critical RAID status) before the error that requires system recovery, then the situation is more severe. In this situation you need to check the internal storage:
• Parity arrays are likely syncing to restore parity; they do not have redundancy while this operation proceeds.
• Because there is no redundancy in this process, bad blocks might be created where data is not accessible.
• Parity arrays might be marked as corrupted. This indicates that the extent of lost data is wider than in-flight I/O; to bring the array online, the data loss must be acknowledged.
• RAID6 arrays that were degraded before the system recovery might require a full restore from backup.
  For this reason, it is important to have at least a capacity match spare available.

Be aware of these differences about the recovered configuration:
• FlashCopy mappings are restored as idle_or_copied with 0% progress. Both volumes must be restored to their original I/O groups.
• The management ID is different. Any scripts or associated programs that refer to the system-management ID of the clustered system (system) must be changed.
• Any FlashCopy mappings that were not in the idle_or_copied state with 100% progress at the point of disaster have inconsistent data on their target disks. These mappings must be restarted.
• Intersystem partnerships and relationships are not restored and must be re-created manually.
• Consistency groups are not restored and must be re-created manually.
• Intrasystem Metro Mirror relationships are restored if all dependencies were successfully restored to their original I/O groups.

• Volumes with cloud snapshots that were enabled before the recovery need to have the cloud snapshots manually reenabled.
• If hardware was replaced before the recovery, the SSL certificate might not be restored. If it is not restored, then a new self-signed certificate is generated with a validity of 30 days. Follow the associated Directed Maintenance Procedures (DMP) for a permanent resolution.
• The system time zone might not be restored.
• Any Global Mirror secondary volumes on the recovered system might have inconsistent data if there was replication I/O from the primary volume that was cached on the secondary system at the point of the disaster. A full synchronization is required when re-creating and restarting these relationships.
• Immediately after the T3 recovery process runs, compressed disks do not know the correct value of their used capacity. The disks initially set the capacity as the entire real capacity. When I/O resumes, the capacity is shrunk down to the correct value.
  Similar behavior occurs when you use the -autoexpand option on volumes. The real capacity of a disk might increase slightly, caused by the same kind of behavior that affects compressed volumes. Again, the capacity shrinks down as I/O to the disk is resumed.

Before you use the volumes, complete the following tasks:
• Start the host systems.
• Manual actions might be necessary on the hosts to trigger them to rescan for devices. You can complete this task by disconnecting and reconnecting the Fibre Channel cables to each host bus adapter (HBA) port.
• Verify that all mapped volumes can be accessed by the hosts.
• Run file system consistency checks.
  Note: Any data that was in the system write cache at the time of the failure is lost.
• Run the application consistency checks.

For Virtual Volumes (VVols), complete the following tasks.
• After you confirm that the T3 completed successfully, restart Spectrum Control Base (SCB) services. Use the Spectrum Control Base command service ibm_spectrum_control start.
• Refresh the storage system information on the SCB GUI to ensure that the systems are in sync after the recovery.
  To complete this task, log in to the SCB GUI. Hover over the affected storage system, select the menu launcher, and then select Refresh. This step repopulates the system. Repeat this step for all Spectrum Control Base instances.
• Rescan the storage providers from within the vSphere Web Client.
  Select vCSA > Manage > Storage Providers > select Active VP > Re-scan icon.

For Virtual Volumes (VVols), also be aware of the following information.

FlashCopy mappings are not restored for VVols. The implications are as follows.
• The mappings that describe the VM's snapshot relationships are lost. However, the Virtual Volumes that are associated with these snapshots still exist, and the snapshots might still appear on the vSphere Web Client. This outcome might have implications on your VMware backup solution.
  Do not attempt to revert to snapshots.
  Use the vSphere Web Client to delete any snapshots for VMs on a VVol data store to free up disk space that is being used unnecessarily.
• The targets of any outstanding 'clone' FlashCopy relationships might not function as expected (even if the vSphere Web Client recently reported clone operations as complete). For any VMs that are targets of recent clone operations, complete the following tasks.
  Perform data integrity checks as is recommended for conventional volumes.
  If clones do not function as expected or show signs of corrupted data, take a fresh clone of the source VM to ensure that data integrity is maintained.

Backing up and restoring the system configuration

You can back up and restore the configuration data for the system after preliminary tasks are completed.

Configuration data for the system provides information about your system and the objects that are defined in it. The backup and restore functions of the svcconfig command can back up and restore only your configuration data for the system. You must regularly back up your application data by using the appropriate backup methods.

You can maintain your configuration data for the system by completing the following tasks:
• Backing up the configuration data
• Restoring the configuration data
• Deleting unwanted backup configuration data files

Before you back up your configuration data, the following prerequisites must be met:

Note:
• The default object names for controllers, I/O groups, and managed disks (MDisks) do not restore correctly if the ID of the object is different from what is recorded in the current configuration data file.
• All other objects with default names are renamed during the restore process. The new names appear in the format name_r, where name is the name of the object in your system.
• Connections to iSCSI MDisks for migration purposes are not restored.

Before you restore your configuration data, the following prerequisites must be met:
• The Security Administrator role is associated with your user name and password.
• You have a copy of your backup configuration files on a server that is accessible to the system.
• You have a backup copy of your application data that is ready to load on your system after the restore configuration operation is complete.
• You know the current license settings for your system.

• You did not remove any hardware since the last backup of your system configuration. If you had to replace a faulty node, the new node must use the same worldwide node name (WWNN) as the faulty node that it replaced.
  Note: You can add new hardware, but you must not remove any hardware because the removal can cause the restore process to fail.
• No zoning changes were made on the Fibre Channel fabric that would prevent communication between the system and any storage controllers that are present in the configuration.
• You have at least 3 USB flash drives if encryption was enabled on the system when its configuration was backed up. The USB flash drives are used for generation of new keys as part of the restore process or for manually restoring encryption if the system has fewer than 3 USB ports.

Use the following steps to determine how to achieve an ideal T4 recovery:
• Open the appropriate svc.config.backup.xml (or svc.config.cron.xml) file with a suitable text editor or browser and navigate to the node section of the file.
• For each node entry, make a note of the value of the following properties: IO_group_id and panel_name.
• Use the CLI sainfo lsservicenodes command and the data to determine which nodes previously belonged in each I/O group.

Restoring the system configuration must be performed by one of the nodes previously in I/O group zero. For example, property name="IO_group_id" value="0". The remaining nodes must be added, as required, in the appropriate order based on the previous IO_group_id of its nodes.

The system analyzes the backup configuration data file and the system to verify that the required disk controller system nodes are available.

Before you begin, hardware recovery must be complete. The following hardware must be operational: hosts, system nodes, and expansion enclosures (if applicable), the Ethernet network, the SAN fabric, and any external storage systems (if applicable).

Backing up the system configuration using the CLI

You can back up your configuration data by using the command-line interface (CLI).
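As a supplement to the T4 planning steps listed earlier (noting IO_group_id and panel_name for each node entry in the backup XML file), the lookup can be sketched with a short script. This is an illustration under stated assumptions: the only detail taken from this guide is the property name="..." value="..." form of the entries; the function name is hypothetical, and the script assumes that each node entry is an XML element whose direct children are property elements. Check the layout of your own backup file before relying on it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def nodes_by_io_group(xml_text):
    """Group panel names by the I/O group ID recorded in the backup XML.

    Assumption: every node entry is an element with direct <property>
    children that carry name="..." and value="..." attributes.
    """
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for element in root.iter():
        props = {p.get("name"): p.get("value") for p in element.findall("property")}
        # Only entries that record both properties are treated as nodes.
        if "IO_group_id" in props and "panel_name" in props:
            groups[props["IO_group_id"]].append(props["panel_name"])
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical local copy of the backup file.
    with open("svc.config.backup.xml", encoding="utf-8") as handle:
        for io_group, panels in sorted(nodes_by_io_group(handle.read()).items()):
            print(io_group, panels)
```

The panel names listed for I/O group 0 identify the nodes from which the restore can be started; compare them with the output of sainfo lsservicenodes as described in the steps.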

Before you begin

Before you back up your configuration data, the following prerequisites must be met:
• No independent operations that change the configuration can be running while the backup command is running.
• No object name can begin with an underscore character (_).

About this task

The backup feature of the svcconfig CLI command is designed to back up information about your system configuration, such as volumes, local Metro Mirror information, local Global Mirror information, storage pools, and nodes. All other data that you wrote to the volumes is not backed up. Any application that uses the volumes on the system as storage must use the appropriate backup methods to back up its application data.

You must regularly back up your configuration data and your application data to avoid data loss, such as after any significant changes to the system configuration.

Note: The system automatically creates a backup of the configuration data each day at 1 AM. This backup is known as a cron backup and is written to /dumps/svc.config.cron.xml_serial# on the configuration node.

Use these instructions to generate a manual backup at any time. If a severe failure occurs, both the configuration of the system and application data might be lost. The backup of the configuration data can be used to restore the system configuration to the exact state it was in before the failure. In some cases, it might be possible to automatically recover the application data. This recovery can be attempted with the Recover System Procedure, also known as a Tier 3 (T3) procedure. To restore the system configuration without attempting to recover the application data, use the Restoring the System Configuration procedure, also known as a Tier 4 (T4) recovery. Both of these procedures require a recent backup of the configuration data.

Complete the following steps to back up your configuration data:

Procedure

1. Use your preferred backup method to back up all of the application data that you stored on your volumes.
2. Issue the following CLI command to back up your configuration:
   svcconfig backup
   The following output is an example of the messages that might be displayed during the backup process:
   CMMVC6112W io_grp io_grp1 has a default name
   CMMVC6112W io_grp io_grp2 has a default name
   CMMVC6112W mdisk mdisk14...
   CMMVC6112W node node1...
   CMMVC6112W node node
   The svcconfig backup CLI command creates three files that provide information about the backup process and the configuration. These files are created in the /dumps directory of the configuration node canister.
   Table 74 describes the three files that are created by the backup process:

   Table 74. Files created by the backup process

   File name                          Description
   svc.config.backup.xml_<serial#>    Contains your configuration data.
   svc.config.backup.sh_<serial#>     Contains the names of the commands that were issued to create the backup of the system.
   svc.config.backup.log_<serial#>    Contains details about the backup, including any reported errors or warnings.

3. Check that the svcconfig backup command completes successfully, and examine the command output for any warnings or errors. The following output is an example of the message that is displayed when the backup process is successful:
   CMMVC6155I SVCCONFIG processing completed successfully
   If the process fails, resolve the errors, and run the command again.
4. Keep backup copies of the files outside the system to protect them against a system hardware failure. Copy the backup files off the system to a secure location; use either the management GUI or the scp command line. For example:
   pscp -unsafe superuser@cluster_ip:/dumps/svc.config.backup.* /offclusterstorage/
   The cluster_ip is the IP address or DNS name of the system and offclusterstorage is the location where you want to store the backup files.
   Tip: To maintain controlled access to your configuration data, copy the backup files to a location that is password-protected.

Restoring the system configuration

Use this procedure to restore the system configuration only if the recover system procedure fails or if the data that is stored on the volumes is not required. This procedure is also known as Tier 4 (T4) recovery. For directions on the recover procedure, see "Recover system procedure".

Before you begin

This configuration restore procedure is designed to restore information about your configuration, such as volumes, local Metro Mirror information, local Global Mirror information, storage pools, and nodes. The data that you wrote to the volumes is not restored. To restore the data on the volumes, you must separately restore application data from any application that uses the volumes on the clustered system as storage. Therefore, you must have a backup of this data before you follow the configuration recovery process.

If USB encryption was enabled on the system when its configuration was backed up, then at least 3 USB flash drives need to be present in the node USB ports for the configuration restore to work.
The 3 USB flash drives must be inserted into the single node from which the configuration restore commands are run. Any USB flash drives in other nodes (that might become part of the system) are ignored. If you are not recovering a cloud backup configuration, the USB flash drives do not need to contain any keys. They are for generation of new keys as part of the restore process. If you are recovering a cloud backup configuration, the USB flash drives must contain the previous set of keys to allow the current encrypted data to be unlocked and reencrypted with the new keys.

During T4 recovery, a new system is created with a new certificate. If the system has key server encryption, the new certificate must be exported by using the chsystemcert -export command, and then installed on all key servers in the correct device group before you run the T4 recovery. The device group that is used is the one in which the previous system was defined. It might also be necessary to get the new system's certificate signed. In a T4 recovery, inform the key server administrator that the active keys are considered compromised.

About this task

You must regularly back up your configuration data and your application data to avoid data loss. If a system is lost after a severe failure occurs, both the configuration for the system and the application data are lost. You must restore the system to the exact state it was in before the failure, and then recover the application data.

During the restore process, the nodes and the storage enclosure are restored to the system, and then the MDisks and the array are re-created and configured. If multiple storage enclosures are involved, the arrays and MDisks are restored on the proper enclosures based on the enclosure IDs.

Important:
• There are two phases during the restore process: prepare and execute. You must not change the fabric or system between these two phases.
• For a system that contains nodes with more than four Fibre Channel ports, the system localfcportmask and partnerfcportmask settings are manually reapplied before you restore your data. See step 8.
• For a system with nodes that are connected to expansion enclosures, all nodes must be added into the system before you restore your data. See step 9.
• For systems that contain nodes that are attached to external controllers virtualized by iSCSI, all nodes must be added into the system before you restore your data. Additionally, the system cfgportip settings and iSCSI storage ports must be manually reapplied before you restore your data. See step 10.
• For VMware vSphere Virtual Volumes (sometimes referred to as VVols) environments, after a T4 restoration, some of the Virtual Volumes configuration steps are already completed: metadatavdisk created, usergroup and user created, adminlun hosts created. However, the user must then complete the last two configuration steps manually (creating a storage container on IBM Spectrum Control Base Edition and creating virtual machines on VMware vCenter).
• If the system has USB encryption, run the recovery from any node in the system that has a USB flash drive inserted that contains the encryption key.
• If the system has key server encryption, run the recovery on a node that is attached to the key server. The keys are fetched remotely from the key server.
• If the system uses both USB and key server encryption, providing either a USB flash drive or a connection to the key server unlocks the system (only one is needed, but both also work).
• For systems with a cloud backup configuration, during a T4 recovery the USB key that contained the system master key from the original system must be inserted into the configuration node of the new system. Alternatively, if a key server is used, the key server must contain the system master key from the original system. If the original system master key is not available, and the system data is encrypted in the cloud provider, then the data in the cloud is not accessible.
• If the system contains an encrypted cloud account that is configured with both USB and key server encryption, the master keys from both need to be available at the time of a T4 recovery.
• If you use USB flash drives to manage encryption keys, the T4 recovery causes the connection to a cloud service provider to go offline if the USB flash drive is not inserted into the system. To fix this issue, insert the USB flash drive with the current keys into the system.

• If you use key servers to manage encryption keys, the T4 recovery causes the connection to a cloud service provider to go offline if the key server is offline. To fix this issue, ensure that the key server is online and available during T4 recovery.
• If you use both key servers and USB flash drives to manage encryption keys, the T4 recovery causes the connection to a cloud service provider to go offline if the key server is offline. To fix this issue, ensure that both the key server is online and a USB flash drive is inserted into the system during T4 recovery.
• If the system contains an encrypted cloud account that uses USB encryption, a USB flash drive with the system master key must be present in the configuration node before the cloud account can move to the online state. This requirement is necessary when the system is powered down, and then restarted.
• After a T4 recovery, cloud accounts are in an offline state. It is necessary to re-enter the authentication information to bring the accounts back online.
• After a T4 recovery, volumes with cloud snapshots that were enabled before the recovery need to have the cloud snapshots manually reenabled.

If you do not understand the instructions to run the CLI commands, see the command-line interface reference information.

To restore your configuration data, follow these steps:

Procedure

1. Verify that all nodes are available as candidate nodes before you run this recovery procedure. You must remove errors 550 or 578 to put the node in candidate state.
2. Create a system. If possible, use the node that was originally in I/O group 0.
   • For SAN Volume Controller 2145-DH8 and SAN Volume Controller 2145-SV1 systems, use the technician port.
3. In a supported browser, enter the IP address that you used to initialize the system and the default superuser password (passw0rd).
4. Issue the following CLI command to ensure that only the configuration node is online:
   svcinfo lsnode
   The following output is an example of what is displayed:
   id name  status IO_group_id IO_group_name config_node
   1  node1 online 0           io_grp0       yes
5. Using the command-line interface, issue the following command to log on to the system:
   plink -i ssh_private_key_file superuser@cluster_ip
   Where ssh_private_key_file is the name of the SSH private key file for the superuser and cluster_ip is the IP address or DNS name of the system for which you want to restore the configuration.
   Note: Because the RSA host key changed, a warning message might display when you connect to the system by using SSH.
6. Identify the configuration backup file from which you want to restore.
   The file can be either a local copy of the configuration backup XML file that you saved when you backed up the configuration or an up-to-date file on one of the nodes.
   Configuration data is automatically backed up daily at 01:00 system time on the configuration node.

   Download and check the configuration backup files on all nodes that were previously in the system to identify the one containing the most recent complete backup.
   a. From the management GUI, click Settings > Support > Support Package.
   b. Expand Manual Upload Instructions and select Download Support Package.
   c. On the Download New Support Package or Log File page, select Download Existing Package.
   d. For each node (canister) in the system, complete the following steps:
      1) Select the node to operate on from the selection box at the top of the table.
      2) Find all the files with names that match the pattern svc.config.*.xml*.
      3) Select the files and click Download to download them to your computer.
   e. If a recent configuration file is not present on this node, configure service IP addresses for other nodes and connect to the service assistant to look for configuration files on other nodes. For more information, see the Service IPv4 or Service IPv6 options topic.
   The XML files contain a date and time that can be used to identify the most recent backup. After you identify the backup XML file that is to be used when you restore the system, rename the file to svc.config.backup.xml.
7. Copy onto the system the XML backup file from which you want to restore.
      pscp full_path_to_identified_svc.config.file superuser@cluster_ip:/tmp/svc.config.backup.xml
8. If the system contains any nodes with a 10 GB interface adapter or a second Fibre Channel interface adapter that is installed and non-default localfcportmask and partnerfcportmask settings were previously configured, then manually reconfigure these settings before you restore your data.
9. If the system uses a stretched or HyperSwap topology with nodes that are at two sites, or if the system contains any nodes with internal flash drives (including nodes that are connected to expansion enclosures), these nodes must be added to the system now. To add these nodes, determine the panel name, node name, and I/O groups of any such nodes from the configuration backup file.
   To add the nodes to the system, run the following command:
      svctask addnode -panelname panel_name -iogrp iogrp_name_or_id -name node_name
   Where panel_name is the name that is displayed on the panel, iogrp_name_or_id is the name or ID of the I/O group to which you want to add this node, and node_name is the name of the node.
10. If the system contains any iSCSI storage controllers, these controllers must be detected manually now. The nodes that are connected to these controllers, the iSCSI port IP addresses, and the iSCSI storage ports must be added to the system before you restore your data.
   a. To add these nodes, determine the panel name, node name, and I/O groups of any such nodes from the configuration backup file. To add the nodes to the system, run the following command:
         svctask addnode -panelname panel_name -iogrp iogrp_name_or_id -name node_name
      Where panel_name is the name that is displayed on the panel, iogrp_name_or_id is the name or ID of the I/O group to which you want to add this node, and node_name is the name of the node.

   b. To restore iSCSI port IP addresses, use the cfgportip command.
      1) To restore an IPv4 address, determine id (port_id), node_id, node_name, IP_address, mask, gateway, host (0/1 stands for no/yes), remote_copy (0/1 stands for no/yes), and storage (0/1 stands for no/yes) from the configuration backup file, and then run the following command:
            svctask cfgportip -node node_name_or_id -ip ipv4_address -gw ipv4_gw -host yes|no -remotecopy remote_copy_port_group_id -storage yes|no port_id
         Where node_name_or_id is the name or ID of the node, ipv4_address is the IPv4 address of the port, and ipv4_gw is the IPv4 gateway address for the port.
      2) To restore an IPv6 address, determine id (port_id), node_id, node_name, IP_address_6, mask, gateway_6, prefix_6, host_6 (0/1 stands for no/yes), remote_copy_6 (0/1 stands for no/yes), and storage_6 (0/1 stands for no/yes) from the configuration backup file, and then run the following command:
            svctask cfgportip -node node_name_or_id -ip_6 ipv6_address -gw_6 ipv6_gw -prefix_6 prefix -host_6 yes|no -remotecopy_6 remote_copy_port_group_id -storage_6 yes|no port_id
         Where node_name_or_id is the name or ID of the node, ipv6_address is the IPv6 address of the port, ipv6_gw is the IPv6 gateway address for the port, and prefix is the IPv6 prefix.
      Complete steps b.i and b.ii for all (earlier configured) IP ports in the node_ethernet_portip_ip sections from the backup configuration file.
   c. Next, detect and add the iSCSI storage port candidates by using the detectiscsistorageportcandidate and addiscsistorageport commands. Make sure that you detect the iSCSI storage ports and add these ports in the same order as you see them in the configuration backup file. If you do not follow the correct order, it might result in a T4 failure. Step c.i must be followed by steps c.ii and c.iii. You must repeat these steps for all the iSCSI sessions that are listed in the backup configuration file, in exactly the same order.
      1) To detect iSCSI storage ports, determine src_port_id, IO_group_id (optional, not required if the value is 255), target_ipv4/target_ipv6 (the target IP that is not blank is required), iscsi_user_name (not required if blank), iscsi_chap_secret (not required if blank), and site (not required if blank) from the configuration backup file, and then run the following command:
            svctask detectiscsistorageportcandidate -srcportid src_port_id -iogrp IO_group_id -targetip/targetip6 target_ipv4/target_ipv6 -username iscsi_user_name -chapsecret iscsi_chap_secret -site site_id_or_n
         Where src_port_id is the source Ethernet port ID of the configured port, IO_group_id is the I/O group ID or name being detected, target_ipv4/target_ipv6 is the IPv4 or IPv6 address of the target iSCSI controller, iscsi_user_name is the target controller user name being detected, iscsi_chap_secret is the target controller CHAP secret being detected, and site_id_or_name is the specified ID or name of the site being detected.
      2) Match the discovered target_iscsiname with the target_iscsiname for this particular session in the backup configuration file by running the lsiscsistorageportcandidate command, and use the matching index to add iSCSI storage ports in step c.iii. Run the svcinfo lsiscsistorageportcandidate command and determine the id field of the row whose target_iscsiname matches the target_iscsiname from the configuration backup file. This is your candidate_id to be used in step c.iii.
      3) To add the iSCSI storage port, determine IO_group_id (optional, not required if the value is 255), site (not required if blank), iscsi_user_name (not required if blank in backup file), and iscsi_chap_secret (not required if blank) from the configuration backup file, provide the target_iscsiname_index matched in step c.ii, and then run the following command:
            addiscsistorageport -iogrp iogrp_id -username iscsi_user_name -chapsecret iscsi_chap_secr
         Where iogrp_id is the I/O group ID or name that is added, iscsi_user_name is the target controller user name that is being added, iscsi_chap_secret is the target controller CHAP secret being added, and site_id_or_name specifies the ID or name of the site that is being added.
      4) If the configuration is a HyperSwap or stretched system, the controller name and site need to be restored. To restore the controller name and site, determine controller_name and controller site_id/name from the backup XML file by matching the inter_wwpn field with the newly added iSCSI controller, and then run the following command:
            chcontroller -name controller_name -site site_id/name controller_id/name
         Where controller_name is the name of the controller from the backup XML file, site_id/name is the ID or name of the site of the iSCSI controller from the backup XML file, and controller_id/name is the ID or current name of the controller.
11. Issue the following CLI command to compare the current configuration with the backup configuration data file:
      svcconfig restore -prepare
   This CLI command creates a log file in the /tmp directory of the configuration node. The name of the log file is svc.config.restore.prepare.log.
   Note: It can take up to a minute for each 256-MDisk batch to be discovered. If you receive error message CMMVC6200W for an MDisk after you enter this command, all the managed disks (MDisks) might not be discovered yet. Allow a suitable time to elapse and try the svcconfig restore -prepare command again.
12. Issue the following command to copy the log file to another server that is accessible to the system:
      pscp superuser@cluster_ip:/tmp/svc.config.restore.prepare.log full_path_for_where_to_copy_log_files
13. Open the log file from the server where the copy is now stored.
14. Check the log file for errors.
   - If you find errors, correct the condition that caused the errors and reissue the command. You must correct all errors before you can proceed to step 15.
   - If you need assistance, contact the support center.
15. Issue the following CLI command to restore the configuration:
      svcconfig restore -execute
   This CLI command creates a log file in the /tmp directory of the configuration node. The name of the log file is svc.config.restore.execute.log.
16. Issue the following command to copy the log file to another server that is accessible to the system:
      pscp superuser@cluster_ip:/tmp/svc.config.restore.execute.log full_path_for_where_to_copy_log_files
17. Open the log file from the server where the copy is now stored.
18. Check the log file to ensure that no errors or warnings occurred.
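The log checks in steps 14 and 18 can be partly automated once the log file is copied to another server. The following Python sketch is a minimal illustration, not an IBM tool. It assumes that log entries carry message IDs shaped like the CMMVC6200W example in step 11, and that the trailing letter marks severity (E for error, W for warning, I for information). The guide does not describe the log layout, so treat both points as assumptions and still read the log by eye.

```python
import re

# Assumption: message IDs look like CMMVC<digits><letter>, as in CMMVC6200W,
# and the final letter encodes severity. Neither is confirmed by this guide.
MESSAGE_ID = re.compile(r"\bCMMVC(\d+)([EWI])\b")
SEVERITY = {"E": "error", "W": "warning", "I": "info"}


def scan_restore_log(text):
    """Return (line_number, message_id, severity) for each message ID found."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in MESSAGE_ID.finditer(line):
            findings.append((lineno, match.group(0), SEVERITY[match.group(2)]))
    return findings


def has_errors(findings):
    """Step 14: every error must be corrected before the restore is executed."""
    return any(severity == "error" for _, _, severity in findings)


def is_clean(findings):
    """Step 18: the execute log should hold neither errors nor warnings."""
    return not any(severity in ("error", "warning") for _, _, severity in findings)
```

A typical use is to read the copied svc.config.restore.prepare.log into a string, pass it to scan_restore_log, and stop if has_errors returns True. An empty result only means that no message ID matched the assumed pattern.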

Note: You might receive a warning that states that a licensed feature is not enabled. This message means that after the recovery process, the current license settings do not match the previous license settings. The recovery process continues normally and you can enter the correct license settings in the management GUI later.

When you log in to the CLI again over SSH, you see this output:
      IBM_2145:your_cluster_name:superuser>

What to do next
You can remove any unwanted configuration backup and restore files from the /tmp directory on your configuration node by issuing the following CLI command:
      svcconfig clear -all

Deleting backup configuration files using the CLI
You can use the command-line interface (CLI) to delete backup configuration files.

About this task
Complete the following steps to delete backup configuration files:

Procedure
1. Issue the following command to log on to the system:
      plink -i ssh_private_key_file superuser@cluster_ip
   Where ssh_private_key_file is the name of the SSH private key file for the superuser and cluster_ip is the IP address or DNS name of the clustered system from which you want to delete the configuration.
2. Issue the following CLI command to erase all of the files that are stored in the /tmp directory:
      svcconfig clear -all

Completing the node rescue when the node boots
On SAN Volume Controller 2145-CG8 or 2145-CF8, you might have to replace the hard disk drive. Or, if the software on the hard disk drive is corrupted, you can use the node rescue procedure to reinstall the software across the Fibre Channel fabric from its partner node in the same I/O group.

Before you begin
Similarly, if you replace the service controller, use the node rescue procedure to ensure that the service controller has the correct software.

About this task
Attention: If you recently replaced both the service controller and the disk drive as part of the same repair operation, node rescue fails.
Node rescue works by booting the operating system from the service controller and running a program that copies all the SAN Volume Controller software from any other node that can be found on the Fibre Channel fabric.

Attention: When you run node rescue operations, run only one node rescue operation on the same SAN at any one time. Wait for one node rescue operation to complete before you start another.

Perform the following steps to complete the node rescue:

Procedure
1. Ensure that the Fibre Channel cables are connected.
2. Ensure that at least one other node is connected to the Fibre Channel fabric.
3. Ensure that the SAN zoning allows a connection between at least one port of this node and one port of another node. It is better if multiple ports can connect, which is important if the zoning is by worldwide port name (WWPN) and you are using a new service controller. In this case, you might need to use SAN monitoring tools to determine the WWPNs of the node. If you need to change the zoning, remember to set it back when the service procedure is complete.
4. Turn off the node.
5. Press and hold the left and right buttons on the front panel.
6. Press the power button.
7. Continue to hold the left and right buttons until the node-rescue-request symbol is displayed on the front panel (Figure 35).

Figure 35. Node rescue display

Results
The node rescue request symbol displays on the front panel display until the node starts to boot from the service controller. If the node rescue request symbol displays for more than 2 minutes, go to the hardware boot MAP to resolve the problem. When the node rescue starts, the service display shows the progress or failure of the node rescue operation.

Note: If the recovered node was part of a clustered system, the node is now offline. Delete the offline node from the system and then add the node back into the system. If node recovery was used to recover a node that failed during a software update process, it is not possible to add the node back into the system until the code update process completes. This process can take up to 4 hours for an eight-node clustered system.

Chapter 9. Understanding the medium errors and bad blocks

A storage system returns a medium error response to a host when it is unable to successfully read a block. The system response to a host read follows this behavior.

The volume virtualization that is provided extends the time when a medium error is returned to a host. Because of this difference to non-virtualized systems, the system uses the term bad blocks rather than medium errors.

The system allocates volumes from the extents that are on the managed disks (MDisks). The MDisk can be a volume on an external storage controller or a RAID array that is created from internal drives. In either case, depending on the RAID level that is used, there is normally protection against a read error on a single drive. However, it is still possible to get a medium error on a read request if multiple drives have errors or if the drives are rebuilding or are offline due to other issues.

The system provides migration facilities to move a volume from one underlying set of physical storage to another or to replicate a volume that uses Metro Mirror or Global Mirror. In all these cases, the migrated volume or the replicated volume returns a medium error to the host when the logical block address on the original volume is read. The system maintains tables of bad blocks to record where the logical block addresses that cannot be read are. These tables are associated with the MDisks that are providing storage for the volumes.

The dumpmdiskbadblocks command and the dumpallmdiskbadblocks command are available to query the location of bad blocks.

Important: The dumpmdiskbadblocks command outputs the virtual medium errors that are created, and not a list of the actual medium errors on MDisks or drives.

It is possible that the tables that are used to record bad block locations can fill up. The table can fill either on an MDisk or on the system as a whole. If a table does fill up, the migration or replication that was creating the bad block fails because it was not possible to create an exact image of the source volume.

The system creates alerts in the event log for the following situations:
- When it detects medium errors and creates a bad block
- When the bad block tables fill up

Table 75 lists the bad block error codes.

Table 75. Bad block errors
Error code   Description
1840         The managed disk has bad blocks. On an external controller, this error must be a copied medium error.
             The system fails to create a bad block because the MDisk already has the maximum number of allowed bad blocks.
1225         The system fails to create a bad block because the system already has the maximum number of allowed bad blocks.

The recommended actions for these alerts guide you in correcting the situation.

Clear bad blocks by deallocating the volume disk extent, by deleting the volume, or by issuing write I/O to the block. It is good practice to correct bad blocks as soon as they are detected. This action prevents the bad block from being propagated when the volume is replicated or migrated. However, it is possible for the bad block to be on part of the volume that is not used by the application. For example, it can be in part of a database that is not initialized. These bad blocks are corrected when the application writes data to these areas. Before the correction happens, the bad block records continue to use up the available bad block space.
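The way the bad block tables fill, and the way a write clears a record, can be pictured with a small model. The following Python sketch is a toy illustration only and is not how the system software is implemented. The class name, the event names, both limits, and the order in which the limits are checked are all hypothetical, because this guide does not state the table sizes or the internal logic.

```python
class BadBlockTables:
    """Toy model of per-MDisk bad block tables with a system-wide cap."""

    def __init__(self, per_mdisk_limit, system_limit):
        # Both limits are hypothetical parameters chosen by the caller.
        self.per_mdisk_limit = per_mdisk_limit
        self.system_limit = system_limit
        self.tables = {}  # MDisk name -> set of unreadable logical block addresses

    def total(self):
        """Number of bad block records held across all MDisks."""
        return sum(len(lbas) for lbas in self.tables.values())

    def record(self, mdisk, lba):
        """Try to record a bad block and return a descriptive event name."""
        lbas = self.tables.setdefault(mdisk, set())
        if lba in lbas:
            return "already_recorded"
        if len(lbas) >= self.per_mdisk_limit:
            return "mdisk_table_full"    # the MDisk-level failure in Table 75
        if self.total() >= self.system_limit:
            return "system_table_full"   # the system-level failure (1225) in Table 75
        lbas.add(lba)
        return "bad_block_created"       # the MDisk now has bad blocks (1840)

    def write(self, mdisk, lba):
        """A write to the block clears its record and frees table space."""
        self.tables.get(mdisk, set()).discard(lba)
```

In this model a copy operation that receives mdisk_table_full or system_table_full would have to fail, which mirrors the statement that a migration or replication fails when a table is full. Writing to a recorded block frees space for later records.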

Chapter 10. Using the maintenance analysis procedures

The maintenance analysis procedures (MAPs) inform you how to analyze a failure that occurs with a SAN Volume Controller node.

About this task
SAN Volume Controller nodes must be configured in pairs so you can perform concurrent maintenance. When you service one node, the other node keeps the storage area network (SAN) operational. With concurrent maintenance, you can remove, replace, and test all field replaceable units (FRUs) on one node while the SAN and host systems are powered on and doing productive work.

Note: Unless you have a particular reason, do not remove the power from both nodes unless instructed to do so. When you need to remove power, see MAP 5350: Powering off a node.

Procedure
- To isolate the FRUs in the failing node, complete the actions and answer the questions that are given in these maintenance analysis procedures (MAPs).
- When instructed to exchange two or more FRUs in sequence:
  1. Exchange the first FRU in the list for a new one.
  2. Verify that the problem is solved.
  3. If the problem remains:
     a. Reinstall the original FRU.
     b. Exchange the next FRU in the list for a new one.
  4. Repeat steps 2 and 3 until either the problem is solved, or all the related FRUs are exchanged.
  5. Complete the next action that is indicated by the MAP.
  6. If you are using one or more MAPs because of a system error code, mark the error as fixed in the event log after the repair, but before you verify the repair.

Note: Start all problem determination procedures and repair procedures with MAP 5000: Start.

MAP 5000: Start
MAP 5000: Start is an entry point to the maintenance analysis procedures (MAPs) for the system.

Before you begin
Note: The service assistant interface must be used if there is no front panel display, for example on the SAN Volume Controller 2145-DH8.

If you are not familiar with these maintenance analysis procedures (MAPs), first read Chapter 10, Using the maintenance analysis procedures.
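The FRU exchange sequence from the chapter introduction (exchange, verify, reinstall the original, move to the next FRU) is a simple loop. The following Python sketch shows that control flow under stated assumptions. The function name and the three callbacks are hypothetical stand-ins for manual service actions, and reinstalling the original part after the last FRU in the list is an assumption, since the procedure does not say what to do in that case.

```python
def exchange_frus(fru_list, install_new, reinstall_original, problem_solved):
    """Exchange FRUs one at a time until the problem is solved.

    Returns the FRU whose replacement solved the problem, or None when every
    FRU in the list was exchanged without success.
    """
    for fru in fru_list:
        install_new(fru)            # steps 1 and 3b: exchange this FRU for a new one
        if problem_solved():        # step 2: verify that the problem is solved
            return fru
        reinstall_original(fru)     # step 3a: put the original FRU back
    return None
```

Whatever the outcome, the caller then continues with the next action that the MAP indicates (step 5).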

This MAP applies to all system models. Be sure that you know which model you are using before you start this procedure. To determine which model you are working with, look for the label that identifies the model type on the front of the node.

You might be sent here for one of the following reasons:
- The fix procedures sent you here
- A problem occurred during the installation of the system
- Another MAP sent you here
- A user observed a problem that was not detected by the system

System nodes are configured in pairs. While you service one node, you can access all the storage managed by the pair from the other node. With concurrent maintenance, you can remove, replace, and test all FRUs on one system while the SAN and host systems are powered on and doing productive work.

Notes:
- Unless you have a particular reason, do not remove the power from both nodes unless instructed to do so.
- If an action in these procedures involves removing or replacing a part, use the applicable procedure.
- If the problem persists after you complete the actions in this procedure, return to step 1 of the MAP to try again to fix the problem.

Procedure
1. Were you sent here from a fix procedure?
   NO   Go to step 2.
   YES  Go to step 6.
2. (from step 1) Access the management GUI. See Accessing the management GUI.
3. (from step 2) Does the management GUI start?
   NO   Go to step 6.
   YES  Go to step 4.
4. (from step 3) Is the Welcome window displayed?
   NO   Go to step 6.
   YES  Go to step 5.
5. (from step 4) Log in to the management GUI. Use the user ID and password that is provided by the user. Go to the Events page. Start the fix procedure for the recommended action.
   Did the fix procedures find an error that is to be fixed?
   NO   Go to step 6.
   YES  Follow the fix procedures.

6. (from steps 1, 3, 4, and 5) Is the power indicator off? Check to see whether the power LED is off.
   NO   Go to step 7.
   YES  Try to turn on the nodes.
        Note: SAN Volume Controller 2145-DH8 does not have an external uninterruptible power supply unit. This system has battery modules in its front panel instead.
        If the nodes are turned on, go to step 7; otherwise, go to MAP 5040: Power SAN Volume Controller 2145-DH8.
7. (from step 6) Does the node show a hardware error?
   NO   Go to step 8.
   YES  The service controller for the system failed. (The SAN Volume Controller 2145-DH8 does not have a service controller.)
        a. Check that the service controller that is indicating an error is correctly installed. If it is, replace the service controller.
        b. Go to MAP 5700: Repair verification.
8. (from step 7) Is the operator-information panel error LED (4 in Figure 36 or 7 in Figure 37) illuminated or flashing? Or, is the check log LED (6 in Figure 37) illuminated or flashing?
   Figure 36 shows the operator-information panel for the SAN Volume Controller 2145-SV1.

   Figure 36. SAN Volume Controller 2145-SV1 operator-information panel
      1 Power-control button and power-on LED
      2 Identify LED
      3 Node status LED
      4 Node fault LED
      5 Battery status LED

   Figure 37 shows the operator-information panel for the SAN Volume Controller 2145-DH8.

   Figure 37. SAN Volume Controller 2145-DH8 operator-information panel
      1 Power-control button and power-on LED
      2 Ethernet icon
      3 System-locator button and LED
      4 Release latch for the light path diagnostics panel
      5 Ethernet activity LEDs
      6 Check log LED
      7 System-error LED
   Note: If the node has more than four Ethernet ports, activity for ports 5 and above is not indicated by the Ethernet activity LEDs on the operator-information panel.

   NO   Go to step 9.
   YES  Go to MAP 5800: Light path.
9. (from step 8) For a 2145-DH8 model, are the node status LED, node fault LED, and battery status LED that you see in Figure 38 all off?
   NO   Go to step 11.
   YES  Go to step 10.
10. (from step 9) For 2145-DH8, have the node status LED, node fault LED, and battery status LED that you see in Figure 38 all been off for more than 3 minutes?
   NO   Go to step 11.
   YES  For 2145-DH8, go to step 20. Otherwise:
        a. Go to Resolving a problem with failure to boot in the IBM SAN Volume Controller Troubleshooting Guide.
        b. Go to MAP 5700: Repair verification.
11. (from step 9) Is the node fault LED (8 in Figure 38) on the front panel of the SAN Volume Controller 2145-DH8 on?
   Figure 38 shows the node fault LED.

   Figure 38. SAN Volume Controller 2145-DH8 front panel
      7 Node status LED
      8 Node fault LED
      9 Battery status LED

   NO   Go to step 12.
   YES  Complete these steps:
        a. Access the service assistant interface via the technician port for node access and follow the service recommendation presented.
        b. Go to MAP 5700: Repair verification.
12. (from step 11) Is Booting indicated on the node?
   NO   Go to step 14.
   YES  Go to step 13.
13. (from step 12) If the boot progress does not advance for more than 3 minutes, the progress is stalled. Has the boot progress stalled?
   NO   Go to step 14.
   YES  Go to MAP 5700: Repair verification.
14. (from step 12 and step 13) Is the node fault LED, which is the middle of the three status LEDs on the front panel of the SAN Volume Controller 2145-DH8, on?
   Figure 38 shows the node fault LED.
   NO   Go to step 15.
   YES  Complete these steps:
        a. Note the failure code and go to Node error code overview to complete the repair actions.
        b. If the node does not have a front panel display, access the service assistant interface via the technician port for node access and follow the service recommendation presented.

        c. Go to MAP 5700: Repair verification.
15. (from step 14) Is Cluster Error reported on the node?
   NO   Go to step 16.
   YES  A cluster error was detected. This error code is displayed on all the operational nodes in the system. The fix procedures normally repair this type of error. Follow these steps:
        a. Complete the error code repair actions.
        b. Go to MAP 5700: Repair verification.
16. (from step 15) Is Powering Off, Restarting, Shutting Down, or Power Failure reported on the node?
   NO   Go to step 17.
   YES  Wait for the operation to complete and then return to step 1 in this MAP. If the progress is stalled after 3 minutes, press power and go to step 17.
17. (from step 16) Did the node power off?
   NO   Complete the following steps:
        a. Remove the power cord from the rear of the box.
        b. Wait 60 seconds.
        c. Replace the power cord.
        d. If the node does not power on, press power to power on the node and then return to step 1 in this MAP.
   YES  Complete the following steps:
        a. Wait 60 seconds.
        b. Press power to turn on the node and then return to step 1 in this MAP.
18. Is there a node that is not a member of a clustered system? You can tell if a node is not a member of a system because the node status LED is off or blinking for SAN Volume Controller 2145-DH8.
   NO   Go to step 19.
   YES  The node is not a member of a system. The node might have been deleted during a maintenance procedure and was not added back into the system. Make sure that each I/O group in the system contains two nodes. If an I/O group has only one node, add the node back into that system. Then, ensure that the node is restored to the same I/O group from which it was deleted.
19. No errors were detected by the system. If you suspect that the problem that is reported by the customer is a hardware problem, follow these tasks:
   a. Complete Problem Determination procedures on your host systems, disk controllers, and Fibre Channel switches.
   b. Ask IBM remote technical support for assistance.
20. (from step 10) Can you access the service assistant interface through the 2145-DH8 technician port or service IP address, or use a USB flash drive to get satask_results.html?
   NO   The system software might not be running. Attach a USB keyboard and VGA monitor to the 2145-DH8 to see whether the node is stuck booting.
   YES  Go to step 21.
21. (from step 20) Can node error 561 be seen?
   NO   Follow the recommended action for any node error that can be seen.
   YES  The system software might not be able to communicate with the battery backplane. Check the connections between the system board and the battery backplane. Then, follow the recommended action for node error 561.

Results
If you suspect that the problem is a software problem, see Updating the system documentation for details about how to update your entire system environment.

If the problem is still not fixed, collect diagnostic information and contact IBM Remote Technical Support.

MAP 5040: Power SAN Volume Controller 2145-DH8
It might become necessary to solve problems that are associated with power on the SAN Volume Controller 2145-DH8.

Before you begin
If you are not familiar with these maintenance analysis procedures (MAPs), first read Chapter 10, Using the maintenance analysis procedures.

Power problems might be associated with any of the following reasons:
- A problem occurred during the installation of a SAN Volume Controller node
- The power switch failed to turn on the node
- The power switch failed to turn off the node
- Another MAP sent you here

Procedure
1. Are you here because the node is not powered on?
   NO   Go to step 10.
   YES  Go to step 2.
2. (from step 1) Is the power LED on the operator-information panel continuously illuminated? Figure 39 shows the location of the power LED (1) on the operator-information panel.

   Figure 39. Power LED on the SAN Volume Controller 2145-DH8
      1 Power button and power LED (green)

   NO   Go to step 3.
   YES  The node is powered on correctly. Reassess the symptoms and return to MAP 5000: Start or go to MAP 5700: Repair verification to verify the correct operation.
3. (from step 2) Is the power LED on the operator-information panel flashing approximately four times per second?
   NO   Go to step 4.
   YES  The node is turned off and is not ready to be turned on. Wait until the power LED flashes at a rate of once per second, then go to step 5.
        If this behavior persists for more than 3 minutes, complete the following procedure:
        a. Remove all input power from the SAN Volume Controller node by removing the power supply from the back of the node. See Removing SAN Volume Controller 2145-DH8 power supply when you are removing the power cords from the node.
        b. Wait 1 minute and then verify that all power LEDs on the node are extinguished.
        c. Reinsert the power supply.
        d. Wait for the flashing rate of the power LED to slow down to one flash per second. Go to step 5.
        e. If the power LED keeps flashing at a rate of four flashes per second for a second time, replace the parts in the following sequence:
           - System board
        Verify the repair by continuing with MAP 5700: Repair verification.
4. (from step 3) Is the power LED on the operator-information panel flashing once per second?
   YES  The node is in standby mode. Input power is present. Go to step 5.
   NO   Go to step 6.
5. (from step 3 and step 4) Press Power on the operator-information panel of the node.
   Is the power LED on the operator-information panel illuminated solid green?

   NO   Verify that the operator-information panel cable is correctly seated at both ends. If the node still fails to power on, replace parts in the following sequence:
        a. Operator-information panel assembly
        b. System board
        Verify the repair by continuing with MAP 5700: Repair verification.
   YES  The power LED on the operator-information panel shows that the node successfully powered on. Verify the correct operation by continuing with MAP 5700: Repair verification.
6. (from step 4) Is the rear panel power LED on or flashing? Figure 40 shows the location of the power LED (1) on the SAN Volume Controller 2145-DH8.

   Figure 40. Power LED indicator on the rear panel of the SAN Volume Controller 2145-DH8

   NO   Go to step 7.
   YES  The operator-information panel is failing. Verify that the operator-information panel cable is seated on the system board. If the node still fails to power on, replace parts in the following sequence:
        a. Operator-information panel assembly
        b. System board
7. (from step 6) Are the ac LED indicators on the rear of the power supply assemblies illuminated? Figure 41 shows the location of the ac LED (1), the dc LED (2), and the power-supply error LED (3) on the rear of the power supply assembly that is on the rear panel of the SAN Volume Controller 2145-DH8.

   Figure 41. AC, dc, and power-supply error LED indicators on the rear panel of the SAN Volume Controller 2145-DH8
      1 AC LED (green)
      2 DC LED (green)
      3 Power-supply error LED (yellow)

   NO   Verify that the input power cable or cables are securely connected at both ends and show no sign of damage; replace damaged cables. If the node still fails to power on, replace the specified parts based on the SAN Volume Controller model type. Replace the SAN Volume Controller 2145-DH8 parts in the following sequence:
        a. Power supply 750 W
   YES  Go to step 8.
8. (from step 7) Is the power-supply error LED on the rear of the SAN Volume Controller 2145-DH8 power supply illuminated? Figure 41 shows the location of the power-supply error LED (3).
   YES  Replace the power supply unit.
   NO   Go to step 9.
9. (from step 8) Are the dc LED indicators on the rear of the power supply assemblies illuminated?

   NO   Replace the SAN Volume Controller 2145-DH8 parts in the following sequence:
        a. Power supply 750 W
        b. System board
   YES  Verify that the operator-information panel cable is correctly seated at both ends. If the node still fails to power on, replace parts in the following sequence:
        a. Operator-information panel
        b. Cable, signal
        c. System board
        Verify the repair by continuing with MAP 5700: Repair verification.
10. (from step 1) The node does not power off immediately when the power button is pressed. When the node fully boots, the node powers off under the control of the SAN Volume Controller software. The power-off operation can take up to 5 minutes to complete.
   Is the power LED on the operator-information panel flashing approximately four times per second?
   NO   Go to step 11.
   YES  Wait for the node to power off. If the node fails to power off after 5 minutes, go to step 11.
11. (from step 10)
   Attention: Turning off the node by any means other than using the management GUI might cause a loss of data in the node cache. If you are performing concurrent maintenance, this node must be deleted from the system before you proceed. Ask the customer to delete the node from the system now. If they are unable to delete the node, call your support center for assistance before you proceed.
   The node cannot be turned off either because of a software fault or a hardware failure. Press and hold the power button. The node can turn off within 5 seconds.
   Did the node turn off?
   NO   Determine whether you are using an Advanced Configuration and Power Interface (ACPI) or a non-ACPI operating system.
        If you are using a non-ACPI operating system, complete the following steps:
        a. Press Ctrl+Alt+Delete.
        b. Turn off the server by pressing and holding Power for 5 seconds.
        c. Restart the server.
        d. If the server fails POST and pressing Power does not work, disconnect the power cord for 20 seconds. Reconnect the power cord and restart the server.
        If the problem remains or if you are using an ACPI-aware operating system, suspect the system board.
        Go to step 12.

    YES   Go to step 12.

12. (from step 11 on page 275) Press the power button to turn on the node.
    Did the node turn on and boot correctly?
    NO    Go to MAP 5000: Start on page 265 to resolve the problem.
    YES   Go to step 13.

13. (from step 12) The node probably suffered a software failure. Memory dump data might be captured that helps resolve the problem. Call your support center for assistance.

MAP 5350: Powering off a node

MAP 5350: Powering off a node helps you power off a single node to complete a service action without disrupting host access to volumes.

Before you begin

If the solution is set up correctly, powering off a single node does not disrupt the normal operation of a system. A system has nodes in pairs called I/O groups. An I/O group continues to handle I/O to the disks it manages with only a single node that is powered on. However, performance degrades and resilience to error is reduced.

When you power off a system node, take care to impact the system no more than is necessary.

Note: If you do not follow the procedures that are outlined here, your application hosts might lose access to their data or, in the worst case, they might lose data.

You can use the following preferred methods to power off a node that is a member of a system and not offline:
1. Use the Power off option in the management GUI or in the service assistant interface.
2. Use the CLI command stopsystem -node name.

It is preferable to use either the management GUI or the command-line interface (CLI) to power off a node. These methods provide a controlled handover to the partner node and provide better resilience to other faults in the system.

Use the power button to power off a node only if the node is offline or is not a member of a system.

About this task

To provide the least disruption when you power off a node, all of the following conditions must apply:
- The other node in the I/O group is powered on and active in the system.
- The other node in the I/O group has SAN Fibre Channel connections to all hosts and disk controllers that are managed by the I/O group.
- All volumes that are handled by this I/O group are online.
- Host multipathing is online to the other node in the I/O group.
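The four conditions above can be recorded as a simple pre-check before a node is powered off. The following is a minimal sketch only: the IoGroupState class, its field names, and the unmet_conditions function are hypothetical names introduced for illustration, and the values are assumed to be gathered by the operator from the management GUI, the CLI, and the host multipathing software. Nothing here queries the system itself.

```python
from dataclasses import dataclass


@dataclass
class IoGroupState:
    """Hypothetical snapshot of one I/O group, filled in by the operator."""
    partner_online: bool             # other node is powered on and active
    partner_has_san_paths: bool      # partner reaches all hosts and disk controllers
    all_volumes_online: bool         # every volume handled by the I/O group is online
    host_multipath_to_partner: bool  # host multipathing is online to the partner


def unmet_conditions(state):
    """Return a description of each least-disruption condition that is not met."""
    checks = [
        (state.partner_online,
         "partner node is not powered on and active"),
        (state.partner_has_san_paths,
         "partner node lacks SAN connections to all hosts and disk controllers"),
        (state.all_volumes_online,
         "not all volumes handled by the I/O group are online"),
        (state.host_multipath_to_partner,
         "host multipathing is not online to the partner node"),
    ]
    return [message for ok, message in checks if not ok]


if __name__ == "__main__":
    state = IoGroupState(True, True, False, True)
    for problem in unmet_conditions(state):
        print("Condition not met:", problem)
```

An empty list means that all four conditions hold. A non-empty list does not forbid the power off; it marks the points where judgment and a check with the system administrator are needed.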

In some circumstances, the reason you power off the node might make these conditions impossible. For instance, if you replace a failed Fibre Channel adapter, volumes do not show an online status. Use your judgment to decide whether it is safe to proceed when a condition is not met. Always check with the system administrator before you proceed with a power off that might disrupt I/O access. The system administrator might prefer to wait for a more suitable time or to suspend host applications.

To ensure a smooth restart, a node must save data structures that it cannot re-create to its local, internal disk drive. The amount of data the node saves to local disk can be high, so this operation might take several minutes. Do not attempt to interrupt the controlled power off.

Attention: The following actions do not allow the node to save data to its local disk. Therefore, do not power off a node by using the following methods:
- Holding down the power button on the node (unless it is a SAN Volume Controller 2145-SV1).
  When you press and release the power button, the node indicates this action to the software so the node can write its data to local disk before the node powers off. When you hold down the power button, the hardware interprets this action as an emergency power off indication and shuts down immediately. The hardware does not save the data to local disk before you power down. The emergency power off occurs approximately 4 seconds after you press and hold down the power button.
- Pressing the reset button on the light path diagnostics panel.

Important: If you power off a SAN Volume Controller 2145-DH8 node and might not power it back on the same day, follow these steps to prevent the batteries from being discharged too much while the node is connected to power but not powered on:
1. Pull both batteries out of the node. Keep them out until you are ready to power on the node.
2. Push the batteries in just before you press the power button to power on the node.

If you disconnect the power from a SAN Volume Controller 2145-DH8 node and might not reconnect power to it again within the next 24 hours, follow these steps to prevent the batteries from being discharged too much while the node is not connected to power:
1. After both power cords are disconnected from the node, pull both batteries out of the node. This step completely turns off the battery backplane.
2. Push the batteries back in again.

Using the management GUI to power off a system

Use the management GUI to power off a system.

Procedure

To use the management GUI to power off a system, complete the following steps:
1. Start the management GUI for the system that you are servicing.
2. Select Monitoring > System.

   If the nodes to power off are shown as Offline, the nodes are not participating in the system. In such circumstances, use the power button on the offline nodes to power off the nodes.
   If the nodes to power off are shown as Online, powering off the nodes can result in their dependent volumes also going offline:
   a. Select the node and click Show Dependent Volumes.
   b. Make sure the status of each volume in the I/O group is Online. You might need to view more than one page.
      If any volumes are Degraded, only one node in the I/O group is processing I/O requests for that volume. If that node is powered off, it impacts all the hosts that are submitting I/O requests to the degraded volume.
      If any volumes are degraded and you believe that it might be because the partner node in the I/O group was powered off recently, wait until a refresh of the screen shows all volumes online. All the volumes must be online within 30 minutes of the partner node being powered off.
      Note: After you wait 30 minutes, if you have a degraded volume and all of the associated nodes and MDisks are online, contact support for assistance.
      Ensure that all volumes that are used by hosts are online before you continue.
   c. If possible, check that all hosts that access volumes that are managed by this I/O group are able to fail over to use paths that are provided by the other node in the group.
      Complete this check by using the multipathing device driver software of the host system. Commands to use differ, depending on the multipathing device driver that is being used. If you use the System Storage Multipath Subsystem Device Driver (SDD), the command to query paths is datapath query device.
      It can take some time for the multipathing device drivers to rediscover paths after a node is powered on. If you are unable to check on the host that all paths to both nodes in the I/O group are available, do not power off a node within 30 minutes of the partner node being powered on or you might lose access to the volume.
   d. If you decide that it is okay to continue with powering off the nodes, select the node to power off and click Shut Down System.
   e. Click OK. If the node that you select is the last remaining node that provides access to a volume, for example a node that contains flash drives with unmirrored volumes, the Shutting Down a Node-Force panel is displayed with a list of volumes that go offline if the node is shut down.
   f. Check that no host applications access the volumes that are going offline. Continue with the shutdown only if the loss of access to these volumes is acceptable. To continue with shutting down the node, click Force Shutdown.

What to do next

During the shutdown procedure, the node saves its data structures to its local disk and destages all write data that is held in cache to the SAN disks. Such processing can take several minutes. At the end of this processing, the system powers off.
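The volume-status check in step 2b of the GUI procedure follows a small decision rule, which can be sketched as follows. This is an illustrative sketch under stated assumptions, not part of the product: the function name power_off_decision, its parameters, and the returned strings are hypothetical, and the "do not proceed" result for the remaining case is an assumption based on the instruction to ensure that all volumes used by hosts are online before you continue.

```python
def power_off_decision(volume_statuses, minutes_waited, nodes_and_mdisks_online):
    """Suggest the next action based on the volume statuses of one I/O group.

    volume_statuses: status strings as read from the GUI, e.g. "Online", "Degraded".
    minutes_waited: minutes since the partner node changed power state.
    nodes_and_mdisks_online: True if all associated nodes and MDisks are online.
    """
    statuses = [status.lower() for status in volume_statuses]
    if all(status == "online" for status in statuses):
        return "proceed"
    if minutes_waited < 30:
        # Volumes are expected to come online within 30 minutes.
        return "wait and refresh"
    if nodes_and_mdisks_online:
        # Still degraded after 30 minutes with everything else online.
        return "contact support"
    # Assumption: anything else means a volume used by hosts is not online.
    return "do not proceed"


if __name__ == "__main__":
    print(power_off_decision(["Online", "Degraded"], 10, True))
```

The rule is deliberately conservative: the only result that allows the power off is the one where every volume is reported as online.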

Using the system CLI to power off a node

Use the command-line interface (CLI) to power off a node.

Procedure

1. Issue the lsnode CLI command to display a list of nodes in the system and their properties. Find the node to shut down and write down the name of its I/O group. Confirm that the other node in the I/O group is online.

       lsnode -delim :
       id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node:UPS_unique_id
       1:group1node1:10L3ASH: C:online:0:io_grp0:yes: C0D18D8
       2:group1node2:10L3ANF: :online:0:io_grp0:no: C0D1796
       3:group2node1:10L3ASH: :online:1:io_grp1:no: C0D18D8
       4:group2node2:10L3ANF: F4:online:1:io_grp1:no: C0D1796

   If the node to power off is shown as Offline, the node is not participating in the system and is not processing I/O requests. In such circumstances, use the power button on the node to power off the node.
   If the node to power off is shown as Online, but the other node in the I/O group is not online, powering off the node impacts all hosts that are submitting I/O requests to the volumes that are managed by the I/O group. Ensure that the other node in the I/O group is online before you continue.

2. Issue the lsdependentvdisks CLI command to list the volumes that depend on the status of a specified node.

       lsdependentvdisks group1node1
       vdisk_id vdisk_name
       0        vdisk0
       1        vdisk1

   If the node goes offline or is removed from the system, the dependent volumes also go offline. Before you take a node offline or remove it from the system, you can use the command to ensure that you do not lose access to any volumes.

3. If you decide that it is okay to continue powering off the node, enter the stopsystem -node <name> CLI command to power off the node. Use the -node parameter to avoid powering off the whole system:

       stopsystem -node group1node1
       Are you sure that you want to continue with the shut down? yes

   Note: To shut down the node even though there are dependent volumes, add the -force parameter to the stopsystem command. The -force parameter forces continuation of the command even though any node-dependent volumes will be taken offline. Use the -force parameter with caution; access to data on node-dependent volumes will be lost.

During the shutdown procedure, the node saves its data structures to its local disk and destages all write data that is held in the cache to the SAN disks, which can take several minutes. At the end of this process, the node powers off.

Using the system power control button

Do not use the power control button to power off a node unless an emergency exists or another procedure directs you to do so.
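For the CLI procedure in the preceding topic, the partner-node check in step 1 can be automated once the colon-delimited lsnode output is captured as text. The following is a minimal sketch: the sample text uses placeholder serial numbers, WWNNs, and UPS IDs (they are not real values), one node is marked offline purely to exercise the check, and the function names parse_lsnode and partner_is_online are hypothetical. The column names are assumed to match the header shown in the lsnode example.

```python
import csv
import io

# Placeholder output in the "lsnode -delim :" layout; all values are made up.
SAMPLE = """id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node:UPS_unique_id
1:group1node1:SERIAL1:WWNN1:online:0:io_grp0:yes:UPSID1
2:group1node2:SERIAL2:WWNN2:online:0:io_grp0:no:UPSID2
3:group2node1:SERIAL1:WWNN3:offline:1:io_grp1:no:UPSID1
4:group2node2:SERIAL2:WWNN4:online:1:io_grp1:no:UPSID2
"""


def parse_lsnode(text):
    """Parse colon-delimited lsnode output into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter=":"))


def partner_is_online(nodes, node_name):
    """Return True if another node in the same I/O group reports status 'online'."""
    matches = [node for node in nodes if node["name"] == node_name]
    if not matches:
        raise ValueError("node not found in lsnode output: " + node_name)
    group = matches[0]["IO_group_id"]
    partners = [node for node in nodes
                if node["IO_group_id"] == group and node["name"] != node_name]
    return any(partner["status"] == "online" for partner in partners)


if __name__ == "__main__":
    nodes = parse_lsnode(SAMPLE)
    for name in ("group1node1", "group2node2"):
        print(name, "partner online:", partner_is_online(nodes, name))
```

A False result corresponds to the case in step 1 where powering off the node impacts all hosts that use the I/O group, so the procedure should not continue until the partner node is online.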

Before you begin

With this method, you cannot check the system status from the front panel, so you cannot tell if the power off is liable to cause excessive disruption to the system. Instead, use the management GUI or the CLI commands, described in the previous topics, to power off an active node.

About this task

If you must use this method, notice in Figure 42 and Figure 43 that each model type has a power control button 1 on the front.

Figure 42. Power control button on the SAN Volume Controller 2145-DH8 model

Figure 43. Power control button and LED lights on the SAN Volume Controller 2145-SV1 model
  1 Power-control button and power-on LED
  2 Identify LED
  3 Node status LED
  4 Node fault LED
  5 Battery status LED

When you determine it is safe to do so, press and immediately release the power button. On models other than the 2145-DH8 and 2145-SV1, the front panel display changes to display Powering Off and displays a progress bar.

Note: The 2145-DH8 and 2145-SV1 do not have a front panel display, but status LEDs 2, 3, 4, and 5 in Figure 43 all turn off, and the power-on LED 1 goes from on to flashing.

Results

The node saves its data structures to disk while it is powering off. The power off process can take up to 5 minutes.

When a node is powered off by using the power button (or because of a power failure), the partner node in its I/O group immediately stops using its cache for new write data and destages any write data already in its cache to the SAN-attached disks. The destaging duration depends on the speed and utilization of the disk controllers. The time to complete is usually less than 15 minutes, but it might be longer. If data is waiting to be written to a disk that is offline, the destaging cannot complete.

A node that powers off and restarts while its partner node continues to process I/O might not be able to become an active member of the I/O group immediately. The node must wait until the partner node completes destaging the cache. If the partner node powers off during this period, access to the SAN storage that is managed by this I/O group is lost. If one of the nodes in the I/O group is unable to service any I/O, for example because the partner node in the I/O group is still flushing its write cache, volumes that are managed by that I/O group have a status of Degraded.

MAP 5500: Ethernet

MAP 5500: Ethernet helps you solve problems that have occurred on the system Ethernet connections.

Before you begin

Note: Use the service assistant GUI if there is no front panel display, for example on the SAN Volume Controller 2145-DH8.

If you are not familiar with these maintenance analysis procedures (MAPs), first read Chapter 10, Using the maintenance analysis procedures, on page 265.

If you encounter problems with the 10 Gbps Ethernet feature, see MAP 5550: 10G Ethernet and Fibre Channel over Ethernet personality enabled adapter port on page 284.

You might have been sent here for one of the following reasons:
- A problem occurred during the installation of a system and the Ethernet checks failed.
- Another MAP sent you here.
- The customer needs immediate access to the system by using an alternate configuration node. See Defining an alternate configuration node on page 284.

About this task

Complete the following steps:

Procedure

1. Is any node in the system reporting error code 805?
   YES   Go to step 6 on page 282.
   NO    Go to step 2.

2. Is the system reporting error 1400 in the event log?
   YES   Go to step 4.
   NO    Go to step 3.

3. Are you experiencing Ethernet performance issues?
   YES   Go to step 9 on page 283.
   NO    Go to step 10 on page 283.

4. (from step 2 on page 281) On all nodes, complete the following actions:
   a. Check Ethernet port 1.
   b. If Ethernet port 1 shows link offline, record this port as one that requires fixing.
   c. If the system is configured with two Ethernet cables per node, check Ethernet port 2 and repeat the previous step.
   d. Go to step 5.

5. (from step 4) Are any Ethernet ports that have cables attached to them reporting link offline?
   YES   Go to step 6.
   NO    Go to step 10 on page 283.

6. (from step 5) Do the system nodes have one or two cables connected?
   One   Go to step 7.
   Two   Go to step 8.

7. (from step 6) Complete the following actions:
   a. Plug the Ethernet cable from that node into Ethernet port 2 of a different node, as shown in Figure 44.
   b. If the Ethernet link light is illuminated when the cable is plugged into Ethernet port 2 of the other node, replace the system board of the original node.
   c. If the Ethernet link light does not illuminate, check the Ethernet switch or hub port and cable to resolve the problem.
   d. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

   Figure 44. Ethernet ports on the rear of the SAN Volume Controller 2145-DH8
     1 1 Gbps Ethernet port 1
     2 1 Gbps Ethernet port 2
     3 1 Gbps Ethernet port 3

8. (from step 5 or step 6) Complete the following actions:
   a. Plug the Ethernet cable from that node into another device, for example, the SSPC.
   b. If the Ethernet link light is illuminated when the cable is plugged into the other Ethernet device, replace the system board of the original node.
   c. If the Ethernet link light does not illuminate, check the Ethernet switch or hub port and cable to resolve the problem.
   d. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

9. (from step 3 on page 282) Complete the following actions:
   a. Check all Speed port 1 and Speed port 2 panels for the speed and duplex settings. The format is: <Speed>/<Duplex>.
      1) Check Speed 1.
      2) If Speed 1 shows link offline, record this port as one that requires fixing.
      3) If the system is configured with two Ethernet cables per node, check Speed 2 and repeat the previous step.
   b. Check that the system port has negotiated at the highest speed available on the switch. All nodes have gigabit Ethernet network ports.
   c. If the Duplex setting is half, complete the following steps:
      1) There is a known problem with gigabit Ethernet when one side of the link is set to a fixed speed and duplex and the other side is set to autonegotiate. The problem can cause the fixed side of the link to run at full duplex and the negotiated side of the link to run at half duplex. The duplex mismatch can cause significant Ethernet performance degradation.
      2) If the switch is set to full duplex, set the switch to autonegotiate to prevent the problem described previously.
      3) If the switch is set to half duplex, set it to autonegotiate to allow the link to run at the higher bandwidth available on the full duplex link.
   d. If none of the above are true, call your support center for assistance.

10. (from step 2 on page 281) A previously reported fault with the Ethernet interface is no longer present. A problem with the Ethernet might have been fixed, or there might be an intermittent problem. Check with the customer to determine that the Ethernet interface has not been intentionally disconnected. Also check that there is no recent history of fixed Ethernet problems with other components of the Ethernet network.
    Is the Ethernet failure explained by the previous checks?
    NO    There might be an intermittent Ethernet error. Complete these steps in the following sequence until the problem is resolved:
          a. Use the Ethernet hub problem determination procedure to check for and resolve an Ethernet network connection problem. If you resolve a problem, continue with MAP 5700: Repair verification on page 292.
          b. Determine if similar Ethernet connection problems have occurred recently on this node. If they have, replace the system board.
          c. Verify the repair by continuing with MAP 5700: Repair verification on page 292.
    YES   Verify the repair by continuing with MAP 5700: Repair verification on page 292.
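The duplex checks in step 9c and 9d of MAP 5500 reduce to a small decision table, sketched below. This is illustrative only: the function name duplex_advice, the string values for the settings, and the wording of the returned advice are assumptions made for the example, and the inputs are expected to be read manually from the node speed panel and from the switch configuration.

```python
def duplex_advice(node_duplex, switch_setting):
    """Suggest an action for the duplex part of MAP 5500 step 9.

    node_duplex: duplex shown for the node port, "full" or "half".
    switch_setting: how the switch port is configured: "auto", "full", or "half".
    """
    node_duplex = node_duplex.lower()
    switch_setting = switch_setting.lower()
    if node_duplex == "half" and switch_setting == "full":
        # Fixed full duplex on the switch against an autonegotiating node
        # produces the duplex mismatch described in step 9c.
        return "set the switch to autonegotiate to remove the duplex mismatch"
    if node_duplex == "half" and switch_setting == "half":
        return "set the switch to autonegotiate to allow a full duplex link"
    # Step 9d: none of the listed cases applies.
    return "call your support center"


if __name__ == "__main__":
    print(duplex_advice("half", "full"))
```

Both half-duplex cases lead to the same change on the switch; they are kept separate only because the reason for the change differs.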

Defining an alternate configuration node

A situation can arise where the customer needs immediate access to the system by using an alternate configuration node.

About this task

If all Ethernet connections to the configuration node have failed, the system is unable to report failure conditions, and the management GUI is unable to access the system to complete administrative or service tasks. If this is the case and the customer needs immediate access to the system, you can make the system use an alternate configuration node by using the service assistant GUI.

Note: If the system has no front panel display, such as on the SAN Volume Controller 2145-DH8, use the service assistant GUI. The service assistant is accessed via the technician port.

If only one node is reporting Node Error 805, complete the following steps:

Procedure

1. Press and release the power button on the node that is reporting Node Error 805.
2. When Powering off is displayed, press the power button again.
3. Restarting is displayed.

Results

The system selects a new configuration node. The management GUI is able to access the system again.

MAP 5550: 10G Ethernet and Fibre Channel over Ethernet personality enabled adapter port

MAP 5550: 10G Ethernet helps you solve problems that occur on a node with 10G Ethernet capability, and Fibre Channel over Ethernet personality enabled.

Before you begin

Note: The service assistant GUI might be used if there is no front panel display, for example on the SAN Volume Controller 2145-DH8.

If you are not familiar with these maintenance analysis procedures (MAPs), first read Chapter 10, Using the maintenance analysis procedures, on page 265.

This MAP applies to system models with the 10G Ethernet feature installed. Be sure that you know which model you are using before you start this procedure. To determine which model you are working with, look for the label that identifies the model type on the front of the node. Check that the 10G Ethernet adapter is installed and that an optical cable is attached to each port.

If you experience a problem with error code 805, go to MAP 5500: Ethernet on page 281.

If you experience a problem with error code 703 or 723, go to Fibre Channel and 10G Ethernet link failures on page 236.

You might be sent here for one of the following reasons:
- A problem occurred during the installation of a system and the Ethernet checks fail.
- Another MAP sent you to this location.

About this task

Perform the following steps:

Procedure

1. Is node error 720 or 721 displayed on the front panel of the affected node or is service error code 1072 shown in the event log?
   YES   Go to step 11 on page 286.
   NO    Go to step 2.

2. (from step 1) Perform the following actions from the front panel of the affected node:
   a. Press and release the up or down button until Ethernet is shown.
   b. Press and release the left or right button until Ethernet port 3 is shown.
   Was Ethernet port 3 found?
   No    Go to step 11 on page 286.
   Yes   Go to step 3.

3. (from step 2) Perform the following actions from the front panel of the affected node:
   a. Press and release the up or down button until Ethernet is shown.
   b. Press and release the up or down button until Ethernet port 3 is shown.
   c. Record if the second line of the display shows Link offline, Link online, or Not configured.
   d. Press and release the up or down button until Ethernet port 4 is shown.
   e. Record if the second line of the display shows Link offline, Link online, or Not configured.
   f. Go to step 4.

4. (from step 3) What was the state of the 10G Ethernet ports that were seen in step 3?
   Both ports show Link online
         The 10G link is working now. Verify the repair by continuing with MAP 5700: Repair verification on page 292.
   One or more ports show Link offline
         Go to step 5 on page 286.
   One or more ports show Not configured
         For information about the port configuration for iSCSI, see the description of the cfgportip CLI command in the SAN Volume Controller Information Center. For Fibre Channel over Ethernet information, see the description of the lsportfc CLI command in the SAN Volume Controller Information Center. This command provides connection properties and status to help determine whether the Fibre Channel over Ethernet is part of a correctly configured VLAN.

5. (from step 4 on page 285) Is the amber 10G Ethernet link LED off for the offline port?
   YES   Go to step 6.
   NO    The physical link is operational. The problem might be with the system configuration. See the configuration topics iSCSI configuration details and Fibre Channel over Ethernet configuration details in the SAN Volume Controller Information Center.

6. (from step 5) Perform the following actions:
   a. Check that the 10G Ethernet ports are connected to a 10G Ethernet fabric.
   b. Check that the 10G Ethernet fabric is configured.
   c. Pull out the small form-factor pluggable (SFP) transceiver and plug it back in.
   d. Pull out the optical cable and plug it back in.
   e. Clean contacts with a small blast of air, if available.
   f. Go to step 7.

7. (from step 6) Did the amber link LED light?
   YES   The physical link is operational. Verify the repair by continuing with MAP 5700: Repair verification on page 292.
   NO    Go to step 8.

8. (from step 7) Swap the 10G SFPs in port 3 and port 4, but keep the optical cables connected to the same port.
   Is the amber link LED on the other port off now?
   YES   Go to step 10.
   NO    Go to step 9.

9. (from step 8) Swap the 10G Ethernet optical cables in port 3 and port 4. Observe how the amber link LED changes. Swap the cables back.
   Did the amber link LED on the other port go off?
   YES   Check the 10G Ethernet optical link and fabric that is connected to the port that now has the amber LED off. The problem is associated with the cable. The problem is either in the optical cable or the Ethernet switch. Check that the Ethernet switch shows that the port is operational. If it does not show that the port is operational, replace the optical cable. Verify the repair by continuing with MAP 5700: Repair verification on page 292.
   NO    Go to step 11.

10. (from step 8) Perform the following actions:
    a. Replace the SFP that now has the amber link LED off.
    b. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

11. (from steps 1 on page 285, 2 on page 285, and 9) Have you already removed and replaced the 10G Ethernet adapter?
    YES   Go to step 12 on page 287.
    NO    Perform the following actions:
          a. Remove and replace the 10G Ethernet adapter.
          b. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

12. (from step 11 on page 286) Replace the 10G Ethernet adapter with a new one.
    a. Replace the 10G Ethernet adapter.
    b. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

MAP 5600: Fibre Channel

MAP 5600: Fibre Channel helps you to solve problems that occur on the system Fibre Channel ports.

Before you begin

If you are not familiar with these maintenance analysis procedures (MAPs), first read Chapter 10, Using the maintenance analysis procedures, on page 265.

This MAP applies to all system models. Be sure that you know which model you are using before you start this procedure. To determine which model you are working with, look for the label that identifies the model type on the front of the node.

You might be sent here for one of the following reasons:
- A problem occurred during the installation of a system and the Fibre Channel checks failed.
- Another MAP sent you here.

About this task

Complete the following steps to solve problems that are caused by the Fibre Channel ports. You can use the technician port on the system to access the service assistant.

Procedure

1. Are you trying to resolve a Fibre Channel port speed problem?
   NO    Go to step 2.
   YES   Go to step 11 on page 291.

2. (from step 1) Display the Fibre Channel port 1 status on the service assistant GUI. Is the service assistant GUI on the system showing Fibre Channel port-1 active?
   NO    A Fibre Channel port is not working correctly. Check the port status on the service assistant GUI.
         - Inactive: The port is operational but cannot access the Fibre Channel fabric. The Fibre Channel adapter is not configured correctly, the Fibre Channel small form-factor pluggable (SFP) transceiver failed, the Fibre Channel cable either failed or is not installed, or the device at the other end of the cable failed. Make a note of port-1. Go to step 7 on page 290.

         - Failed: The port is not operational because of a hardware failure. Make a note of port-1. Go to step 9 on page 290.
         - Not installed: This port is not installed. Make a note of port-1. Go to step 10 on page 290.
   YES   Press and release the right button to display Fibre Channel port-2. Go to step 3.

3. (from step 2 on page 287) Is the service assistant GUI on the system showing Fibre Channel port-2 active?
   NO    A Fibre Channel port is not working correctly. Check the port status.
         - Inactive: The port is operational but cannot access the Fibre Channel fabric. The Fibre Channel adapter is not configured correctly, the Fibre Channel small form-factor pluggable (SFP) transceiver failed, the Fibre Channel cable either failed or is not installed, or the device at the other end of the cable failed. Make a note of port-2. Go to step 7 on page 290.
         - Failed: The port is not operational because of a hardware failure. Make a note of port-2. Go to step 9 on page 290.
         - Not installed: This port is not installed. Make a note of port-2. Go to step 10 on page 290.
   YES   Go to step 4.

4. (from step 3) Is the service assistant GUI on the system showing Fibre Channel port-3 active?
   NO    A Fibre Channel port is not working correctly. Check the port status.
         - Inactive: The port is operational but cannot access the Fibre Channel fabric. The Fibre Channel adapter is not configured correctly, the Fibre Channel small form-factor pluggable (SFP) transceiver failed, the Fibre Channel cable either failed or is not installed, or the device at the other end of the cable failed. Make a note of port-3. Go to step 7 on page 290.
         - Failed: The port is not operational because of a hardware failure. Make a note of port-3. Go to step 9 on page 290.
         - Not installed: This port is not installed. Make a note of port-3. Go to step 10 on page 290.
   YES   Go to step 5.

5. (from step 4) Is the service assistant GUI on the system showing Fibre Channel port-4 active?
   NO    A Fibre Channel port is not working correctly. Check the port status.
         - Inactive: The port is operational but cannot access the Fibre Channel fabric. The Fibre Channel adapter is not configured correctly, the Fibre Channel small form-factor pluggable (SFP) transceiver failed, the Fibre Channel cable either failed or is not installed, or the device at the other end of the cable failed. Make a note of port-4. Go to step 7 on page 290.
         - Failed: The port is not operational because of a hardware failure. Make a note of port-4. Go to step 9 on page 290.
         - Not installed: This port is not installed. Make a note of port-4. Go to step 10 on page 290.
   YES   If there are more than four Fibre Channel ports on the node, repeat step 5 on page 288 for each additional Fibre Channel port by using the service assistant. Go to step 6.

6. (from step 5 on page 288) A previously reported fault with a Fibre Channel port is no longer being shown. A problem with the SAN Fibre Channel fabric might be fixed or there might be an intermittent problem.
   Check with the customer to see whether any Fibre Channel ports are disconnected or if any component of the SAN Fibre Channel fabric failed and was recently fixed.
   Is the Fibre Channel port failure explained by the previous checks?
   NO    There might be an intermittent Fibre Channel error.
         a. Use the SAN problem determination procedure to check for and resolve any Fibre Channel fabric connection problems. If you resolve a problem, continue with MAP 5700: Repair verification on page 292.
         b. Check whether similar Fibre Channel errors occurred recently on the same port on this system node. If they have, replace the Fibre Channel cable, unless it was replaced.
         c. Replace the Fibre Channel SFP transceiver, unless it was replaced.
            Note: System nodes are supported by both longwave SFP transceivers and shortwave SFP transceivers. You must replace an SFP transceiver with the same type of SFP transceiver. If the SFP transceiver to replace is a longwave SFP transceiver, for example, you must provide a suitable replacement. Removing the wrong SFP transceiver might result in loss of data access. See the Removing and replacing the Fibre Channel SFP transceiver on a node documentation to find out how to replace an SFP transceiver.
         d. Replace the Fibre Channel adapter assembly that is shown in Table 76.

            Table 76. Fibre Channel assemblies
            Node                                                                                                  Adapter assembly
            SAN Volume Controller 2145-DH8 port 1, 2, 3, or 4 (slot 1 mandatory; the first FC adapter)            Four-port Fibre Channel adapter
            SAN Volume Controller 2145-DH8 port 5, 6, 7, or 8 (slot 2 optional; the second FC adapter)            Four-port Fibre Channel adapter
            SAN Volume Controller 2145-DH8 port 9, 10, 11, or 12 (slot 5 optional; the third FC adapter)          Four-port Fibre Channel adapter

         e. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

304 YES Verify the repir by continuing with MAP 5700: Repir verifiction on pge (from steps 2 on pge 287, 3 on pge 288, 4 on pge 288, nd 5 on pge 288) The port noted on the system is showing sttus of inctive. For certin models, this inctive sttus might occur when the Fibre Chnnel speed is not set correctly. 8. (from step 7) The noted port on the system displys sttus of inctive. If the noted port still displys sttus of inctive, replce the prts tht re ssocited with the noted port until the problem is fixed in the following order:. Fibre Chnnel cbles from the system to Fibre Chnnel network. b. Fulty Fibre Chnnel fbric connections, prticulrly the SFP trnsceiver t the Fibre Chnnel switch. Use the SAN problem determintion procedure to resolve ny Fibre Chnnel fbric connection problem. c. System Fibre Chnnel SFP trnsceiver. Note: System nodes re supported by both longwve SFPs nd shortwve SFPs. You must replce n SFP with the sme type of SFP trnsceiver tht you re replcing. If the SFP trnsceiver to replce is longwve SFP trnsceiver, for exmple, you must provide suitble replcement. Removing the wrong SFP trnsceiver might result in loss of dt ccess. See the Removing nd replcing the Fibre Chnnel SFP trnsceiver on system node documenttion to find out how to replce n SFP trnsceiver. d. Replce the Fibre Chnnel dpter ssembly s shown in Tble 76 on pge 289. e. Verify the repir by continuing with MAP 5700: Repir verifiction on pge (from steps 2 on pge 287, 3 on pge 288, 4 on pge 288, nd 5 on pge 288) The noted port on the system displys sttus of filed. Verify tht the Fibre Chnnel cbles tht connect the system nodes to the switches re securely connected. Replce the prts tht re ssocited with the noted port until the problem is fixed in the following order:. Fibre Chnnel SFP trnsceiver. Note: system nodes re supported by both longwve SFP trnsceivers nd shortwve SFP trnsceivers. You must replce n SFP trnsceiver with the sme type of SFP trnsceiver. 
If the SFP transceiver to replace is a longwave SFP transceiver, for example, you must provide a suitable replacement. Removing the wrong SFP transceiver might result in loss of data access. See the Removing and replacing the Fibre Channel SFP transceiver on a node documentation to find out how to replace an SFP transceiver.
b. Replace the Fibre Channel adapter assembly as shown in Table 76 on page 289.
c. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

10. (from steps 2 on page 287, 3 on page 288, 4 on page 288, and 5 on page 288)
The noted port on the system displays a status of not installed. If you replaced the Fibre Channel adapter, make sure that it is installed correctly. If you replaced any other system board components, make sure that the Fibre Channel adapter was not disturbed.

Is the Fibre Channel adapter failure explained by the previous checks?

NO
a. Replace the Fibre Channel adapter assembly as shown in Table 76 on page 289.
b. If the problem is not fixed, replace the Fibre Channel connection hardware in the order that is shown in Table 77.
c. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

YES Verify the repair by continuing with MAP 5700: Repair verification on page 292.

Table 77. System Fibre Channel adapter connection hardware

Node: SAN Volume Controller 2145-DH8 port
Adapter connection hardware:
1. PCI Express riser card assembly 1
2. System board

Node: SAN Volume Controller 2145-DH8 port
Adapter connection hardware:
1. PCI Express riser card assembly 2
2. System board

11. (from step 1 on page 287)
If the operating speed is lower than the operating speed that is supported by the switch, a high number of link errors are being detected. To display the current speed of the link, see support/knowledgecenter/stpvgu_7.6.0/com.ibm.storge.svc.console.760.doc/svc_svcdetfibrenetspeed_23eef.html

Is the port operating at lower than the expected speed?

NO Repeat the check with the other Fibre Channel ports until the failing port is located. If no failing port is located, the problem no longer exists. Verify the repair by continuing with MAP 5700: Repair verification on page 292.

YES Perform the following steps:
a. Check the routing of the Fibre Channel cable to ensure that no damage exists and that the cable route contains no tight bends (no less than a 3-inch radius). Either reroute or replace the Fibre Channel cable.
b. Remove the Fibre Channel cable for 2 seconds and then reinsert it to force the Fibre Channel adapter to renegotiate its operating speed.
c. Recheck the speed of the Fibre Channel port. If it is now correct, the problem is resolved.
Otherwise, the problem might be caused by one of the following conditions:
- Four-port Fibre Channel HBA
- System SFP transceiver
- Fibre Channel switch gigabit interface converter (GBIC) or SFP transceiver
- Fibre Channel switch
Recheck the speed after you change any component until the problem is resolved and then verify the repair by continuing with MAP 5700: Repair verification on page 292.
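The replacement orders in the port-status steps above can be kept as a small lookup so that a technician's notes always show the next part to try. The following is a minimal sketch under stated assumptions: the dictionary name, the function name, and the wording of each entry are hypothetical and are not part of any IBM tool. For a status of not installed, it assumes that the installation checks in that step did not explain the failure.

```python
# Illustrative sketch only: a lookup of the part-replacement order that the
# Fibre Channel port-status steps list. All names here are hypothetical.
REPLACEMENT_ORDER = {
    "inactive": [
        "Fibre Channel cables from the system to the Fibre Channel network",
        "Fibre Channel fabric connections (SFP transceiver at the switch)",
        "System Fibre Channel SFP transceiver (same type as the original)",
        "Fibre Channel adapter assembly (Table 76)",
    ],
    "failed": [
        "Fibre Channel SFP transceiver (same type as the original)",
        "Fibre Channel adapter assembly (Table 76)",
    ],
    # Assumes the "installed correctly / not disturbed" checks found nothing.
    "not installed": [
        "Fibre Channel adapter assembly (Table 76)",
        "PCI Express riser card assembly (Table 77)",
        "System board (Table 77)",
    ],
}


def next_part(status, parts_already_replaced):
    """Return the next part to replace, or None when the list is exhausted."""
    order = REPLACEMENT_ORDER[status.strip().lower()]
    if parts_already_replaced < len(order):
        return order[parts_already_replaced]
    return None


if __name__ == "__main__":
    # First part to try for a port that reports a status of failed.
    print(next_part("failed", 0))
```

After each replacement, the procedure still requires that you recheck the port and, when the problem is fixed, continue with MAP 5700: Repair verification.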

MAP 5700: Repair verification

MAP 5700: Repair verification helps you to verify that field-replaceable units (FRUs) that you exchange for new FRUs, or repair actions that are completed, solve all the problems on the SAN Volume Controller.

Before you begin

If you are not familiar with these maintenance analysis procedures (MAPs), first read Chapter 10, Using the maintenance analysis procedures, on page 265.

You might have been sent here because you performed a repair and want to confirm that no other problems exist on the machine.

Procedure

1. Are the Power LEDs on all the nodes on? For more information about this LED, see Power LED on page 27.
NO Go to MAP 5000: Start on page 265.
YES Go to step 2.

2. (from step 1)
Are all the nodes displaying Cluster: or is the node status LED on?
NO Go to MAP 5000: Start on page 265.
YES Go to step 3.

3. (from step 2)
Using the SAN Volume Controller application for the system that you repaired, check the status of all configured managed disks (MDisks). Do all MDisks have a status of online?
NO If any MDisks have a status of offline, repair the MDisks. Use the problem determination procedure for the disk controller to repair the MDisk faults before you return to this MAP. If any MDisks have a status of degraded paths or degraded ports, repair any storage area network (SAN) and MDisk faults before you return to this MAP. If any MDisks show a status of excluded, include MDisks before you return to this MAP. Go to MAP 5000: Start on page 265.
YES Go to step 4.

4. (from step 3)
Using the SAN Volume Controller application on the repaired system, check the status of all configured volumes. Do all volumes have a status of online?
NO Go to step 5.
YES Go to step 6 on page 293.

5. (from step 4)
Following a repair of the SAN Volume Controller, a number of volumes are showing a status of offline. Volumes are held offline if SAN Volume Controller cannot confirm the integrity of the data. The volumes might be the target of a copy that did not complete, or cache write data that was not written back to disk might be lost. Determine why the volume is offline.
If the volume was the target of a copy that did not complete, you can start the copy again. Otherwise, write data might not be written to the disk, so its state cannot be verified. Your site procedures determine how data is restored to a known state. To bring the volume online, you must move all the offline disks to the recovery I/O group and then move them back to an active I/O group. Go to MAP 5000: Start on page 265.

6. (from step 4 on page 292)
You successfully repaired the SAN Volume Controller.

MAP 5800: Light path

MAP 5800: Light path helps you to solve hardware problems that prevent the SAN Volume Controller 2145-DH8 from booting.

Before you begin

If you are not familiar with these maintenance analysis procedures (MAPs), first read Chapter 10, Using the maintenance analysis procedures, on page 265.

You might be sent here because of the following situations:
- The Error LED on the operator-information panel is on or flashing.
- Another MAP sent you here.

Light path for SAN Volume Controller 2145-DH8

Light path diagnostics is a system of LEDs on top of the operator-information panel of the SAN Volume Controller 2145-DH8 node, which leads you to the failed component.

About this task

When an error occurs, LEDs are lit along the front of the operator-information panel, the light path diagnostics panel, then on the failed component. By viewing the LEDs in a particular order, you can often identify the source of the error.

LEDs that are lit to indicate an error remain lit when the server is turned off, if the node is connected to an operating power supply.

Ensure that the node is turned on, and then resolve any hardware errors that are indicated by the Error LED and light path LEDs:

Procedure

1. Is the System error LED 7, shown in Figure 45 on page 294, on the SAN Volume Controller 2145-DH8 operator-information panel on or flashing?

Figure 45. SAN Volume Controller 2145-DH8 operator-information panel
1 Power control button and LED
2 Ethernet LED
3 Locator button and LED
4 Release latch
5 Ethernet activity LEDs
6 Check log LED
7 System error LED

NO Reassess your symptoms and return to MAP 5000: Start on page 265.
YES Go to step 2.

2. (from step 1 on page 293)
Press the release latch, as shown in Figure 46, and open the light path diagnostics panel, which is shown in Figure 47 on page 295.

Figure 46. Press the release latch (callouts: Operator information panel, Light path diagnostics LEDs, Release latch)

Are one or more LEDs on the light path diagnostics panel on or flashing?

Figure 47. SAN Volume Controller 2145-DH8 light path diagnostics panel (panel labels: Checkpoint Code, Remind, Reset, Light Path Diagnostics)

NO Verify that the operator-information panel cable is correctly seated at both ends. If the error LED is still illuminated but no LEDs are illuminated on the light path diagnostics panel, replace parts in the following sequence:
a. Operator-information panel
b. System board
Verify the repair by continuing with MAP 5700: Repair verification on page 292.

YES See Table 78 on page 296 and complete the action that is specified for the specific light-path-diagnostics LEDs. Then, go to step 3 on page 300. Some actions require that you observe the state of LEDs on the system board. Figure 48 on page 296 shows the location of the system board LEDs. The fan LEDs are located next to each fan. To view the LEDs, complete the following actions:
a. Before you turn off the node, ensure that its data is mirrored and synchronized.
b. Identify and label all the cables that are attached to the node so that they can be replaced in the same port. Remove the node from the rack and place it on a flat, static-protective surface. For more information, see Removing the node from a rack.
c. Remove the top cover.
d. See Table 78 on page 296 and complete the action that is specified for the specific light-path-diagnostics LEDs. Then, go to step 3 on page 300.

Figure 48. SAN Volume Controller 2145-DH8 system board LEDs
Callouts: System error LED; Locator LED; Power LED; Enclosure management heartbeat LED; IMM2 heartbeat LED; Standby power LED; 10G Ethernet card error LED; Battery error LED; DIMM error LED (under the latches); DIMM 1-6 error LED (under the latches); Microprocessor 2 error LED; Microprocessor 1 error LED; Fan 4 error LED; Fan 3 error LED; DIMM 7-18 error LED (under the latches); Fan 2 error LED; System board error LED; Fan 1 error LED.

Table 78. Diagnostics panel LEDs

LED: The Error log or Check log LED (operator-information panel)
Description: An error occurs and cannot be isolated without completing certain procedures.
Action:
1. Plug in the VGA screen and the USB keyboard.
2. Check the IMM2 system event log and the system-error log for information about the error.
3. Save the log if necessary and clear the log afterward.

LED: System-error LED (operator-information panel)
Description: An error occurred.
Action:
1. Check the light-path-diagnostics LEDs and follow the instructions.
2. Check the IMM2 system event log and the system-error log for information about the error.
3. Save the log if necessary and clear the log afterward.

LED: PS
Description: When only the PS LED is lit, a power supply failed. PS + CONFIG: When both the PS and CONFIG LEDs are lit, the power supply configuration is not valid.
Action: The system might detect a power supply error. Complete the following steps to correct the problem:
1. Check the power supply with a lit yellow LED.
2. Make sure that the power supplies are seated correctly and plugged in a good AC outlet.
3. Remove a power supply to isolate the failed power supply.
4. Make sure that both power supplies installed in the server are of the same AC input voltage.
5. Replace the failed power supply.
If the PS LED and the CONFIG LED are lit, the system logs an invalid power configuration error. Make sure that both power supplies installed in the node are of the same rating or wattage.

LED: OVER SPEC
Description: The system consumption reaches the power supply over-current protection point or the power supplies are damaged.
Action:
1. If the power rail (A, B, C, D, E, F, G, and H) error was not detected, complete the following steps:
a. Use the IBM Systems Energy Estimator to determine the current system power consumption. For more information, go to the following website: support/tools/estimtor/energy/index.html
b. Replace the failed power supply.
2. If the power rail (A, B, C, D, E, F, G, and H) error was also detected, follow actions that are listed in MAP 5040: Power.

LED: PCI
Description: An error occurred on a PCI bus or on the system board. Another LED is lit next to a failing PCI slot.
Action:
1. Check the riser-card LEDs, the ServeRAID error LED, and the dual-port network adapter error LED to identify the component that caused the error.
2. Check the system-error log for information about the error.
3. If you cannot isolate the failing component by using the LEDs and the information in the system-error log, remove one component at a time. Then, restart the server after each component is removed.
4. Replace the following components, in the order that is shown, restarting the server each time:
- PCI riser cards
- ServeRAID adapter
- Network adapter
- (Trained technician only) System board
5. If the failure remains, contact your IBM service representative.

LED: NMI
Description: A nonmaskable interrupt occurred, or the NMI button was pressed.
Action:
1. Check the system-error log for information about the error.
2. Restart the server.

LED: CONFIG
Description:
- CONFIG + PS: An invalid power configuration error occurred.
- CONFIG + CPU: A hardware configuration error occurred.
- CONFIG + MEM: A hardware configuration error occurred.
- CONFIG + PCI: A hardware configuration error occurred.
- CONFIG + HDD: A disk drive error occurred.
Action:
If the CONFIG LED and the PS LED are lit, the system logs an invalid power configuration error. Make sure that both power supplies installed in the server are of the same rating or wattage.
If the CONFIG LED and the CPU LED are lit, complete the following steps to correct the problem:
1. Check the microprocessors that were installed to make sure that they are compatible with each other.
2. (Trained technician only) Replace the incompatible microprocessor.
3. Check the system-error logs for information about the error. Replace any component that is identified in the error log.
If the CONFIG LED and the MEM LED are lit, check the system-event log in the Setup utility or IMM2 error messages.
If the CONFIG LED and the PCI LED are lit, check the system-error logs for information about the error. Replace any component that is identified in the error log.
If the CONFIG LED and the HDD LED are lit, check the system-error logs for information about the error. Replace any component that is identified in the error log.

LED: LINK
Description: Reserved.

LED: CPU
Description: When only the CPU LED is lit, a microprocessor failed. When both the CPU and CONFIG LEDs are lit, the microprocessor configuration is invalid.
Action:
1. If the CONFIG LED is not lit, a microprocessor failure occurs. Complete the following steps:
a. (Trained technician only) Make sure that the failing microprocessor and its heat sink, which are indicated by a lit LED on the system board, are installed correctly.
b. (Trained technician only) Replace the failing microprocessor.
c. For more information, contact your IBM service representative.
2. If the CONFIG LED and the CPU LED are lit, the system logs an invalid microprocessor configuration error. Complete the following steps to correct the problem:
a. Check recently installed microprocessors to ensure that they are compatible with each other.
b. (Trained technician only) Replace any incompatible microprocessor.
c. Check the system-error logs for information about the error. Replace any component that is identified in the error log.

LED: MEM
Description: When only the MEM LED is lit, a memory error occurs. MEM + CONFIG: When both the MEM and CONFIG LEDs are lit, the memory configuration is not valid.
Action:
Note: Each time that you install or remove a DIMM, you must disconnect the node from the power source; then, wait 10 seconds before you restart the server.
If the CONFIG LED is not lit, the system might detect a memory error. Complete the following steps to correct the problem:
1. Update the node firmware.
2. Reseat or swap the DIMMs with a lit LED.
3. Check the system-event log in the Setup utility or IMM error messages.
4. Replace the failing DIMM.
If the MEM LED and the CONFIG LED are lit, check the system-event log in the Setup utility or IMM2 error messages.

LED: TEMP
Description: The system or the system component temperature exceeded a threshold level. A failing fan can cause the TEMP LED to be lit.
Action:
1. Make sure that the heat sink is seated correctly.
2. Determine whether a fan failed and replace the fan if necessary.
3. Make sure that the room temperature is not too high. See the environment requirements for the server temperature information.
4. Make sure that the air vents are not blocked.
5. Make sure that the heat sink or the fan on the adapter, or any other network adapter, is seated correctly. If the fan failed, replace it.
6. For more information, contact your IBM service representative.

LED: FAN
Description: A fan is either failed, operating too slowly, or is removed. The TEMP LED might also be lit.
Action:
1. Check whether your node is installed with the dual-port network adapter. If yes, make sure that your node complies with the configuration with four fans installed.
2. Reseat the failing fan, which is indicated by a lit LED near the fan connector on the system board.
3. Replace the failing fan.

LED: BOARD
Description: An error occurred on the system board or the system battery.
Action:
1. Check the LEDs on the system board to identify the component that caused the error. The BOARD LED can be lit due to any of the following reasons:
- Battery
- (Trained technician only) System board
2. Check the system-error log for information about the error.
3. Replace the failing component.

LED: HDD
Description: A hard disk drive that is failed or is missing.
Action:
1. Check the LEDs on the hard disk drives for the drive with a lit status LED and reseat the hard disk drive.
2. Reseat the hard disk drive backplane.
3. If the error remains, replace the following components one at a time, in the order that is listed, restarting the server after each:
a. Replace the hard disk drive.
b. Replace the hard disk drive backplane.
4. If the problem remains, contact your IBM service representative.

3. Continue with MAP 5700: Repair verification on page 292 to verify the correct operation.
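Table 78 distinguishes an LED that is lit on its own from the same LED lit together with CONFIG. As a minimal sketch of that distinction, the lookup below maps a set of lit light path LEDs to the descriptions given in the table. The function and dictionary names are hypothetical, the description strings are condensed from the table, and the handling of LEDs that have no CONFIG pairing in the table (for example TEMP lit together with CONFIG) is an assumption of this sketch, not something the guide specifies.

```python
# Illustrative sketch only: condensed descriptions from Table 78.
# Names are hypothetical; this is not an IBM diagnostic tool.
SINGLE_LED = {
    "PS": "A power supply failed.",
    "OVER SPEC": "Power consumption reached the over-current protection "
                 "point, or the power supplies are damaged.",
    "PCI": "An error occurred on a PCI bus or on the system board.",
    "NMI": "A nonmaskable interrupt occurred, or the NMI button was pressed.",
    "LINK": "Reserved.",
    "CPU": "A microprocessor failed.",
    "MEM": "A memory error occurred.",
    "TEMP": "A system or component temperature exceeded a threshold level.",
    "FAN": "A fan failed, is operating too slowly, or is removed.",
    "BOARD": "An error occurred on the system board or the system battery.",
    "HDD": "A hard disk drive failed or is missing.",
}

# Meaning when the LED is lit together with CONFIG.
WITH_CONFIG = {
    "PS": "The power supply configuration is not valid.",
    "CPU": "The microprocessor configuration is invalid.",
    "MEM": "The memory configuration is not valid.",
    "PCI": "A hardware configuration error occurred.",
    "HDD": "A disk drive error occurred.",
}


def describe(lit_leds):
    """Return one description line per lit LED, pairing with CONFIG if lit."""
    lit = {name.strip().upper() for name in lit_leds}
    config_lit = "CONFIG" in lit
    findings = []
    for led in sorted(lit - {"CONFIG"}):
        if config_lit and led in WITH_CONFIG:
            findings.append(f"{led} + CONFIG: {WITH_CONFIG[led]}")
        elif led in SINGLE_LED:
            # Assumption: LEDs without a CONFIG pairing keep their own meaning.
            findings.append(f"{led}: {SINGLE_LED[led]}")
        else:
            findings.append(f"{led}: not listed in Table 78")
    if config_lit and not findings:
        findings.append("CONFIG: lit alone, which Table 78 does not describe")
    return findings


if __name__ == "__main__":
    for line in describe(["CPU", "CONFIG"]):
        print(line)
```

The lookup only names the condition. The corrective actions, including the steps that are reserved for trained technicians, remain those listed in Table 78.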

Chapter 11. iSCSI performance analysis and tuning

This procedure provides a solution for Internet Small Computer Systems Interface (iSCSI) host performance problems while connected to a system and its connectivity to the network switch.

About this task

Some of the attributes and host parameters that might affect iSCSI performance:
- Transmission Control Protocol (TCP) Delayed ACK
- Ethernet jumbo frame
- Network bottleneck or oversubscription
- iSCSI session login balance
- Priority flow control (PFC) setting and bandwidth allocation for iSCSI on the network

Procedure

1. Disable the TCP delayed acknowledgment feature.
To disable this feature, refer to the OS/platform documentation for VMware and Windows.
The primary signature of this issue: read performance is significantly lower than write performance. Transmission Control Protocol (TCP) delayed acknowledgment is a technique that is used by some implementations of the TCP to improve network performance. However, in this scenario where the number of outstanding I/O is 1, the technique can significantly reduce I/O performance.
In essence, several ACK responses can be combined into a single response, reducing protocol overhead. As described in RFC 1122, a host can delay sending an ACK response by up to 500 ms. Additionally, with a stream of full-sized incoming segments, ACK responses must be sent for every second segment.
Important: The host must be rebooted for these settings to take effect. A few platforms (for example, standard Linux distributions) do not provide a way to disable this feature. However, the issue was resolved with the version 7.1 release, and no host configuration changes are required to manage TcpDelayedAck behavior.

2. Enable jumbo frame for iSCSI.
Jumbo frames are Ethernet frames with a size in excess of 1500 bytes. The maximum transmission unit (MTU) parameter is used to measure the size of jumbo frames.
The system supports 9000-bytes MTU. Refer to the CLI command cfgportip to enable jumbo frame. This command is disruptive as the link flips and the I/O operation through that port pauses.
The network must support jumbo frames end-to-end to be effective. Send a ping packet to be delivered without fragmentation to verify that the network supports jumbo frames. For example:
- Windows:
  ping -t <iscsi target ip> -S <iscsi initiator ip> -f -l <new mtu size - packet overhead (usually 36, might differ)>
  The following command is an example of a command that is used to check whether a 9000-bytes MTU is set correctly on a Windows 7 system:
  ping -t <iscsi target ip> -S <iscsi initiator ip> -f -l 8964
  The following output is an example of a successful reply:
  Reply from <iscsi target ip>: bytes=8964 time=1ms TTL=52
- Linux:
  ping -I <source iscsi initiator ip> -s <new mtu size - 28> -M do <iscsi target ip>
- ESXi:
  ping <iscsi target ip> -I <source iscsi initiator ip> -s <new mtu size - 28> -d

3. Verify the switch's port statistics where initiator/target ports are connected to make sure that packet drops are not high.
Review the network architecture to avoid any bottlenecks and oversubscription. The network needs to be balanced to avoid any packet drop; packet drop significantly reduces storage performance. Involve networking support to fix any such issues.

4. Optimize and utilize all iSCSI ports.
To optimize system resource utilization, all iSCSI ports must be used.
- Each port is assigned to one CPU, and by balancing the login, one can maximize CPU utilization and achieve better performance.
- Ideally, configure subnets equal to the number of iSCSI ports on the system node.
- Configure each port of a node with an IP on a different subnet and keep it the same for other nodes. The following example displays an ideal configuration:
  Node 1
  Port 1:
  Port 2:
  Port 3:
  Node 2
  Port 1:
  Port 2:
  Port 3:
Avoid situations where 50 hosts are logged in to port 1 and only five hosts are logged in to port 2. Use proper subnetting to achieve a balance between the number of sessions and redundancy.

5. Troubleshoot problems with PFC settings.
You do not need to enable PFC on the system. The system reads the data center bridging exchange (DCBx) packet and enables PFC for iSCSI automatically if it is enabled on the switch. In the lsportip command output, the fields lossless_iscsi and lossless_iscsi6 show [on/off] depending on whether PFC is enabled or not for iSCSI on the system.
If the fields lossless_iscsi and lossless_iscsi6 are showing off, it might be due to one of the following reasons:
a. VLAN is not set for that IP. Verify the following checks:
- For IP address type IPv4, check the vlan field in the lsportip output. It must not be blank.
- For IP address type IPv6, check the vlan_6 field in the lsportip output. It must not be blank.

- If the vlan and vlan_6 fields are blank, use Configuring VLAN for iSCSI to set the VLAN for the IP type.
b. Host flag is not set for that IP. Verify the following checks:
- For IP address type IPv4, check the host field in the lsportip output. It must be yes.
- For IP address type IPv6, check the host_6 field in the lsportip output. It must be yes.
- If the host and host_6 fields are not yes, use the cfgportip CLI command to set the host flag for the IP type.
c. PFC is not properly set on the switch.
If the VLAN is properly set, and the host flag is also set, but the lossless_iscsi or lossless_iscsi6 field is still showing off, some switch settings might be missing or incorrect. Verify the following settings in the switch:
- Priority tag is set for iSCSI traffic.
- PFC is enabled for the priority tag that is assigned to iSCSI CoS.
- DCBx is enabled on the switch.
Check the appropriate documentation:
- Consult the documentation for enabling PFC on your specific switch.
- Consult the documentation for enabling PFC on Red Hat Enterprise Linux (RHEL) and Windows hosts specific to your configuration.

6. Ensure that proper bandwidth is given to iSCSI on the network.
You can divide the bandwidth among the various types of traffic. It is important to assign proper bandwidth for good performance. To assign bandwidth for iSCSI traffic, you need to first enable the priority flow control for iSCSI.
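For the jumbo-frame check in step 2, the ping payload must be smaller than the MTU so that the packet fits in one frame without fragmentation. The following is a minimal sketch of that arithmetic. It assumes IPv4, where the IP header is 20 bytes and the ICMP header is 8 bytes (28 bytes in total), and it uses the 36-byte overhead that this guide quotes for Windows. The function names and the example IP addresses are hypothetical, and the Linux form assumes the -I option for selecting the source address. The sketch only builds the command strings; it does not run them.

```python
# Illustrative sketch only: computes the ping payload size for a
# do-not-fragment jumbo-frame test. Function names are hypothetical.
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def ping_payload(mtu, overhead=IPV4_HEADER_BYTES + ICMP_HEADER_BYTES):
    """Return the largest payload that fits in one frame of the given MTU."""
    if mtu <= overhead:
        raise ValueError("MTU must be larger than the packet overhead")
    return mtu - overhead


def jumbo_ping_command(platform, target_ip, initiator_ip, mtu=9000):
    """Build the ping command line for the given platform as a string."""
    if platform == "windows":
        # 36 is the overhead value that the guide quotes; it might differ.
        size = ping_payload(mtu, overhead=36)
        return f"ping -t {target_ip} -S {initiator_ip} -f -l {size}"
    if platform == "linux":
        return f"ping -I {initiator_ip} -s {ping_payload(mtu)} -M do {target_ip}"
    if platform == "esxi":
        return f"ping {target_ip} -I {initiator_ip} -s {ping_payload(mtu)} -d"
    raise ValueError(f"unsupported platform: {platform}")


if __name__ == "__main__":
    # 192.0.2.x addresses are documentation placeholders, not real targets.
    for name in ("windows", "linux", "esxi"):
        print(jumbo_ping_command(name, "192.0.2.10", "192.0.2.20"))
```

If the ping with the computed size succeeds and a slightly larger size fails, the path carries the configured MTU end-to-end.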

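The PFC checks in step 5 of the iSCSI procedure follow a fixed order: VLAN, then host flag, then switch settings. The following is a minimal sketch of that order, assuming that one row of lsportip output has already been read into a dictionary keyed by the field names that the procedure mentions (vlan, vlan_6, host, host_6, lossless_iscsi, lossless_iscsi6). The function name and the way the row is obtained are hypothetical; the sketch does not call the CLI or parse its output.

```python
# Illustrative sketch only: applies the step 5 checks to one lsportip row
# that is already available as a dictionary. The function is hypothetical.
def pfc_findings(row, ipv6=False):
    """Return the likely reasons why lossless iSCSI shows off for this IP."""
    vlan_field = "vlan_6" if ipv6 else "vlan"
    host_field = "host_6" if ipv6 else "host"
    lossless_field = "lossless_iscsi6" if ipv6 else "lossless_iscsi"

    if row.get(lossless_field, "").strip().lower() == "on":
        return []  # PFC is already enabled for iSCSI on this IP.

    findings = []
    if not row.get(vlan_field, "").strip():
        findings.append(f"{vlan_field} is blank: set the VLAN for this IP type")
    if row.get(host_field, "").strip().lower() != "yes":
        findings.append(f"{host_field} is not yes: set the host flag with cfgportip")
    if not findings:
        findings.append(
            "check the switch: priority tag for iSCSI, PFC for the iSCSI CoS, DCBx"
        )
    return findings


if __name__ == "__main__":
    example_row = {"vlan": "", "host": "no", "lossless_iscsi": "off"}
    for reason in pfc_findings(example_row):
        print(reason)
```

An empty result means that the field already shows on. A result that points at the switch corresponds to reason c in the procedure, where the VLAN and the host flag are both set.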

Appendix A. Accessibility features for the system

Accessibility features help users who have a disability, such as restricted mobility or limited vision, to use information technology products successfully.

Accessibility features

These are the major accessibility features for the system:
- You can use screen-reader software and a digital speech synthesizer to hear what is displayed on the screen. HTML documents are tested by using JAWS.
- This product uses standard Windows navigation keys.
- Interfaces are commonly used by screen readers.
- Keys are discernible by touch, but do not activate just by touching them.
- Industry-standard devices, ports, and connectors.
- You can attach alternative input and output devices.

The system online documentation and its related publications are accessibility-enabled. The accessibility features of the online documentation are described in Viewing information in the information center.

Keyboard navigation

You can use keys or key combinations for operations and to initiate menu actions that can also be done through mouse actions. You can go to the system online documentation from the keyboard by using the keyboard shortcuts for your browser or screen-reader software. See your browser or screen-reader software Help for a list of keyboard shortcuts that it supports.

IBM and accessibility

See the IBM Human Ability and Accessibility Center for more information about the commitment that IBM has to accessibility.


Appendix B. Where to find the Statement of Limited Warranty

The Statement of Limited Warranty is available in both hardcopy format and in the SAN Volume Controller IBM Knowledge Center.

The Statement of Limited Warranty is included (in hardcopy form) with your product. It can also be ordered from IBM (see Table 2 on page x for the part number).


Notices

This information was developed for products and services offered in the US. This material might be available from IBM in other languages. However, you may be required to own a copy of the product or product version in that language in order to access it.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY
U.S.A.

For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan, Ltd.
Nihonbashi-Hakozakicho, Chuo-ku
Tokyo, Japan

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors.
Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.

IBM may use or distribute any of the information you provide in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Director of Licensing
IBM Corporation
North Castle Drive, MD-NC119
Armonk, NY
US

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.

The performance data discussed herein is presented as derived under specific operating conditions. Actual results may vary.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

All IBM prices shown are IBM's suggested retail prices, are current and are subject to change without notice. Dealer prices may vary.

This information is for planning purposes only. The information herein is subject to change before the products described become available.

This information contains examples of data and reports used in daily business operations.
To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.

If you are viewing this information softcopy, the photographs and color illustrations may not appear.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at Copyright and trademark information.

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.

Linux and the Linux logo is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Other product and service names might be trademarks of IBM or other companies.

Product support statement

If you have an operating system, Hypervisor, platform or host attachment card in your environment, check the IBM System Storage Interoperation Center (SSIC) to confirm the support status for this product. SSIC can be found at interoperbility.wss.

Homologation statement

This product may not be certified in your country for connection by any means whatsoever to interfaces of public telecommunications networks. Further certification may be required by law prior to making any such connection. Contact an IBM representative or reseller for any questions.
Electromagnetic compatibility notices

The following Class A statements apply to IBM products and their features unless designated as electromagnetic compatibility (EMC) Class B in the feature information.

When attaching a monitor to the equipment, you must use the designated monitor cable and any interference suppression devices that are supplied with the monitor.

326 Canada Notice

CAN ICES-3 (A)/NMB-3(A)

European Community and Morocco Notice

This product is in conformity with the protection requirements of Directive 2014/30/EU of the European Parliament and of the Council on the harmonization of the laws of the Member States relating to electromagnetic compatibility. IBM cannot accept responsibility for any failure to satisfy the protection requirements resulting from a non-recommended modification of the product, including the fitting of non-IBM option cards.

This product may cause interference if used in residential areas. Such use must be avoided unless the user takes special measures to reduce electromagnetic emissions to prevent interference to the reception of radio and television broadcasts.

Warning: This equipment is compliant with Class A of CISPR 32. In a residential environment this equipment may cause radio interference.

Germany Notice

German-language EU notice: Notice for Class A devices, EU Directive on Electromagnetic Compatibility

This product conforms to the protection requirements of EU Directive 2014/30/EU on the harmonization of the laws of the EU Member States relating to electromagnetic compatibility, and it complies with the limits of EN Class A.

To ensure this, the devices must be installed and operated as described in the manuals. In addition, only cables recommended by IBM may be connected. IBM accepts no responsibility for compliance with the protection requirements if the product is modified without IBM's consent, or if expansion components from third-party manufacturers are plugged in or installed without IBM's recommendation.

EN Class A devices must carry the following warning:

Warning: This is a Class A device. This device may cause radio interference in residential areas; in that case, the operator may be required to take appropriate measures and to bear the cost of them.
Germany: Compliance with the Act on the Electromagnetic Compatibility of Equipment

This product complies with the German Act on the Electromagnetic Compatibility of Equipment (EMVG). This act implements EU Directive 2014/30/EU in the Federal Republic of Germany.

Certificate of conformity under the German Act on the Electromagnetic Compatibility of Equipment (EMVG) (or the EMC Directive 2014/30/EU) for Class A devices

This device is entitled to bear the EC conformity mark (CE) in accordance with the German EMVG.

The manufacturer is responsible for compliance with the EMC regulations:

327 International Business Machines Corp. New Orchard Road Armonk, New York Tel:

The manufacturer's responsible contact in the EU is: IBM Deutschland GmbH Technical Relations Europe, Abteilung M456 IBM-Allee 1, Ehningen, Germany Tel: e-mail:

General information: The device meets the protection requirements of EN and EN Class A.

Japan Electronics and Information Technology Industries Association (JEITA) Notice

This statement applies to products less than or equal to 20 A per phase.

This statement applies to products greater than 20 A, single phase.

This statement applies to products greater than 20 A per phase, three-phase.

328 Japan Voluntary Control Council for Interference (VCCI) Notice

Korea Notice

People's Republic of China Notice

Russia Notice

Troubleshooting Guide

Troubleshooting Guide IBM System Storge SAN Volume Controller Troubleshooting Guide GC27-2284-06 Note Before using this informtion nd the product it supports, red the informtion in Notices on pge 351. This edition pplies to

More information

McAfee Network Security Platform

McAfee Network Security Platform Mnger Applince Quick Strt Guide Revision B McAfee Network Security Pltform This guide is high-level description of how to instll nd configure the Mnger Applince. For more detiled instlltion informtion,

More information

Maintenance Guide PN 38L6404, EC M13180

Maintenance Guide PN 38L6404, EC M13180 IBMTS7610ndTS7620ProtecTIER Dedupliction Applince Express Mintennce Guide PN 38L6404, EC M13180 V3.3.6 GA32-2232-04 IBMTS7610ndTS7620ProtecTIER Dedupliction Applince Express Mintennce Guide PN 38L6404,

More information

EasyMP Multi PC Projection Operation Guide

EasyMP Multi PC Projection Operation Guide EsyMP Multi PC Projection Opertion Guide Contents 2 Introduction to EsyMP Multi PC Projection 5 EsyMP Multi PC Projection Fetures... 6 Connection to Vrious Devices... 6 Four-Pnel Disply... 6 Chnge Presenters

More information

Epson Projector Content Manager Operation Guide

Epson Projector Content Manager Operation Guide Epson Projector Content Mnger Opertion Guide Contents 2 Introduction to the Epson Projector Content Mnger Softwre 3 Epson Projector Content Mnger Fetures... 4 Setting Up the Softwre for the First Time

More information

EasyMP Network Projection Operation Guide

EasyMP Network Projection Operation Guide EsyMP Network Projection Opertion Guide Contents 2 Introduction to EsyMP Network Projection EsyMP Network Projection Fetures... 5 Disply Options... 6 Multi-Screen Disply Function... 6 Movie Sending Mode...

More information

Epson iprojection Operation Guide (Windows/Mac)

Epson iprojection Operation Guide (Windows/Mac) Epson iprojection Opertion Guide (Windows/Mc) Contents 2 Introduction to Epson iprojection 5 Epson iprojection Fetures... 6 Connection to Vrious Devices... 6 Four-Pnel Disply... 6 Chnge Presenters nd Projection

More information

File Manager Quick Reference Guide. June Prepared for the Mayo Clinic Enterprise Kahua Deployment

File Manager Quick Reference Guide. June Prepared for the Mayo Clinic Enterprise Kahua Deployment File Mnger Quick Reference Guide June 2018 Prepred for the Myo Clinic Enterprise Khu Deployment NVIGTION IN FILE MNGER To nvigte in File Mnger, users will mke use of the left pne to nvigte nd further pnes

More information

vcloud Director Tenant Portal Guide vcloud Director 9.1

vcloud Director Tenant Portal Guide vcloud Director 9.1 vcloud Director Tennt Portl Guide vcloud Director 9.1 You cn find the most up-to-dte technicl documenttion on the VMwre website t: https://docs.vmwre.com/ If you hve comments bout this documenttion, submit

More information

vcloud Director Service Provider Admin Portal Guide vcloud Director 9.1

vcloud Director Service Provider Admin Portal Guide vcloud Director 9.1 vcloud Director Service Provider Admin Portl Guide vcloud Director 9. vcloud Director Service Provider Admin Portl Guide You cn find the most up-to-dte technicl documenttion on the VMwre website t: https://docs.vmwre.com/

More information

Service Guide PN 38L6645, EC M13180A

Service Guide PN 38L6645, EC M13180A IBM TS7610 nd TS7620 ProtecTIER Dedupliction Applince Express V3.3.6.1 Service Guide PN 38L6645, EC M13180A GA32-0915-10 PN 38L6645, EC M13180A, EC Dte 17 November, 2014 This edition pplies to ProtecTIER

More information

McAfee Network Security Platform

McAfee Network Security Platform NTBA Applince T-200 nd T-500 Quick Strt Guide Revision B McAfee Network Security Pltform 1 Instll the mounting rils Position the mounting rils correctly nd instll them t sme levels. At the front of the

More information

Simrad ES80. Software Release Note Introduction

Simrad ES80. Software Release Note Introduction Simrd ES80 Softwre Relese 1.3.0 Introduction This document descries the chnges introduced with the new softwre version. Product: ES80 Softwre version: 1.3.0 This softwre controls ll functionlity in the

More information

Engineer To Engineer Note

Engineer To Engineer Note Engineer To Engineer Note EE-188 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit

More information

Migrating vrealize Automation to 7.3 or March 2018 vrealize Automation 7.3

Migrating vrealize Automation to 7.3 or March 2018 vrealize Automation 7.3 Migrting vrelize Automtion to 7.3 or 7.3.1 15 Mrch 2018 vrelize Automtion 7.3 You cn find the most up-to-dte technicl documenttion on the VMwre website t: https://docs.vmwre.com/ If you hve comments bout

More information

McAfee Network Security Platform

McAfee Network Security Platform 10/100/1000 Copper Active Fil-Open Bypss Kit Guide Revision E McAfee Network Security Pltform This document descries the contents nd how to instll the McAfee 10/100/1000 Copper Active Fil-Open Bypss Kit

More information

pdfapilot Server 2 Manual

pdfapilot Server 2 Manual pdfpilot Server 2 Mnul 2011 by clls softwre gmbh Schönhuser Allee 6/7 D 10119 Berlin Germny info@cllssoftwre.com www.cllssoftwre.com Mnul clls pdfpilot Server 2 Pge 2 clls pdfpilot Server 2 Mnul Lst modified:

More information

Information regarding

Information regarding Informtion regrding LANCOM Advnced VPN Client 3.13 Copyright (c) 2002-2017 LANCOM Systems GmbH, Wuerselen (Germny) LANCOM Systems GmbH does not tke ny gurntee nd libility for softwre not developed, mnufctured

More information

COMPUTER EDUCATION TECHNIQUES, INC. (MS_W2K3_SERVER ) SA:

COMPUTER EDUCATION TECHNIQUES, INC. (MS_W2K3_SERVER ) SA: In order to lern which questions hve een nswered correctly: 1. Print these pges. 2. Answer the questions. 3. Send this ssessment with the nswers vi:. FAX to (212) 967-3498. Or. Mil the nswers to the following

More information

McAfee Network Security Platform

McAfee Network Security Platform Revision D McAfee Network Security Pltform (NS5x00 Quick Strt Guide) This quick strt guide explins how to quickly set up nd ctivte your McAfee Network Security Pltform NS5100 nd NS5200 Sensors in inline

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology

More information

c360 Add-On Solutions

c360 Add-On Solutions c360 Add-On Solutions Functionlity Dynmics CRM 2011 c360 Record Editor Reltionship Explorer Multi-Field Serch Alerts Console c360 Core Productivity Pck "Does your tem resist using CRM becuse updting dt

More information

vcloud Director Service Provider Admin Portal Guide 04 OCT 2018 vcloud Director 9.5

vcloud Director Service Provider Admin Portal Guide 04 OCT 2018 vcloud Director 9.5 vcloud Director Service Provider Admin Portl Guide 04 OCT 208 vcloud Director 9.5 You cn find the most up-to-dte technicl documenttion on the VMwre website t: https://docs.vmwre.com/ If you hve comments

More information

TECHNICAL NOTE MANAGING JUNIPER SRX PCAP DATA. Displaying the PCAP Data Column

TECHNICAL NOTE MANAGING JUNIPER SRX PCAP DATA. Displaying the PCAP Data Column TECHNICAL NOTE MANAGING JUNIPER SRX PCAP DATA APRIL 2011 If your STRM Console is configured to integrte with the Juniper JunOS Pltform DSM, STRM cn receive, process, nd store Pcket Cpture (PCAP) dt from

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology

More information

vcloud Director Tenant Portal Guide vcloud Director 9.0

vcloud Director Tenant Portal Guide vcloud Director 9.0 vcloud Director Tennt Portl Guide vcloud Director 9.0 vcloud Director Tennt Portl Guide You cn find the most up-to-dte technicl documenttion on the VMwre We site t: https://docs.vmwre.com/ The VMwre We

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

Welch Allyn CardioPerfect Workstation Installation Guide

Welch Allyn CardioPerfect Workstation Installation Guide Welch Allyn CrdioPerfect Worksttion Instlltion Guide INSTALLING CARDIOPERFECT WORKSTATION SOFTWARE & ACCESSORIES ON A SINGLE PC For softwre version 1.6.6 or lter For network instlltion, plese refer to

More information

NetBackup 5200 Release 1.1 Quick Installation Guide

NetBackup 5200 Release 1.1 Quick Installation Guide NetBckup 5200 Relese 1.1 Quick Instlltion Guide Revision: 01 Dte: 2010-08-30 Environment Check Before Instlltion You cn instll the NetBckup 5200 pplince in stndrd 19-inch cbinet (with n AC distribution

More information

Troubleshooting Guide

Troubleshooting Guide IBM System Storage SAN Volume Controller Troubleshooting Guide GC27-2284-04 Note Before using this information and the product it supports, read the information in Notices on page 319. This edition applies

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

How to Design REST API? Written Date : March 23, 2015

How to Design REST API? Written Date : March 23, 2015 Visul Prdigm How Design REST API? Turil How Design REST API? Written Dte : Mrch 23, 2015 REpresenttionl Stte Trnsfer, n rchitecturl style tht cn be used in building networked pplictions, is becoming incresingly

More information

Zenoss Core Installation Guide

Zenoss Core Installation Guide Zenoss Core Instlltion Guide Relese 5.2.1 Zenoss, Inc. www.zenoss.com Zenoss Core Instlltion Guide Copyright 2017 Zenoss, Inc. All rights reserved. Zenoss nd the Zenoss logo re trdemrks or registered trdemrks

More information

Release Notes for. LANCOM Advanced VPN Client 4.10 Rel

Release Notes for. LANCOM Advanced VPN Client 4.10 Rel Relese Notes for LANCOM Advnced VPN Client 4.10 Rel Copyright (c) 2002-2018 LANCOM Systems GmbH, Wuerselen (Germny) LANCOM Systems GmbH does not tke ny gurntee nd libility for softwre not developed, mnufctured

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

EasyMP Network Projection Operation Guide

EasyMP Network Projection Operation Guide EsyMP Network Projection Opertion Guide Contents 2 About EsyMP Network Projection Functions of EsyMP Network Projection... 5 Vrious Screen Trnsfer Functions... 5 Instlling the Softwre... 6 Softwre Requirements...6

More information

McAfee Network Security Platform

McAfee Network Security Platform NS7x00 Quick Strt Guide Revision D McAfee Network Security Pltform This quick strt guide explins how to quickly set up nd ctivte your McAfee Network Security Pltform NS7100, NS7200, nd NS7300 Sensors in

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

IBM XIV Storage System Gen3 Models 281x-11x, 281x-21x, and 281x-314. Planning Guide IBM SC

IBM XIV Storage System Gen3 Models 281x-11x, 281x-21x, and 281x-314. Planning Guide IBM SC IBM XIV Storge System Gen3 Models 281x-11x, 281x-21x, nd 281x-314 Plnning Guide IBM SC27-5412-06 Note Before using this informtion nd the product it supports, red the informtion in Sfety nd environmentl

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd business. Introducing technology

More information

Zenoss Resource Manager Installation Guide

Zenoss Resource Manager Installation Guide Zenoss Resource Mnger Instlltion Guide Relese 6.1.1 Zenoss, Inc. www.zenoss.com Zenoss Resource Mnger Instlltion Guide Copyright 2018 Zenoss, Inc. All rights reserved. Zenoss, Own IT, nd the Zenoss logo

More information

STRM Installation Guide

STRM Installation Guide Security Thret Response Mnger Relese 2013.2 Juniper Networks, Inc. 1194 North Mthild Avenue Sunnyvle, CA 94089 USA 408-745-2000 www.juniper.net Published: 2014-09-15 Copyright Notice Copyright 2014 Juniper

More information

Installation and Upgrade on Windows Server 2008 When the Secondary Server is Virtual VMware vcenter Server Heartbeat 6.5 Update 1

Installation and Upgrade on Windows Server 2008 When the Secondary Server is Virtual VMware vcenter Server Heartbeat 6.5 Update 1 Instlltion nd Upgrde on Windows Server 2008 When the Secondry Server is Virtul VMwre vcenter Server Hertet 6.5 Updte 1 This document supports the version of ech product listed nd supports ll susequent

More information

Passwords Passwords Changing Passwords... <New Passwords> 130 Setting UIM PIN... <UIM PIN/UIM PIN2> 130 Unlocking a Locked UIM...

Passwords Passwords Changing Passwords... <New Passwords> 130 Setting UIM PIN... <UIM PIN/UIM PIN2> 130 Unlocking a Locked UIM... Psswords Psswords... 128 Chnging Psswords... 130 Setting UIM PIN... 130 Unlocking Locked UIM... 131 Restricting the Hndset Opertions Locking Function... 131 Locking the

More information

E201 USB Encoder Interface

E201 USB Encoder Interface Dt sheet Issue 3, 14 th July 2014 E201 USB Encoder Interfce E201-9Q - incrementl E201-9S - bsolute bsolute SSI BiSS-C mode (unidirectionl) B Z Clock Dt M SLO The E201 is single chnnel USB encoder interfce

More information

05-247r2 SAT: Add 16-byte CDBs and PIO modes 1 September 2005

05-247r2 SAT: Add 16-byte CDBs and PIO modes 1 September 2005 To: T10 Technicl Committee From: Robert Sheffield, Intel (robert.l.sheffield@intel.com) Dte: 1 September 2005 Subject: 05-247r2 SAT: Add 16-byte CDBs nd PIO modes Revision history Revision 0 (16 June 2005)

More information

Zenoss Core Installation Guide

Zenoss Core Installation Guide Zenoss Core Instlltion Guide Relese 6.1.0 Zenoss, Inc. www.zenoss.com Zenoss Core Instlltion Guide Copyright 2018 Zenoss, Inc. All rights reserved. Zenoss, Own IT, nd the Zenoss logo re trdemrks or registered

More information

E201 USB Encoder Interface

E201 USB Encoder Interface Dt sheet Issue 4, 24 th ugust 2015 E201 USB Encoder Interfce E201-9Q incrementl E201-9S bsolute bsolute SSI BiSS-C mode (unidirectionl) B Z Clock Dt M SLO The E201 is single chnnel USB encoder interfce

More information

NOTES. Figure 1 illustrates typical hardware component connections required when using the JCM ICB Asset Ticket Generator software application.

NOTES. Figure 1 illustrates typical hardware component connections required when using the JCM ICB Asset Ticket Generator software application. ICB Asset Ticket Genertor Opertor s Guide Septemer, 2016 Septemer, 2016 NOTES Opertor s Guide ICB Asset Ticket Genertor Softwre Instlltion nd Opertion This document contins informtion for downloding, instlling,

More information

JCM TRAINING OVERVIEW ivizion Banknote Acceptor

JCM TRAINING OVERVIEW ivizion Banknote Acceptor ivizion Bnknote Acceptor JCM Trining Overview Februry, 2018 Februry, 2018 ivizion-100 Prts List Prt Number - Description 701-000269R ivizion/uba Power Supply 701-100112R ivizion/tbv UAC Kit (lternte power

More information

IZT DAB ContentServer, IZT S1000 Testing DAB Receivers Using ETI

IZT DAB ContentServer, IZT S1000 Testing DAB Receivers Using ETI IZT DAB ContentServer, IZT S1000 Testing DAB Receivers Using ETI Appliction Note Rel-time nd offline modultion from ETI files Generting nd nlyzing ETI files Rel-time interfce using EDI/ETI IZT DAB CONTENTSERVER

More information

JCM TRAINING OVERVIEW DBV Series DBV-500 Banknote Validator

JCM TRAINING OVERVIEW DBV Series DBV-500 Banknote Validator November, 2016 JCM TRAINING OVERVIEW DBV Series DBV-500 Bnknote Vlidtor Phone (800) 683-7248 (702) 651-0000 Fx (702) 651-0214 E-mil support@jcmglobl.com www.jcmglobl.com 1 DBV-500 Bnknote Vlidtor Tble

More information

LoRaWANTM Concentrator Card Mini PCIe LRWCCx-MPCIE-868

LoRaWANTM Concentrator Card Mini PCIe LRWCCx-MPCIE-868 Dt Sheet LoRWANTM Concentrtor Crd Mini PCIe LRWCCxMPCIE868 LoRWANTM Concentrtor Crd bsed on Semtech SX30 nd SX308 Chips in Mini PCIe Form Fctor The nfuse LRWCCxMPCIE fmily of crds enble OEMs nd system

More information

Sage CRM 2017 R3 Software Requirements and Mobile Features. Updated: August 2017

Sage CRM 2017 R3 Software Requirements and Mobile Features. Updated: August 2017 Sge CRM 2017 R3 Softwre Requirements nd Mobile Fetures Updted: August 2017 2017, The Sge Group plc or its licensors. Sge, Sge logos, nd Sge product nd service nmes mentioned herein re the trdemrks of The

More information

Engineer-to-Engineer Note

Engineer-to-Engineer Note Engineer-to-Engineer Note EE-204 Technicl notes on using Anlog Devices DSPs, processors nd development tools Visit our Web resources http://www.nlog.com/ee-notes nd http://www.nlog.com/processors or e-mil

More information

Chapter 7. Routing with Frame Relay, X.25, and SNA. 7.1 Routing. This chapter discusses Frame Relay, X.25, and SNA Routing. Also see the following:

Chapter 7. Routing with Frame Relay, X.25, and SNA. 7.1 Routing. This chapter discusses Frame Relay, X.25, and SNA Routing. Also see the following: Chpter 7 Routing with Frme Rely, X.25, nd SNA This chpter discusses Frme Rely, X.25, nd SNA Routing. Also see the following: Section 4.2, Identifying the BANDIT in the Network Section 4.3, Defining Globl

More information

Engineer To Engineer Note

Engineer To Engineer Note Engineer To Engineer Note EE-208 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit

More information

Sage CRM 2018 R1 Software Requirements and Mobile Features. Updated: May 2018

Sage CRM 2018 R1 Software Requirements and Mobile Features. Updated: May 2018 Sge CRM 2018 R1 Softwre Requirements nd Mobile Fetures Updted: My 2018 2018, The Sge Group plc or its licensors. Sge, Sge logos, nd Sge product nd service nmes mentioned herein re the trdemrks of The Sge

More information

Zenoss Resource Manager Installation Guide

Zenoss Resource Manager Installation Guide Zenoss Resource Mnger Instlltion Guide Relese 5.2.5 Zenoss, Inc. www.zenoss.com Zenoss Resource Mnger Instlltion Guide Copyright 2017 Zenoss, Inc. All rights reserved. Zenoss nd the Zenoss logo re trdemrks

More information

EasyMP Multi PC Projection Operation Guide

EasyMP Multi PC Projection Operation Guide EsyMP Multi PC Projection Opertion Guide Contents 2 About EsyMP Multi PC Projection Meeting Styles Proposed by EsyMP Multi PC Projection... 5 Holding Meetings Using Multiple Imges... 5 Holding Remote Meetings

More information

McAfee Network Security Platform

McAfee Network Security Platform NS7x50 Quick Strt Guide Revision B McAfee Network Security Pltform This quick strt guide explins how to quickly set up nd ctivte your McAfee Network Security Pltform NS7150, NS7250, nd NS7350 Sensors in

More information

Enginner To Engineer Note

Enginner To Engineer Note Technicl Notes on using Anlog Devices DSP components nd development tools from the DSP Division Phone: (800) ANALOG-D, FAX: (781) 461-3010, EMAIL: dsp_pplictions@nlog.com, FTP: ftp.nlog.com Using n ADSP-2181

More information

Release Notes for. LANtools Software Release RU6

Release Notes for. LANtools Software Release RU6 Relese Notes for LANtools Softwre Relese 10.12 RU6 Copyright (c) 2002-2018 LANCOM Systems GmbH, Wuerselen (Germny) LANCOM Systems GmbH does not tke ny gurntee nd libility for softwre not developed, mnufctured

More information

Address/Data Control. Port latch. Multiplexer

Address/Data Control. Port latch. Multiplexer 4.1 I/O PORT OPERATION As discussed in chpter 1, ll four ports of the 8051 re bi-directionl. Ech port consists of ltch (Specil Function Registers P0, P1, P2, nd P3), n output driver, nd n input buffer.

More information

Upgrading from vrealize Automation to 7.3 or May 2018 vrealize Automation 7.3

Upgrading from vrealize Automation to 7.3 or May 2018 vrealize Automation 7.3 Upgrding from vrelize Automtion 6.2.5 to 7.3 or 7.3.1 03 My 2018 vrelize Automtion 7.3 You cn find the most up-to-dte technicl documenttion on the VMwre wesite t: https://docs.vmwre.com/ If you hve comments

More information

VoIP for the Small Business

VoIP for the Small Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become vible solution for even the

More information

VoIP for the Small Business

VoIP for the Small Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. TechAdvisory.org SME Reports sponsored by Voice over Internet Protocol (VoIP)

More information

VoIP for the Small Business

VoIP for the Small Business VoIP for the Smll Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become

More information

McAfee Network Security Platform

McAfee Network Security Platform Revision C McAfee Network Security Pltform (40 Gigit Active Fil-Open Bypss Kit Guide) McAfee Network Security Pltform IPS Sensors, when deployed in-line, route ll incoming trffic through designted port

More information

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION Overview LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION 4.3.1.0 Due to the complex nture of this updte, plese fmilirize yourself with these instructions nd then contct RGB Spectrum

More information

Operational Verification. 21 AUG 2018 VMware Validated Design 4.3 VMware Validated Design for Software-Defined Data Center 4.3

Operational Verification. 21 AUG 2018 VMware Validated Design 4.3 VMware Validated Design for Software-Defined Data Center 4.3 Opertionl Verifiction 21 AUG 2018 VMwre Vlidted Design 4.3 VMwre Vlidted Design for Softwre-Defined Dt Center 4.3 Opertionl Verifiction You cn find the most up-to-dte technicl documenttion on the VMwre

More information

Zenoss Resource Manager Installation Guide

Zenoss Resource Manager Installation Guide Zenoss Resource Mnger Instlltion Guide Relese 5.3.0 Zenoss, Inc. www.zenoss.com Zenoss Resource Mnger Instlltion Guide Copyright 2017 Zenoss, Inc. All rights reserved. Zenoss, Own IT, nd the Zenoss logo

More information

VoIP for the Small Business

VoIP for the Small Business VoIP for the Smll Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become

More information

VoIP for the Small Business

VoIP for the Small Business VoIP for the Smll Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become

More information

Troubleshooting Guide

Troubleshooting Guide IBM System Storage SAN Volume Controller Version 6.4.0 Troubleshooting Guide GC27-2284-03 Note Before using this information and the product it supports, read the information in Notices on page 267. This

More information

Installation Guide AT-VTP-800

Installation Guide AT-VTP-800 Velocity 8 Touch Pnel The Atlon -BL nd -WH re 8 touch pnels in blck nd white, respectively, for the Atlon Velocity Control System. They feture contemporry, refined styling for modern presenttion environments

More information

VoIP for the Small Business

VoIP for the Small Business VoIP for the Smll Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become

More information

Registering as an HPE Reseller

Registering as an HPE Reseller Registering s n HPE Reseller Quick Reference Guide for new Prtners Mrch 2019 Registering s new Reseller prtner There re four min steps to register on the Prtner Redy Portl s new Reseller prtner: Appliction

More information

Zenoss Resource Manager Installation Guide

Zenoss Resource Manager Installation Guide Zenoss Resource Mnger Instlltion Guide Relese 5.3.2 Zenoss, Inc. www.zenoss.com Zenoss Resource Mnger Instlltion Guide Copyright 2017 Zenoss, Inc. All rights reserved. Zenoss, Own IT, nd the Zenoss logo

More information

Upgrading from vrealize Automation 6.2 to 7.1

Upgrading from vrealize Automation 6.2 to 7.1 Upgrding from vrelize Automtion 6.2 to 7.1 vrelize Automtion 7.1 This document supports the version of ech product listed nd supports ll susequent versions until the document is replced y new edition.

More information

Engineer-to-Engineer Note

Engineer-to-Engineer Note Engineer-to-Engineer Note EE-245 Technicl notes on using Anlog Devices DSPs, processors nd development tools Contct our technicl support t dsp.support@nlog.com nd t dsptools.support@nlog.com Or visit our

More information

Sage CRM 2017 R2 Software Requirements and Mobile Features. Revision: IMP-MAT-ENG-2017R2-2.0 Updated: August 2017

Sage CRM 2017 R2 Software Requirements and Mobile Features. Revision: IMP-MAT-ENG-2017R2-2.0 Updated: August 2017 Sge CRM 2017 R2 Softwre Requirements nd Mobile Fetures Revision: IMP-MAT-ENG-2017R2-2.0 Updted: August 2017 2017, The Sge Group plc or its licensors. Sge, Sge logos, nd Sge product nd service nmes mentioned

More information

VoIP for the Small Business

VoIP for the Small Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become vible solution for even the

More information

VoIP for the Small Business

VoIP for the Small Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become vible solution for even the

More information

Registering as a HPE Reseller. Quick Reference Guide for new Partners in Asia Pacific

Registering as a HPE Reseller. Quick Reference Guide for new Partners in Asia Pacific Registering s HPE Reseller Quick Reference Guide for new Prtners in Asi Pcific Registering s new Reseller prtner There re five min steps to e new Reseller prtner. Crete your Appliction Copyright 2017 Hewlett

More information

VoIP for the Small Business

VoIP for the Small Business VoIP for the Smll Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become

More information

VoIP for the Small Business

VoIP for the Small Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become vible solution for even the

More information

EasyMP Multi PC Projection Operation Guide

EasyMP Multi PC Projection Operation Guide EsyMP Multi PC Projection Opertion Guide Contents 2 About EsyMP Multi PC Projection Meeting Styles Proposed by EsyMP Multi PC Projection... 5 Holding Meetings Using Multiple Imges... 5 Holding Remote Meetings

More information

Tool Vendor Perspectives SysML Thus Far

Tool Vendor Perspectives SysML Thus Far. Frontiers 2008 Panel, Georgia Tech, 05-13-08. Hans-Peter Hoffmann, Ph.D., Chief Systems Methodologist, Telelogic, Systems & Software Modeling Business Unit. Peter.Hoffmann@telelogic.com

More information

HP Unified Functional Testing

HP Unified Functional Testing. Software Version: 11.50. Enter the operating system(s), e.g. Windows. Tutorial for GUI Testing. Document Release Date: December 2012. Software Release Date: December 2012. Legal Notices: Warranty

More information

3 Talk to Us First. Reasons You Should. Non-Contact Temperature Measurement Solutions

3 Talk to Us First. Reasons You Should. Non-Contact Temperature Measurement Solutions. Reason #1: Fixed System Spot Thermometers. FACTS: Versatile technologies are a must for the ever-expanding list of applications. Land

More information

Backup and Restore. 20 NOV 2018 VMware Validated Design 4.3 VMware Validated Design for Software-Defined Data Center 4.3

Backup and Restore. 20 NOV 2018. VMware Validated Design 4.3. VMware Validated Design for Software-Defined Data Center 4.3. You can find the most up-to-date technical documentation on the VMware website at: https://docs.vmware.com/ If you have

More information

Setup Guide. * Values enclosed in [ ] are for when the stand is attached. ipf650/ipf655. ipf750/ipf755. d. 3-inch paper core attachment L

Setup Guide. * Values enclosed in [ ] are for when the stand is attached. ipf650/ipf655. ipf750/ipf755. d. 3-inch paper core attachment L. Introductory Information. Setup Guide ENG. Read this manual before attempting to operate the printer. Keep this manual in a handy location for future reference. Introduction. Caution: Instructions in this Setup Guide show

More information