Server-Related Faults

Similar documents
Memory-Related Faults

Cisco UCS Integrated Management Controller Faults Reference Guide

UCS-E160DP Double-wide E-Series Server, 6 core CPU, with PCIe

Managing Blade Servers

Managing Blade Servers

Troubleshooting. LEDs CHAPTER

Dell EMC OpenManage Message Reference Guide. Version 9.1

Troubleshooting. Introduction CHAPTER

Dell OpenManage Connection for Tivoli Enterprise Console Version 3.5. User s Guide. support.dell.com

The operator has activated this LED to identify this chassis. This chassis is not being identified. Fabric modules are all operational.

Server Administrator Version 8.5 Messages Reference Guide

Functional safety in BATTERY MANAGEMENT SYSTEMS

DGS-1510 Series Gigabit Ethernet SmartPro Switch Web UI Reference Guide. Figure 10-1 Cable Diagnostics window

Server Administrator Version 7.4 Messages Reference Guide

Hardware Monitoring. Monitoring a Fabric Interconnect

Managing the Capability Catalog in Cisco UCS Manager

Troubleshooting the Cisco RPS

Notice for Express Report Service (MG)

Troubleshooting the System Hardware

One of the following problems exist: At least one power supply LED is red. At least one power supply is down. Fan trays are all operational.

Troubleshooting Server Disk Drive Detection and Monitoring

Troubleshooting issues with Cisco IPMI Extensions

Introduction to UCS Faults

OpenManage Management Pack for vrealize Operations Manager version 1.1. User s Guide

OpenManage Management Pack for vrealize Operations Manager version 1.0 User s Guide

Dell PowerEdge T110 Systems. Hardware Owner s Manual

Cisco MDS 9250i Multiservice Fabric Switch Overview. Introduction CHAPTER

Managing the Servers

Chapter 3: Computer Assembly

Troubleshooting. General System Checks. Troubleshooting Components. Send documentation comments to

CDE460/470 IPMI Firmware Upgrade. Reviewers. Modification History

Managing Rack-Mount Servers

UCSM Fault Management

Server Utilities. Enabling Or Disabling Smart Access USB. This chapter includes the following sections:

MITAC Desktop Board PD10TI Product Guide

MITAC Desktop Board PD12TI Product Guide

Power Management in Cisco UCS

Messages and Recovery Procedures

BIOS Update Release Notes

Viewing Faults and Logs

High Availability and Redundant Operation

Anybus CompactCom 40 Diagnostic Events for EtherNet/IP

Managing the Capability Catalog in Cisco UCS Manager

Troubleshooting CHAPTER

Chapter 3: System Configuration

Manage the Capability Catalog in Cisco UCS Manager

ThinkSystem SR650 Messages and Codes Reference

Dell Server Management Pack Suite Version 6.1 for Microsoft System Center Operations Manager User's Guide

TelePresence MCU Hardware Troubleshoot Guide

Dell PowerVault MD3060e Storage Enclosure Owner's Manual

Monitoring Notifications

Overview. About the Cisco UCS S3260 System

MAINTENANCE MANUAL. EDACS REDUNDANT POWER SUPPLY SYSTEM 350A1441P1 and P2 POWER MODULE CHASSIS 350A1441P3, P4, AND P5 POWER MODULES TABLE OF CONTENTS

Cisco MCS-78XX Boot Error Codes

Industrial Network UPS Firmware Release Notes

Troubleshooting Server Hardware or Software. Issue. Troubleshooting Operating System and Drivers Installation

Cisco CDE465 DIMM Reseating and Replacement Procedure. Modification History

Send documentation feedback to

Managing Modules. About Modules. Send documentation comments to CHAPTER

Overview of Cisco 5520 Wireless Controller

Dell EMC Server Management Pack Suite Version 7.0 for Microsoft System Center Operations Manager. User's Guide

Dell EMC Unity Family

Contents. Introduction. Prerequisites. Overview. Requirements. Components Used

A33606-PCI-01 SAS-2 Expander User Manual. Version: Document Number:

BIOS Update Release Notes

BIOS. Chapter The McGraw-Hill Companies, Inc. All rights reserved. Mike Meyers CompTIA A+ Guide to Managing and Troubleshooting PCs

Intel /100Mbps Ethernet Controller 32bit PCI Slot x2. ATI Rage XL Video Chip with 4MB Video RAM onboard 64bit PCI Slot x4

Configuring the Embedded Event Manager

Vess A2000 Series SR1.1 Release Notes

January 28 29, 2014 San Jose. Engineering Workshop

CDE250 IPMI Firmware v3.12 Upgrade. Modification History

BIOS Update Release Notes PRODUCTS: DQ67SW, DQ67OW, DQ67EP (Standard BIOS)

Managing the Server. You must log in with user or admin privileges to perform this task. Command or Action

Release Notes for Cisco UCS Platform Emulator, Release 2.1(1aPE3)

A+ Guide to Hardware: Managing, Maintaining, and Troubleshooting, 5e. Chapter 4 Supporting Processors

iscsi sub-system event log

Question: 1 You have a Cisco UCS cluster and you must recover a lost admin password. In which order must you power cycle the fabric interconnects?

PEM Faults and Blower Failures

Cisco Exam Questions & Answers

BIOS Update Release Notes

BIOS Update Release Notes

Troubleshooting Initial Startup Problems

Dell Client Management Pack Version 6.0 for Microsoft System Center Operations Manager User s Guide

DOCUMENTATION FOR PC EMERGENCY CONTROL DEVICE USB WATCHDOG TIMER

Configuring Platform Event Filters

IS 258 PC Maintenance. Lecture 7: Installing, Upgrading and Troubleshooting Processor Instructor: Henry Kalisti

Integrate APC Smart UPS

Intel Entry Storage System SS4000-E

Embedded Event Manager System Events and Configuration Examples

Troubleshooting. Resetting the System. Problems Following Initial System Installation. First Steps Checklist CHAPTER

Troubleshooting. Diagnosing Problems

Managing Power in Cisco UCS

MasterFlux 48V Cascade BLDC Motor Controller Product Specification 030F0137

Automated Out-of-Band management with Ansible and Redfish

Installation and Troubleshooting Guide

Innodisk ismart Windows 4.0.X

BIOS Update Release Notes

BIOS Update Release Notes

BIOS Update Release Notes

System CMOS/BIOS Configuration PC Diagnostics

Transcription:

This chapter contains the following sections: fltadapterunitmissing, page 2 fltcomputeboardcmosvoltagethresholdcritical, page 2 fltcomputeboardcmosvoltagethresholdnonrecoverable, page 3 fltcomputeboardmotherboardvoltagelowerthresholdcritical, page 4 fltcomputeboardmotherboardvoltagethresholdlowernonrecoverable, page 4 fltcomputeboardmotherboardvoltagethresholduppernonrecoverable, page 5 fltcomputeboardmotherboardvoltageupperthresholdcritical, page 6 fltcomputeboardpowererror, page 6 fltcomputeboardpowerfail, page 7 fltcomputeboardpowerusageproblem, page 8 fltcomputeboardthermalproblem, page 8 fltcomputeiohubthermalnoncritical, page 9 fltcomputeiohubthermalthresholdcritical, page 10 fltcomputeiohubthermalthresholdnonrecoverable, page 10 fltcomputephysicalbiosposttimeout, page 11 fltcomputephysicalpostfailure, page 12 fltcomputephysicalunidentified, page 12 fltequipmenttpmtpmmismatch, page 13 fltmgmtifmissing, page 14 fltpowerbudgetpowerbudgetbmcproblem, page 14 fltpowerbudgetpowerbudgetcmcproblem, page 15 1

fltadapterunitmissing fltadapterunitmissing F0203 [sensor_name]:[id] missing: reseat or replace [id]. This fault occurs when the adapter is missing in the adapter slot, or when the endpoint cannot detect or communicate with the adapter. 1 Make sure the adapter is inserted properly in the adapter slot. 2 Check whether the adapter is connected, configured, and running the recommended firmware version. Severity: warning Cause: equipment-missing mibfaultcode: 203 mibfaultname:fltadapterunitmissing moclass: compute:adapter Type: equipment fltcomputeboardcmosvoltagethresholdcritical F0424 Battery voltage level is upper critical: Replace battery. This fault occurs when the CMOS battery voltage drops lower than the normal operating range. The low battery voltage might affect the clock and other CMOS settings. 2

fltcomputeboardcmosvoltagethresholdnonrecoverable If you see this fault, replace the CMOS battery. Before replacing this component, see the server-specific Installation and Service Guide for prerequisites, safety recommendations, and warnings. Severity: critical Cause: voltage-problem mibfaultcode: 424 mibfaultname: fltcomputeboardcmosvoltagethresholdcritical fltcomputeboardcmosvoltagethresholdnonrecoverable F0425 Battery voltage level is upper non-recoverable: Replace battery. This fault indicates that the CMOS battery voltage has dropped and is unlikely to recover. The low voltage impacts the clock and other CMOS settings. If you see this fault, replace the CMOS battery. Before replacing this component, see the server-specific Installation and Service Guide for prerequisites, safety recommendations and warnings. Severity: major Cause: voltage-problem mibfaultcode: 425 mibfaultname: fltcomputeboardcmosvoltagethresholdnonrecoverable 3

fltcomputeboardmotherboardvoltagelowerthresholdcritical fltcomputeboardmotherboardvoltagelowerthresholdcritical F0921 You see one of the following messages when this fault is raised: Stand-by voltage ([Val] V) to the motherboard is lower critical: Check the power supply. Auxiliary voltage ([Val] V) to the motherboard is lower critical: Check the power supply. Motherboard voltage ([Val] V) is lower critical: Check the power supply. This fault indicates that one or more motherboard input voltages have crossed lower critical thresholds. 1 Reseat or replace the power supply. Before replacing this component, see the server-specific Installation and Service Guide for prerequisites, safety recommendations and warnings. 2 If the issue persists, create a tech-support file and contact TAC. Severity: major Cause: voltage-problem mibfaultcode: 921 mibfaultname: fltcomputeboardmotherboardvoltagelowerthresholdcritical moclass: compute: Board Type:environmental fltcomputeboardmotherboardvoltagethresholdlowernonrecoverab F0919 You see one of the following messages when this fault is raised: 4

fltcomputeboardmotherboardvoltagethresholduppernonrecoverable Stand-by voltage ([Val] V) to the motherboard is lower non-recoverable: Check the power supply. Auxiliary voltage ([Val] V) to the motherboard is lower non-recoverable: Check the power supply. Motherboard voltage ([Val] V) is lower non-recoverable: Check the power supply. This fault indicates that one or more motherboard input voltages has dropped too low and is unlikely to recover. If you see this fault, create a tech-support file and contact Cisco TAC. Severity: critical Cause: voltage-problem mibfaultcode: 919 mibfaultname: fltcomputeboardmotherboardvoltagethresholdlowernonrecoverable moclass:compute: Board fltcomputeboardmotherboardvoltagethresholduppernonrecover F0918 You see one of the following messages when this fault is raised: Stand-by voltage ([Val] V) to the motherboard is upper non-recoverable: Check the power supply. Motherboard voltage ([Val] V) is upper non-recoverable: Check the power supply. Auxiliary voltage ([Val] V) to the motherboard is upper non-recoverable: Check the power supply. This fault indicates that one or more motherboard input voltages are high and are unlikely to recover. If you see this fault, create a tech-support file and contact Cisco TAC. Severity: critical Cause: voltage-problem 5

fltcomputeboardmotherboardvoltageupperthresholdcritical mibfaultcode: 918 mibfaultname: fltcomputeboardmotherboardvoltagethresholduppernonrecoverable fltcomputeboardmotherboardvoltageupperthresholdcritical F0920 You see one of the following messages when this fault is raised: Stand-by voltage (xv) to the motherboard is upper critical: Check the power supply. Auxiliary voltage (xv) to the motherboard is upper critical: Check the power supply. Motherboard voltage (xv) is upper critical: Check the power supply. This fault indicates that one or more motherboard input voltages have exceeded upper critical thresholds. 1 Reseat or replace the power supply. 2 If the issue persists, create a tech-support file and contact Cisco TAC. Severity: major Cause: voltage-problem mibfaultcode: 920 mibfaultname: fltcomputeboardmotherboardvoltageupperthresholdcritical moclass: compute: Board fltcomputeboardpowererror F0310 6

fltcomputeboardpowerfail P[Id]V[Id]_AU[Id]_PWRGD: Voltage rail Power Good dropped due to PSU or HW failure, please contact CISCO TAC for assistance. This fault indicates that the server power sensors have detected a problem. 1 Reseat or replace the power supply. Before replacing this component, see the server-specific Installation and Service Guide for prerequisites, safety recommendations, and warnings. 2 If the recommended action did not resolve the issue, create a tech-support file and contact Cisco TAC. Severity: major Cause: power-problem mibfaultcode: 310 mibfaultname: fltcomputeboardpowererror fltcomputeboardpowerfail F0868 The server failed to power on: Check Power Supply. This fault indicates that the power sensors on the server have detected a problem. If you see this fault, create a tech-support file and contact Cisco TAC. Severity: critical Cause: power-problem 7

fltcomputeboardpowerusageproblem mibfaultcode: 868 mibfaultname: fltcomputeboardpowerfail fltcomputeboardpowerusageproblem F1040 You see one of the following messages when this fault is raised: Motherboard Power usage is upper critical: Check hardware. Motherboard Power usage is upper non-recoverable: Check hardware. This fault occurs when the motherboard power consumption exceeds a certain threshold limit. If you see this fault, create a tech-support file and contact Cisco TAC. Severity: warning Cause: power-problem mibfaultcode: 1040 mibfaultname: fltcomputeboardpowerusageproblem fltcomputeboardthermalproblem F0869 Motherboard chipset inoperable due to high temperature. 8

fltcomputeiohubthermalnoncritical This fault indicates that the motherboard thermal sensors on the server have detected a problem. 1 Verify that the server fans are working properly. 2 Wait for 24 hours to see if the problem resolves itself. 3 If the problem still persists, create a tech-support file and contact Cisco TAC. Severity: major Cause: thermal-problem mibfaultcode: 869 mibfaultname: fltcomputeboardthermalproblem fltcomputeiohubthermalnoncritical F0538 [sensor_name]: Motherboard chipset temperature is upper non-critical. This fault indicates that the I/O controller temperature is outside the upper or lower non-critical threshold. 1 Monitor other environmental events related to this server and make sure that the temperature is within the recommended range. 2 If the problem still persists, create a tech-support file and contact Cisco TAC. Severity: minor Cause: thermal-problem mibfaultcode: 538 9

fltcomputeiohubthermalthresholdcritical mibfaultname: fltcomputeiohubthermalnoncritical moclass: compute:iohub fltcomputeiohubthermalthresholdcritical F0539 [sensor_name]: Motherboard chipset temperature is upper critical. This fault occurs when the I/O controller temperature is outside the upper or lower critical threshold. 1 Monitor other environmental events related to the server and make sure that the temperature is within the recommended range. 2 Consider turning off the server for a while if possible. 3 If the problem still persists, create a tech-support file and contact Cisco TAC. Severity: major Cause: thermal-problem mibfaultcode: 539 mibfaultname: fltcomputeiohubthermalthresholdcritical moclass: compute:iohub fltcomputeiohubthermalthresholdnonrecoverable F0540 [sensor_name]: Motherboard chipset temperature is upper non-recoverable. 10

fltcomputephysicalbiosposttimeout This fault indicates that the I/O controller temperature is outside the recoverable range of operation. 1 Shut down the server immediately. 2 Create a tech-support file and contact Cisco TAC. Severity: critical Cause: thermal-problem mibfaultcode: 540 mibfaultname: fltcomputeiohubthermalthresholdnonrecoverable moclass: compute:iohub fltcomputephysicalbiosposttimeout F0313 BIOS POST Timeout occurred: Contact Cisco TAC. This fault indicates that the server did not complete the BIOS POST. 1 Connect to the CIMC Web UI and launch the KVM console to monitor the BIOS POST completion. 2 If the problem still persists, create a tech-support file and contact Cisco TAC. Severity: critical Cause: equipment-inoperable mibfaultcode: 313 mibfaultname: fltcomputephysicalbiosposttimeout 11

fltcomputephysicalpostfailure moclass: compute:physical Type: equipment fltcomputephysicalpostfailure F0517 [sensor_name]: BIOS POST Failed: Check hardware. This fault indicates that the server has encountered a diagnostic failure or an error during POST. 1 Check the POST result for the server. 2 Reboot the server. 3 If the problem still persists, create a tech-support file and contact Cisco TAC. Severity: critical Cause: equipment-problem mibfaultcode: 517 mibfaultname: fltcomputephysicalpostfailure moclass: compute:physical Type: server fltcomputephysicalunidentified F0320 [sensor_name]: server [id] Chassis Intrusion detected: Please secure the server chassis. This fault indicates that the server chassis or cover is open. 12

fltequipmenttpmtpmmismatch Make sure that the server chassis/cover is in place. Severity: warning Cause: equipment-problem mibfaultcode: 320 mibfaultname: fltcomputephysicalunidentified moclass: equipment: Chassis Type: equipment fltequipmenttpmtpmmismatch F1783 PM_FAULT_STATUS: Check TPM, either wrong TPM revision installed for CPU type or previously installed TPM has been removed. This fault indicates that a wrong TPM has been installed or a previously installed TPM has been removed. 1 If an incorrect revision of the TPM has been installed, remove the TPM. 2 Install the correct revision of the TPM. 3 If the problem still persists, create a tech-support file and contact Cisco TAC. Severity: warning Cause: equipment-inoperable mibfaultcode: 1783 mibfaultname: fltequipmenttpmtpmmismatch Type: equipment 13

fltmgmtifmissing fltmgmtifmissing F0717 Link Down : <Interface> Check the network cable connection Here <Interface> can be one of the following: DEDICATED_MODE_<port> LOM_ACTIVE_STANDBY_<port> LOM_ACTIVE_ACTIVE_<port> CISCO_CARD_ACTIVE_STANDBY_<port> CISCO_CARD_ACTIVE_ACTIVE_<port> LOM10G_ACTIVE_STANDBY_<port> LOM10G_ACTIVE_ACTIVE_<port> LOM_EXT_MODE_<port> This fault indicates that the corresponding interface cable is not connected. 1 Check whether the interface cable is connected properly. 2 If the problem persists, create a tech-support file and contact Cisco TAC. Severity: info Cause: link-missing mibfaultcode: 717 mibfaultname: fltmgmtifmissing fltpowerbudgetpowerbudgetbmcproblem F0637 14

fltpowerbudgetpowerbudgetcmcproblem Power capping failed: System shutdown is initiated by Node Manager. This fault indicates that the assigned power-cap value is not maintained. If the power-cap fail exception action is set as shutdown, then the host shut down is initiated. If you see this fault, take the following action: 1 Disable the corresponding power profile in the Power Cap Configuration page and power on the host. 2 Increase the power-cap value in the Power Cap profile page for which the shutdown action is configured. 3 If the assigned power-cap value needs to be maintained (irrespective of the host performance impact), reduce the load on the host. 4 If the problem still persists, create a tech-support file and contact Cisco TAC. Severity: major Cause: power-cap-fail mibfaultcode: 637 mibfaultname: fltpowerbudgetpowerbudgetbmcproblem fltpowerbudgetpowerbudgetcmcproblem F0635 Power capping correction time exceeded: Please set an appropriate power limit. This fault indicates that the assigned power-cap value is not attainable for the correction time set. 1 Increase the power-cap value and the power limiting correction time in the corresponding power-profile settings. 15

fltpowerbudgetpowerbudgetcmcproblem 2 If the problem still persists, create a tech-support file and contact Cisco TAC. Severity: major Cause: power-cap-fail mibfaultcode: 635 mibfaultname: fltpowerbudgetpowerbudgetcmcproblem 16