UCS C Series TAC Time. Andreas Nikas Technical Leader


Agenda
- Introduction
- Working with TAC
- Web Resources
- New Hardware
- New Software
- Troubleshooting Top Issues

Introduction
- Ken Krzyzewski, Technical Leader, Data Center
- Carlos Lopez, Technical Leader, Data Center
- Jose Martinez, Technical Leader, Data Center
- Andreas Nikas, Technical Leader, Data Center
- Patrick Reardon, C Series SME

Working with TAC

"A problem clearly stated is a problem half solved." Charles F. Kettering, Head of Research, GM

Information Gathering
What to collect for the Service Request:
- Detailed problem description
- When did it start happening?
- Is it reproducible?
- Logs: tech-support, OS logs, sosreports, mcelogs, LSI MegaCLI logs
- System info: software version (CIMC, BIOS, OS); hardware (server model, adapter types and version [enic/fnic])
- Additional info: screenshots, topology, etc.

Analysis / Corrective Actions
- What corrective actions have been taken, and what was the outcome?
- Has the system been rebooted?
- Has the system been upgraded?
- Check system supportability using the UCS HW and SW Interoperability Tool
- Check the UCS C-Series Field Notices for possible matches
- Check the Release Notes for known open caveats

Web Resources

Web Resources - Technical
- UCS C Series Field Notices: http://www.cisco.com/en/us/products/ps10493/prod_field_notices_list.html
- UCS C Series Release Notes: http://www.cisco.com/en/us/products/ps10739/prod_release_notes_list.html
- Compatibility Matrix: http://www.cisco.com/en/us/products/ps10477/prod_technical_reference_list.html
- Storage Compatibility Matrix: http://www.cisco.com/en/us/docs/switches/datacenter/mds9000/interoperability/matrix/Matrix8.html
- Support Community: https://supportforums.cisco.com/index.jspa

Web Resources - Tools
- Service Request: http://www.cisco.com/cisco/web/support/index.html#~shp_contact
- Bug Search: https://tools.cisco.com/bugsearch/?referring_site=shp
- Online Web Returns (POWR Tool): http://www.cisco.com/web/ordering/cs_info/or3/o32/return_a_product/webreturns/product_online_web_returns.html
- RMA: http://www.cisco.com/en/us/docs/rma/3582.html
- Cisco Notification Service: http://www.cisco.com/cisco/support/notifications.html

Partner Resources
- Data Center Partner Portal: https://www.myciscocommunity.com/community/partner/datacenter
- Partner Help Desk: (800)-GO-CISCO

New Hardware

UCS C3160 Dense Rack Server
- Single server
- Dual CPU socket per server
- Up to 4GB RAID cache
- Enterprise storage features
- Up to 256GB memory
- 8 DIMMs per socket
- Dual Modular LOM (mLOM)
- Multiple connectivity options
- Up to 62 drive bays: 60 LFF, plus 2 SFF
- Optional bezel

- HDD: 4 rows of hot-swappable 4TB/6TB HDDs; total top load: 56 drives
- Server Node: 2x E5-2600 V2 CPUs; 128/256GB RAM; 1GB/4GB RAID cache
- Fan: 8 hot-pluggable fans
- Two 120GB SSDs: OS/boot
- Optional Disk Expansion: 4x hot-swappable, rear-load LFF 4TB/6TB HDD
- System I/O Controller (SIOC): Cisco mLOM slot
- Power Supply: 4 hot-pluggable PSUs

What / Specifications / Quantity Required

- Base Chassis: UCS C3160 Base Chassis, 2x 120GB SSDs, 4 PSU, 1 Rail Kit
  Required: min of 1 chassis
- Server Node: 4 workload-specific configured nodes available
    2x E5-2620 V2 / 128 GB / 1GB RAID Cache
    2x E5-2620 V2 / 256 GB / 4GB RAID Cache
    2x E5-2660 V2 / 256 GB / 4GB RAID Cache
    2x E5-2695 V2 / 256 GB / 4GB RAID Cache
  Required: must choose 1 server node
- Drives: main drive bay filled by rows of 14. Capacity: 224TB @ FCS, 336TB post-FCS
    14x 4TB 7200RPM LFF
    28x 4TB 7200RPM LFF
    42x 4TB 7200RPM LFF
    56x 4TB 7200RPM LFF
  Required: must choose at least 1 row of disks. Note: post-FCS, 6TB rows will be available
- SIOC: System I/O Controller with Cisco mLOM
    Cisco VIC 1227 10GbE SFP+ (Dual)
    Intel i350 mLOM NIC (Quad 1GbE)
  Required: min 1 / max 2 SIOC; 1 NIC per SIOC
- Optional Disk Expansion Node: 16TB @ FCS / 24TB post-FCS
    Tray + 4x 4TB 7200 LFF @ FCS, 4x 6TB post-FCS
  Required: optional
- Physical Dimensions: 4U height / 31.8 inch depth
- Stand-alone Management: Cisco Integrated Management Controller (CIMC)

Note: All LFF drives on the UCS C3160 must be of the same capacity/type.
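The capacity figures above follow from simple arithmetic on rows of 14 drives. The sketch below reproduces them; the function and constant names are illustrative only and are not part of any Cisco tool, and the numbers are raw (pre-RAID) capacity.

```python
# Raw-capacity arithmetic for the C3160 figures quoted on the slide.
# All names here are hypothetical helpers, not a Cisco API.
DRIVES_PER_ROW = 14      # main drive bay is filled by rows of 14
MAX_ROWS = 4             # 4 rows = 56 top-load drives
EXPANSION_DRIVES = 4     # optional rear-load disk expansion tray


def main_bay_capacity_tb(rows: int, drive_tb: int) -> int:
    """Raw capacity of the main drive bay for a given number of rows."""
    if not 1 <= rows <= MAX_ROWS:
        raise ValueError(f"rows must be between 1 and {MAX_ROWS}")
    return rows * DRIVES_PER_ROW * drive_tb


def expansion_capacity_tb(drive_tb: int) -> int:
    """Raw capacity of the optional disk expansion node."""
    return EXPANSION_DRIVES * drive_tb


for drive_tb in (4, 6):
    print(
        f"{drive_tb}TB drives: main bay {main_bay_capacity_tb(MAX_ROWS, drive_tb)}TB, "
        f"expansion {expansion_capacity_tb(drive_tb)}TB"
    )
```

With 4TB drives this gives the 224TB and 16TB figures quoted at FCS; with 6TB drives it gives the 336TB and 24TB post-FCS figures.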

Cisco IMC 2.0 Introduction
Cisco IMC firmware release naming:
- CIMC is now Cisco IMC (in adherence with Cisco branding rules)
- 1.X moves to 2.X (this will be the 2.0 release)
- Release will be branded Cisco IMC 2.0
Supported platforms:
- C-Series 22 M3 & C-Series 24 M3
- C-Series 220 M3 & C-Series 240 M3
- C-Series M4 platforms (future)
- C-Series 3160

Cisco IMC 2.0 New Themes & Features
Storage Configuration:
- Local storage management using XML API
- Advanced RAID configuration options via WebUI, CLI and XML API
Networking:
- Cisco IMC IPv6 support
- Dynamic DNS
- Issue ping from WebUI
Monitoring:
- SNMP Phase4 (+ storage changes)
- Syslog enhancements (port)
- DIMM Blacklisting: Phase2
- Fault Engine History
- Platform Event Traps (PET) removed, PEF updated

Cisco IMC 2.0 New Themes & Features
Security:
- BIOS Signing (signed update checking)
- Secure CIMC Support (signed update checking)
- UEFI Secure Boot (Windows 2012)
- PSIRT fixes: password hashing using standard algorithms (SHA512)
General:
- Precision boot order control: CLI, Web UI, XML API support
- Import/Export enhancements
- Display FW version of PCIe adapters
- Standardize BIOS tokens
- Local user enhancements
- KVM enhancements (power controls, last boot capture, digital video recorder capability, chat capability)
- SCU: ESXi OS install and SAN-based ESXi install

RAID Controller
CSCuh86924: ESXi PSOD PF exception 14, LSI RAID Controller 9266-8i
- The problem is due to a marginal voltage level on an internal voltage rail.
- Primarily impacts ESXi 4, ESXi 5 and Red Hat 6.4.
- The MegaCLI lsi-fwterm.log will contain one or more of the following messages:
    "Pmu Msg Fault!!! faultcode 00002651"
    "Pmu Msg Fault!!! faultcode 00002656"
    "Pmu Msg Fault!!! faultcode 0000620B"
    "Controller encountered a fatal error and was reset" (multiple instances)
- Manufacturing New RMA
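A quick way to check a collected lsi-fwterm.log for these signatures is to scan it line by line. The following is a minimal sketch: the function name and the sample log lines are hypothetical, and only the message strings themselves come from the slide.

```python
import re

# Message signatures quoted on the slide for CSCuh86924.
FAULT_PATTERNS = [
    re.compile(r"Pmu Msg Fault!!! faultcode (00002651|00002656|0000620B)"),
    re.compile(r"Controller encountered a fatal error and was reset"),
]


def find_fault_signatures(log_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) for every line matching a known signature."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        if any(p.search(line) for p in FAULT_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits


# Hypothetical log excerpt, for illustration only.
SAMPLE_LOG = """\
T0: boot complete
T1: Pmu Msg Fault!!! faultcode 00002651
T2: Controller encountered a fatal error and was reset
T3: background init started
"""

for lineno, line in find_fault_signatures(SAMPLE_LOG):
    print(f"line {lineno}: {line}")
```

To run it against a real file, read the log with `open(path, errors="replace").read()` and pass the text to `find_fault_signatures`. A match is a hint to compare against the bug details, not a confirmed diagnosis.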

HDD
CSCul25263: Seagate 146/300/500GB/1TB/2TB HDDs may stop responding
- Look for messages in the SEL logs and/or the OBFL logs. The RAID controller may show a physical or virtual drive as degraded or failed.
- If you see any of the messages in the example below, you may be hitting this issue:
    BMC:storage:-: SLOT-5: VD 01/1 is now DEGRADED
    "HDD_07_STATUS: Drive Slot sensor, Drive Fault was asserted"
- To verify this issue, power cycle the HDD or the server. If the issue clears, you have hit this issue, and the workaround is only temporary.
- Next, in the collected tech support, find the following file: /mnt/jffs2/storage-data

HDD - Continued
CSCul25263: Seagate 146/300/500GB/1TB/2TB HDDs may stop responding
The following is an excerpt from the storage-data file. The drive firmware needs to be upgraded. The key information is highlighted.

    %controller "SLOT-4"
    %type "RAID"
    %physical-drive "4"
    %group inquiry-data
    +has-error: No
    +vendor: SEAGATE
    +product-id:st9300653ss
    +product-revision-level: 0002 <<======= Need Version 4 or >

The following Seagate hard drives are affected:
- Seagate 146GB SAS 15K Hard Drive - ST9146853SS - Firmware version 0002
- Seagate 300GB SAS 15K Hard Drive - ST9300653SS - Firmware version 0002
- Seagate 500GB SATA 7.2K Hard Drive - ST500NM0011 - Firmware version 0001
- Seagate 1TB SAS 7.2K Hard Drive - ST1000NM0001 - Firmware version 0001
- Seagate 2TB SAS 7.2K Hard Drive - ST2000NM0001 - Firmware version 0001
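Checking many drives by eye is error-prone, so the excerpt above can be parsed and compared against the affected list. This is a sketch under stated assumptions: it assumes the whole storage-data file follows the `%physical-drive` / `+key: value` layout shown in the excerpt, which may not hold for every release, and the function names are hypothetical. It flags only the exact model and firmware pairs listed on the slide.

```python
# Affected model -> affected firmware version, as listed on the slide.
AFFECTED = {
    "ST9146853SS": "0002",
    "ST9300653SS": "0002",
    "ST500NM0011": "0001",
    "ST1000NM0001": "0001",
    "ST2000NM0001": "0001",
}


def parse_drives(text: str) -> list[dict]:
    """Collect the '+key: value' fields that follow each %physical-drive line."""
    drives, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("%physical-drive"):
            parts = line.split('"')
            slot = parts[1] if len(parts) > 1 else line.split()[-1]
            current = {"slot": slot}
            drives.append(current)
        elif line.startswith("+") and current is not None and ":" in line:
            key, _, value = line[1:].partition(":")
            current[key.strip()] = value.strip()
    return drives


def affected_drives(text: str) -> list[tuple[str, str, str]]:
    """Return (slot, model, firmware) for drives matching the affected list."""
    result = []
    for d in parse_drives(text):
        model = d.get("product-id", "").upper()
        fw = d.get("product-revision-level", "")
        if d.get("vendor", "").upper() == "SEAGATE" and AFFECTED.get(model) == fw:
            result.append((d["slot"], model, fw))
    return result


SAMPLE = """\
%controller "SLOT-4"
%type "RAID"
%physical-drive "4"
%group inquiry-data
+has-error: No
+vendor: SEAGATE
+product-id:st9300653ss
+product-revision-level: 0002
"""

print(affected_drives(SAMPLE))
```

Matching on exact model and firmware pairs keeps the check conservative; a drive already on newer firmware is not reported.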

Troubleshooting
What if you still have HDD or RAID controller issues?
For HDD issues:
- Try reversing the LSI cable ends and check connectivity (polarity sensitive)
- Check /mnt/jffs2/fw_update, and if one or more components are not up to date, use update_all
- Check /mnt/jffs2/storage-data for media errors and failures
For RAID controller issues:
- Check the controller firmware and BIOS. If they are not up to date, check whether HUU detects the controller and try to upgrade.
- Collect MegaCLI logs. In the latest code you can download the ttylogs from the same place as a show tech. In 2.0, some of the logs are included in the show tech itself.

Thank you.

Backup

Cisco IMC 2.0 Features
DIMM Blacklisting:
- Gets ECC counts from the IPMI ECC sensor
- Makes blacklisting decisions based on the IPMI sensor reading
- Blacklists a DIMM when an uncorrectable ECC error is encountered
- Maps out the blacklisted DIMM on the subsequent reboot of the host by communicating the decision to the BIOS
- Map-out means the DIMM is excluded from the host memory configuration
- Maintains ECC counts across reboots in the blacklisting database
- Stores the blacklisting database on the DIMM SPD
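The blacklisting flow described above can be pictured with a toy model. This is purely illustrative and is not how Cisco IMC is implemented: the class, its methods, and the DIMM names are hypothetical, and the real database lives on the DIMM SPD rather than in a Python object.

```python
from dataclasses import dataclass, field


@dataclass
class DimmBlacklist:
    """Toy model: ECC counts persist, blacklisting takes effect at next reboot."""
    ecc_counts: dict = field(default_factory=dict)   # kept across reboots
    blacklisted: set = field(default_factory=set)    # decided at error time
    mapped_out: set = field(default_factory=set)     # applied at reboot

    def record_ecc(self, dimm: str, correctable: int = 0, uncorrectable: int = 0) -> None:
        counts = self.ecc_counts.setdefault(dimm, {"correctable": 0, "uncorrectable": 0})
        counts["correctable"] += correctable
        counts["uncorrectable"] += uncorrectable
        # An uncorrectable ECC error blacklists the DIMM.
        if uncorrectable > 0:
            self.blacklisted.add(dimm)

    def reboot(self, installed: list[str]) -> list[str]:
        """Return the DIMMs the host would see after the next reboot."""
        self.mapped_out = set(self.blacklisted)
        return [d for d in installed if d not in self.mapped_out]


bl = DimmBlacklist()
bl.record_ecc("DIMM_A1", correctable=3)
bl.record_ecc("DIMM_B2", uncorrectable=1)
print(bl.reboot(["DIMM_A1", "DIMM_B2"]))
```

The point of the model is the timing: the decision is made when the error is seen, but the DIMM only disappears from the host memory configuration at the following reboot.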

Cisco IMC 2.0 Features Continued
HUU Low Level Firmware Update:
- Until now, low-level components such as FPGA, CPLD, etc. could only be updated via the CIMC CLI, and the update had to be done manually.
- With the Eagle Peak release, the low-level firmware is updated along with the HUU update.
- Update All: no questions are asked, and any low-level firmware that requires an update is updated automatically when CIMC is activated after the HUU update.
- Update CIMC: if CIMC is selected together with any other component in HUU, the user is prompted for the low-level firmware update. If the user selects yes, the components are updated on CIMC activation.

Cisco IMC 2.0 Features Continued
vKVM/vMedia Enhancements:
- KVM and vMedia reconnect
- Status bar
- Host power control
- DVR and video player
- Exporting recorded video
- Server side (BMC) video capture
- Video scaling
- Mini-mode
- Chat

Troubleshooting Backup

Replacing a Failed Disk
Do NOT replace a disk while the server is shut down! Doing so may result in an inactive or offline disk. Replace the disk after the OS, or at the very least the RAID controller option ROM, has loaded.
A disk becomes inactive when conflicting COD (configuration on disk) partitions are detected on multiple disks. Because the controller does not know which one is correct, it disables (makes inactive) one of the configurations. This should only happen at boot time, if the new disk was added while the server was switched off.
If the new disk is added when the option ROM is already loaded (or the OS is booted), the controller knows which of the disks is foreign (new) and automatically starts a rebuild onto it, if required.

LSI MegaRAID GUI Install on Windows
- Unzip and go to the Disk1 directory
- Run Setup
- You will need to install the MS C++ components

LSI MegaRAID Physical View
- Start MegaRAID from the Start menu
- You get this initial screen

LSI MegaRaid Physical View

LSI MegaRAID GUI Install on Linux
A little bit harder than the Windows install:
- Unzip/untar the download file
- cd disk
- Run ./install.sh
- Press Y to accept the License Agreement
- Select 3 to install Standalone
- This will install Lib-Utils and Lib-Utils2
- You might see an error about snmpd; you can ignore it
- The GUI gets installed in /usr/local/megaraid Storage Manager
- Startupui.sh runs the GUI
- Works and looks just like the Windows GUI

LSI MegaRaid Physical View - Linux

LSI MegaRAID GUI Install on VMware
Similar to the Linux install:
- Unzip/untar the download file
- cd disk
- Run ./vmware_install.sh
- Press Y to accept the License Agreement
- Select the ESX version (3.5 or 4.x)
- Select N to use the inbox storage library
- This installs the server portion of the software
- You will need to load the full MegaRAID software on another host and point it to the ESX server

LSI MegaRaid VMware Remote Connect

Useful MegaRAID CLI Resources
- LSI MegaRAID SAS software user guide: http://littleloubug.cisco.com/calif/mr_sas_sw_ug_80-00156-01_rev_j.pdf
- HWRAID website: http://hwraid.le-vert.net/wiki/lsimegaraidsas
- MegaRAID cheat sheet: http://tools.rapidsoft.de/perc/perc-cheat-sheet.html
- Search for MegaRAID in your favorite search engine