
DELL EMC VXRAIL(TM) APPLIANCE OPERATIONS GUIDE

A Hyper-Converged Infrastructure Appliance from Dell EMC and VMware

PART NUMBER: H16788

ABSTRACT

This document describes how to perform day-to-day operations on a VxRail Appliance environment after the system has been installed and configured. The target audience for this document includes customers, field personnel, and partners who manage and operate a VxRail Appliance.

November 2017

Contents

VxRail overview
    Hardware
    vSphere
        VMware vCenter Server
        VMware ESXi hypervisor
        VMware virtual networking
    vSAN
        Storage policy based management (SPBM)
    VxRail Manager
    vRealize Log Insight
VxRail support
    Dell EMC support account
    Registering for online support
    Accessing Dell EMC Support from VxRail Manager
    Dell EMC Secure Remote Services (ESRS)
    Confirming ESRS connection
    Accessing Support eServices
    Opening a live chat session with support personnel
    Contacting Customer Service to request service
    VxRail community forums
    Support knowledge base
    SolVe Desktop application for VxRail series
Monitoring a VxRail Appliance using VxRail Manager
    Monitoring logical system health
    Viewing physical system health
Powering down and powering up a VxRail Appliance
    Shutting down a VxRail cluster
    Starting up VxRail
Viewing system events using VxRail Manager
VxRail Market
Managing a VxRail cluster using vCenter
    VxRail vCenter options
    VxRail Manager deployed vCenter
    Customer deployed vCenter
vSAN monitoring and management
    Running vSAN health checks
vSAN Storage Policy Based Management
    Default storage policy
    Defining a new VM storage policy
    Assigning a storage policy to a VM
    Changing a storage policy for a VM
Renaming VxRail objects within vCenter
    Renaming a VxRail datacenter
    Renaming a VM folder
    Renaming a vSAN datastore
    Renaming a VxRail vSphere Distributed Switch
    Renaming a VxRail dvportgroup
Choosing a failure tolerance method (FTM)
    Mirroring
    Erasure coding
    FTM considerations
Configuring IOPS limit for objects (QoS)
    Configuring IOPS limit for a VM
vSAN fault domains
    Configuring fault domains
VxRail deduplication and compression
    Considerations
    Enabling deduplication and compression
    Disabling deduplication and compression
    Monitoring deduplication and compression
vSAN encryption services
    Setting up vSAN encryption on a VxRail cluster
    Setting up a domain of trust
    Enabling encryption for the VxRail cluster
    Connectivity to key management server
VxRail networking
    Physical network
    VxRail VLANs
    Enabling multicast
    Virtual machine and optional data services networks
    Optional NIC configurations
    Changing network configuration
Migrating workloads onto VxRail
    Considerations
    Using vSphere vMotion
    VMware vSphere Replicator
    Dell EMC RecoverPoint for VM
    Connecting iSCSI storage to VxRail
    Connecting NFS storage to VxRail
    Using Dell EMC CloudArray
    Conclusion
Resource utilization and performance monitoring
    Monitoring system capacity, workload activity, and performance
    VxRail Manager Logical Health view
    Monitoring storage capacity using the vCenter web client
    Using native vSphere performance monitoring
    Viewing CPU and memory performance metrics
    Enabling vSAN performance data collection
    Viewing vSAN performance metrics
    Understanding vSAN performance metrics
    Advanced workload and performance analysis
    Creating a vSAN Observer Performance Statistics bundle
VxRail software upgrades
System scale-out upgrades
    Adding a node to an existing cluster
    Drive expansion procedure
Replacing failed components
    Identifying failed components
    Replacing a capacity drive
    Replacing a cache drive
    Replacing a power supply

Preface

This document describes how to perform day-to-day operations on a VxRail environment after the system has been installed and configured.

Audience

The target audience for this document includes customers, field personnel, and partners who manage and operate a VxRail Appliance.

Related resources and documentation

This document references other documentation resources available on https://support.emc.com/, vmware.com, and the SolVe Desktop application.

VxRail overview

Jointly developed by Dell EMC and VMware, the VxRail Appliance is the only fully integrated, preconfigured, and tested hyper-converged infrastructure (HCI) appliance powered by VMware vSAN technology for software-defined storage. Managed through the well-known VMware vCenter Server interface, VxRail provides a familiar vSphere experience that enables streamlined deployment and extends the use of existing IT tools and processes.

VxRail Appliances are fully loaded with integrated, mission-critical data services from Dell EMC and VMware, including compression, deduplication, replication, backup, and many more. VxRail delivers resiliency and centralized-management functionality that enables simpler management of consolidated workloads, virtual desktops, business-critical applications, and remote-office infrastructure.

As the only HCI appliance from Dell EMC and VMware, VxRail is the easiest way to stand up a fully virtualized VMware environment. VxRail is the only HCI appliance on the market that fully integrates Intel-based Dell EMC PowerEdge servers with VMware vSphere and vSAN. VxRail is jointly engineered with VMware and is supported as a single product delivered by Dell EMC. VxRail integrates seamlessly with the existing VMware ecosystem and cloud management solutions, including vRealize, NSX, and Horizon.

VxRail Manager provides lifecycle management for the VxRail software bundle, which includes:

- VxRail Manager user interface for configuration and deployment, software and hardware updates, and system monitoring and alerting
- VMware vSphere, including ESXi
- VMware vCenter Server
- VMware vSAN
- VMware vRealize Log Insight

The VxRail Appliance also includes complementary advanced capabilities, including Dell EMC CloudArray and Dell EMC RecoverPoint for Virtual Machines.

VxRail provides an entry point to the software-defined data center (SDDC) for most workloads. Customers of all sizes and types can benefit from VxRail, including small- and medium-sized environments, remote and branch offices (ROBO), and edge departments. VxRail also provides a solid infrastructure foundation for larger data centers.

For more detailed information on VxRail, go to https://www.emc.com/en-us/converged-infrastructure/vxrail.

Hardware

The VxRail Appliance consists of nodes in a rackmount chassis. Each node has compute and storage resources. For a list of available VxRail Appliance models, refer to the Dell EMC VxRail website. One or more network switches (10GbE or 1GbE, depending on the model), appropriate cables, and a workstation/laptop for the user interface are also required.

vSphere

The VMware vSphere software suite delivers virtualization in a highly available, resilient, on-demand infrastructure, making it the ideal software foundation for the VxRail Appliance. ESXi and vCenter Server are core components of vSphere. ESXi is a hypervisor installed on each physical VxRail server node in the factory; it enables a single physical server to host multiple logical servers, or virtual machines (VMs). VMware vCenter Server is the management application for ESXi hosts and VMs.

Customers can use existing eligible vSphere licenses with their VxRail, or licenses can be purchased with a VxRail Appliance. This vSphere license-independent model (also called the bring-your-own or BYO vSphere license model) allows customers to leverage the wide variety of vSphere licenses they have already purchased. Several vSphere license editions are supported with VxRail, including Enterprise+, Standard, and ROBO editions. (vSphere Enterprise is also supported, but is no longer available from VMware.) Also supported are vSphere licenses from Horizon bundles or add-ons when the appliance is dedicated to VDI. If vSphere licenses need to be purchased, they can be ordered through Dell EMC, the customer's preferred VMware channel partner, or from VMware directly. Licenses acquired through a VMware ELA, VMware partners, or Dell EMC receive single-call support from Dell EMC.

VMware vCenter Server

vCenter Server is the primary point of management for both server virtualization and vSAN storage. A single vCenter instance can scale to enterprise levels, supporting hundreds of VxRail nodes and thousands of virtual machines. See the vCenter Server Maximums documentation from VMware at https://docs.vmware.com/ for current limits.

vCenter supports a logical hierarchy of datacenters, clusters, and hosts. This hierarchy allows resources to be segregated by use case or line of business and to move dynamically as needed, all from a single intuitive interface. vCenter Server provides VM and resource services such as inventory service, task scheduling, statistics logging, alarm and event management, and VM provisioning and configuration.

vCenter Server is also responsible for advanced features, including:

- vSphere vMotion: enables live VM workload migration with zero downtime
- vSphere Distributed Resource Scheduler (DRS): continuously balances and optimizes VM compute resource allocation across nodes in the cluster

- vSphere High Availability (HA): provides virtual machine (VM) failover and restart capabilities

VMware ESXi hypervisor

In VxRail, the ESXi hypervisor deploys and services VMs on cluster nodes. VMs are secure and portable. Each VM is a complete system with processors, memory, networking, storage, and BIOS. VMs are highly available, and the underlying storage is transparent: moving a VM's virtual disk from one type of storage to another has no effect on the VM's function. VMs are isolated from one another, so if a guest operating system running on one VM fails, other VMs on the same physical host are not affected and continue to run.

VMs share access to CPUs, and ESXi is responsible for CPU scheduling. In addition, ESXi assigns VMs a region of usable memory and provides shared access to the physical network cards and disk controllers associated with the physical host. All x86-based operating systems are supported, and different virtual machines can run different operating systems and applications on the same physical hardware.

VMware virtual networking

vSphere's virtual networking capabilities are part of the ESXi server and are managed by vCenter. Virtual networks can be built on a single ESXi host or across multiple ESXi hosts. VxRail VMs communicate with each other using the VMware Virtual Distributed Switch (VDS), which functions as a single logical switch that spans multiple nodes in the same cluster. VDS uses standard network protocols and VLAN implementations, and it forwards frames at the data-link layer. VDS is configured in vCenter Server at the datacenter level, maintaining a consistent network configuration as VMs migrate across multiple hosts. The VxRail Appliance relies on VDS for appliance traffic, and vSAN relies on VDS for its storage-virtualization capabilities.

vSAN

VxRail Appliances leverage VMware vSAN for enterprise-class software-defined storage. vSAN aggregates the locally attached disks of hosts in a vSphere cluster to create a pool of distributed shared storage. Capacity is scaled up by adding disks to the cluster and scaled out by adding VxRail nodes. vSAN is fully integrated with vSphere and works seamlessly with other vSphere features.

vSAN is notable for its efficiency and performance. Built directly into the ESXi hypervisor at the kernel layer, it has very little impact on CPU utilization (less than 10 percent). vSAN is self-optimizing and balances allocation based on workload, utilization, and resource availability. vSAN delivers a high-performance, flash-optimized, resilient hyper-converged infrastructure suitable for a variety of workloads. Enterprise-class storage features include:

- Efficient data-reduction technology, including deduplication and compression as well as erasure coding
- QoS policies to control workload consumption based on user-defined limits

- Data-integrity and data-protection technology, including software checksum and fault domains
- Enhanced security with native vSAN data-at-rest encryption

VxRail provides two different vSAN node-storage configuration options: a hybrid configuration that uses both flash SSDs and mechanical HDDs, and an all-flash SSD configuration. The hybrid configuration uses flash SSDs for caching and mechanical HDDs for capacity and persistent data storage. The all-flash configuration uses flash SSDs for both caching and capacity.

vSAN is configured when the VxRail system is first initialized and is managed through vCenter. During the VxRail Appliance initialization process, vSAN creates a distributed shared datastore from the locally attached disks on each ESXi node. The amount of storage in the datastore is an aggregate of all the capacity drives in the cluster. The orchestrated vSAN configuration and verification performed as part of system initialization ensures consistent and predictable performance and a system configuration that follows best practices.

Storage policy based management (SPBM)

vSAN is policy driven and designed to simplify storage provisioning and management. vSAN policies define VM storage requirements for performance and availability. Policy assignments are generated based on a ruleset, and administrators can dynamically change a VM storage policy as requirements change. Examples of SPBM rules are the number of failures to tolerate and the data protection technique to use.

VxRail Manager

VxRail Manager provides monitoring and lifecycle management for the physical infrastructure and streamlines deployment and configuration. VxRail Manager also integrates Dell EMC services and support to help customers get the most value from the VxRail Appliance.

vRealize Log Insight

Bundled with VxRail, VMware vRealize Log Insight monitors system events and provides ongoing holistic notifications about the state of the virtual environment and appliance hardware. vRealize Log Insight delivers real-time automated log management for the VxRail Appliance, with log monitoring, intelligent grouping, and analytics to simplify troubleshooting at scale across VxRail physical, virtual, and cloud environments.

VxRail support

Dell ProSupport provides single-source support for all VxRail hardware and software components. Dell ProSupport provides priority access to highly trained VxRail engineers available around the clock and across the globe, so that you can maintain the highest level of productivity and minimize disruption. Available by phone, chat, or instant message, Dell ProSupport includes access to online support tools and documentation, rapid on-site parts delivery and replacement, access to new software versions, assistance with operating environment updates, and remote monitoring, diagnostics, and repair through Dell EMC Secure Remote Services (ESRS).

Dell EMC support account

A Dell EMC Support account provides direct online access to the following resources:

- Product license files and software updates
- Product documentation
- SolVe Desktop, with detailed procedures for component replacement and software upgrades
- VxRail community

Users with a Dell EMC Support account can add support credentials to VxRail Manager and then access support resources directly from VxRail Manager.

Registering for online support

1. Point your web browser to emc.com/vxrailsupport (or support.emc.com).
2. Click Register here.
3. Fill in the required information.

Dell EMC Support sends you a confirmation email, typically within 48 hours.

Accessing Dell EMC Support from VxRail Manager

The VxRail Manager Support tab displays support status information, as well as support resources and links. The screenshot below shows the operations available from the Support tab.

Figure 1. VxRail Manager Support tab

Dell EMC Secure Remote Services (ESRS)

Dell EMC Secure Remote Services (ESRS) is configured when the VxRail Appliance is set up. ESRS provides remote monitoring, diagnostics, and repair by the Dell EMC Support team. A Dell EMC support account is required before ESRS can be configured.

Confirming ESRS connection

VxRail Manager uses a heartbeat to track the last time the system communicated with the remote support service. The date and time of the last ESRS communication is displayed in the Last Heartbeat section on the Support view of VxRail Manager.

Accessing Support eServices

You can access Support eServices directly from VxRail Manager. Services include:

- Chat with Support: opens a chat session with a Dell EMC support representative.
- Open a Service Request: opens a web form where you can open a service request.
- VxRail Community: displays the most recent activity from the VxRail Series community forums.

Opening a live chat session with support personnel

Initiate a chat session directly from VxRail Manager by clicking the Chat with Support button on the VxRail Manager Support tab. A live chat session opens with a Dell EMC support representative who knows both the VxRail hardware and the full VMware software stack.

Contacting Customer Service to request service

As an alternative to a live chat session, you can submit a service request directly from the VxRail Manager Service view. The Service Request dialog asks you to select the appliance for which you want to submit a request and pre-populates the form with information about that appliance. Enter the service request information and click Submit.

VxRail community forums

The VxRail community of users provides valuable insight into how other users are obtaining the most value from the VxRail Appliance. The community includes other VxRail users and provides a forum to post questions and share experiences. This community is open and, while not moderated by Dell EMC, subject matter experts in different organizations within the company offer their perspectives. To access the community, browse to https://community.emc.com/community/products/vxrail. Alternatively, the most recent activity from the VxRail community is shown in the VxRail Community list on the VxRail Manager Support view. To join the conversation, click the title of a message or article and view the topic in a new browser tab.

Support knowledge base

The support knowledge base is often the fastest way to identify solutions to problems or get answers to questions. Access the support knowledge base from support.emc.com or from the Support view in VxRail Manager. Type the search terms in the Knowledge Base search field and press Enter.

SolVe Desktop application for VxRail series

The SolVe Desktop application provides step-by-step procedures for component replacement, software upgrades, and other system support tasks performed by the customer. Dell EMC encourages you to use the detailed procedures in SolVe Desktop to reduce risk when making changes to a VxRail environment. When the application is accessed, it verifies that it includes the most current procedures and automatically updates the content if needed.

To download the SolVe Desktop application, go to https://support.emc.com, click SolVe on the main page, and then download and install the application on your computer. You must have an online support account to use the SolVe Desktop application.

Monitoring a VxRail Appliance using VxRail Manager

VxRail Manager implements a top-down approach to monitoring. The main dashboard (shown below) displays the current overall system health at a glance.

Figure 2. VxRail Manager main dashboard

From the main dashboard, you can:

- Access real-time system health details for both logical and physical resources. Use VxRail Manager dashboards to view resource operability, performance, and utilization data, as well as information about cluster- and node-level storage, memory capacity, and consumption.
- Locate and identify critical events, errors, and warnings on any appliance in the cluster. VxRail leverages VMware vRealize Log Insight to monitor system events and provide ongoing notifications about the state of the virtual environment and appliance hardware.
- Access Support eServices directly from VxRail Manager. Services include chatting with Support, opening a service request, and connecting with the VxRail community.

Monitoring logical system health

The Logical tab of the VxRail Manager Health window displays CPU, memory, and storage usage for the entire cluster and for individual nodes. An example is shown below.

Figure 3. VxRail Manager Health > Logical screen

Procedure

1. Click HEALTH > Logical. The default view shows cluster-level utilization for storage IOPS, CPU usage, and memory. The view is color-coded so you can identify resource utilization at a glance:
   - Red: more than 85% used
   - Yellow: 75 to 85% used
   - Green: less than 75% used
2. Click a node name to view information about that node.
3. Click the components of a node to view more information about the capacity disks (HDD, SSD), cache disks (SSD), ESXi disk, or NICs.

Viewing physical system health

The Physical tab of the VxRail Manager Health window displays information about the system hardware. A graphical representation of the nodes in your cluster makes it easy to navigate to event and status information. An example of this view is shown below.

Figure 4. VxRail Manager Health > Physical screen, cluster view

Use the Health > Physical tab to view:

- Nodes in the cluster: view the status of the cluster and information about each individual node, such as ID and model number.
- Individual node components: drill down to see status and information for appliance components such as disks, compute nodes, and power supplies.

Procedure

1. In VxRail Manager, click HEALTH to view the overall health of the nodes that make up the cluster.
2. Click a node name or the picture of a node to view more information about that node. Front and back views of the appliance are displayed. An example is shown below.

Figure 5. VxRail Manager Health > Physical screen, cluster view

3. If a status icon is displayed next to an appliance, click the appliance or the magnifying glass icon to see more information.
4. Click any appliance component to view more details:
   - Click a disk in the Front View or Back View to see disk status and information.
   - Click a node in the Back View to see compute and network information.
   - Click a power supply in the Back View to see power supply status and information.
   - Click the Back View to see compute information.
   - Click a NIC in the Back View (E, S, P, and V models) to see network information.
5. If a status icon is displayed for a component, click it to view event details in the Health window.
6. Use your browser's back button to return to the appliance view on the Health > Physical tab.
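In addition to the VxRail Manager views, you can inspect the disks that vSAN has claimed on an individual node from that node's ESXi shell. This is a quick sketch rather than a VxRail-specific procedure, and it assumes shell (SSH) access to the host is enabled:

    esxcli vsan storage list

Each claimed device is listed with its vSAN UUID, its disk group, and whether it serves the cache tier or the capacity tier, which is useful for cross-checking what the Physical tab reports.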

Powering down and powering up a VxRail Appliance

The VxRail Manager Shut Down Cluster feature provides a simple, graceful shutdown for the entire cluster. Detailed procedures are available in the SolVe Desktop for specific environments.

Note: Dell EMC highly recommends that you gracefully shut down all client VMs before performing this procedure. VxRail Manager only shuts down the infrastructure VMs in the cluster; the operator is responsible for properly shutting down all client VMs.

Shutting down a VxRail cluster

1. In the VxRail Manager left panel, click CONFIG and navigate to the General view.
2. Scroll down the page until you see the Shut Down Cluster option. Click Shut Down.

Figure 6. Shut Down Cluster

3. At the confirmation prompt, click Confirm. VxRail Manager performs a pre-check procedure and displays the results (as shown below). If the pre-checks run successfully, the Shut Down button is visible.

Figure 7. Shut Down Cluster pre-checks

4. Click Shut Down.

Starting up VxRail

1. Verify that the top-of-rack (TOR) switch is powered on and connected to the network.
2. Power on each node manually by pressing its power button. Wait several minutes for all service VMs to be powered on automatically. The locator LED of Node 1 is turned off automatically when VxRail Manager starts.
3. Using vCenter, manually restart all client VMs.

Note: If the vCenter and VxRail Manager VMs are not available within 10 minutes, use the vSphere client to log in to host 1 and check the status of these VMs.
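After the nodes are powered on, you can also confirm from a node's ESXi shell that the host has rejoined the vSAN cluster before restarting client VMs. A minimal check, assuming shell (SSH) access to the host is enabled:

    esxcli vsan cluster get

The output includes the cluster UUID, the local node's role and health state, and the member count; every node in the cluster should appear as a member with a healthy state before workloads are restarted.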

Viewing system events using VxRail Manager

The EVENTS view of VxRail Manager allows you to locate and identify critical events, errors, and warnings on any appliance in the cluster. The figure below is an example of the Events view.

Figure 8. VxRail Manager Events view

The event severity levels are:

- Critical: Immediate action is required to prevent downtime or data loss. If critical events are detected, the EVENTS icon displays the number of unread events in red in the navigation bar.
- Error: An error has occurred. Address the issue as soon as possible.
- Warning: The system needs attention. An issue such as a disk space limit being reached requires attention.

Procedure

1. Click EVENTS to navigate to the VxRail Manager Events tab.
   - Sort the events list as desired by clicking a column heading. You can sort by ID number, Severity, or Time (including Date).
   - Use the arrow buttons and scroll bars to navigate through the events list.
   - Click a row to view more information about an event. New critical events are shown in red; when you click an event, the red highlight is removed.
2. To mark all critical events as read, click Mark All as Read.
3. If a physical component is listed in the Component column, click its Component ID to navigate to the Event Details view on the Health > Physical screen.
4. To download a list of events, click Export all events. An events.csv file is created and downloaded by your browser.

VxRail Market

Use the VxRail Market to download, install, and upgrade qualified software products for your appliance. Choose from a list of applications that add functionality, protection, or management to your VxRail Appliance. Software licenses may be required for these optional applications. The following figure shows an example of this dialog.

Figure 9. VxRail Market, CONFIG > Market tab

Procedure

1. In the VxRail Manager navigation pane, click CONFIG and open the Market tab.
2. Scroll through the Available Applications for this Cluster list, which shows all applications available for your VxRail Appliance. The list includes a description and version number for each application.
   - Filter the application list using the filter selector at the top of the list, if desired.
   - Click Learn more if you want to view information about an application. The application page opens in a separate browser tab.
3. In VxRail Manager, install the application on your appliance:
   - Click Install to install an application directly.
   - Click Download to navigate to an external web page where you can download and install the application.
4. Multiple instances of an application can be installed. To view and manage instances, do the following:

   - Click the arrow next to the name of the application to expand the instance listing. If there is only one instance of an application, the expand arrow is not displayed.
   - Scroll to the instance you want to manage.

The following status messages may be displayed for an application instance:

- Downloading: The application is currently downloading.
- Pending: The application is waiting for another application to finish downloading.
- IP address: The application is installed.

Managing a VxRail cluster using vCenter

The vSphere Web Client is the primary management platform for virtual machines (VMs), ESXi hosts, and vSAN storage in a VxRail environment.

VxRail vCenter options

There are two approaches to deploying vCenter in a VxRail environment. You can use an existing customer-deployed vCenter, where the VxRail is managed as another cluster, or have VxRail Manager deploy a dedicated vCenter during the VxRail initial setup.

VxRail Manager deployed vCenter

If the VxRail Appliance is a standalone environment, configuring a vCenter instance as part of system initialization is the easiest approach. When the VxRail system is deployed, vCenter Server and the Platform Services Controller are deployed as virtual machines. This vCenter is internal to the VxRail system and can only manage the VxRail cluster on which it is deployed. The VxRail software package includes the license for this vCenter instance. The license is used only when VxRail Manager deploys the instance at system initialization; it is non-transferable and cannot be used for any other purpose.

Figure 10. Each VxRail cluster managed by a VxRail Manager deployed vCenter Server

Customer deployed vCenter

Multiple VxRail and non-VxRail clusters can be managed from an existing customer-deployed vCenter instance, providing a consolidated view of the virtualization environment. The vCenter Server environment is hosted outside of the VxRail cluster, and each VxRail environment appears within vCenter Server as a cluster of hosts configured with a vSAN datastore. The vCenter Server must exist before the VxRail Appliance is deployed and requires a separate customer-provided license. Lifecycle management of the customer-supplied vCenter Server is performed outside of the VxRail environment.

A customer-deployed vCenter that is hosted outside of the VxRail environment is required when configuring VxRail in a stretched cluster configuration and for environments where vSAN encryption will be enabled. The figure below shows an example of a customer-deployed vCenter Server managing multiple VxRail and non-VxRail clusters.

Figure 11. VxRail managed by a customer deployed vCenter

The decision to use a VxRail Manager-deployed vCenter or a customer-deployed vCenter is made during system planning, prior to installation. Once the system has been deployed, changing vCenter deployment options requires the assistance of Dell EMC services.

vSAN monitoring and management

vCenter is the primary interface for monitoring and managing vSAN. From the vSphere Web Client, you can view the configuration and status of:

- Cluster hosts and storage devices
- Cluster and node capacity and consumption
- Datastore capacity
- Deduplication and compression efficiency
- Virtual disk status
- Physical device status
- vSAN cluster health

Running vSAN health checks

vSAN health checks allow you to assess the status of cluster components, diagnose issues, and troubleshoot problems. The health checks cover hardware compatibility, network configuration and operation, advanced vSAN configuration options, storage device health, and VM objects. Health checks can be configured to run at specified intervals. The figure below is an example of vSAN health check results.

Figure 12. vSAN health checks

Procedure

1. From the vSphere Web Client, navigate to the vSAN cluster.
2. Click Monitor and then click Virtual SAN.
3. Select Health to review the vSAN health check categories. If the Test Result column displays Warning or Failed, expand the category to review the results of the individual health checks.
4. Select an individual health check and review the detailed information at the bottom of the page. Click the Ask VMware button to open a knowledge base article that describes the health check and provides information about how to resolve the issue.
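The same health data can also be queried from the command line on any host in the cluster, which can help when the vSphere Web Client is unreachable. This is a minimal sketch, assuming shell (SSH) access to an ESXi host running a vSAN 6.x release that includes the health service:

    esxcli vsan health cluster list

Each health check is listed with its current status (green, yellow, or red), mirroring the categories shown in the Web Client.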

vSAN Storage Policy Based Management

vSAN Storage Policy Based Management (SPBM) streamlines storage provisioning and simplifies storage management. A default storage policy is assigned when a VM is created. The policy defines how an object is configured on the vSAN datastore and determines its performance and availability. vSAN monitors and reports on compliance with the policy. If requirements change, you can modify the assigned policy or assign a different policy; you can do this online without impacting availability.

The figure below shows the relationship between an SPBM policy and a virtual machine, and how the policy impacts the capacity, availability, and performance of a virtual machine.

Figure 13. Storage Policy Based Management

A default storage policy is configured when the system is initialized. You can use the vSphere Web Client to configure new policies based on a set of rules. The table below summarizes the vSAN policy rules.

Table 1. vSAN policy rules

Failures to tolerate (FTT)
    Defines the number of host, disk, or network failures that a storage object can tolerate. When the failure tolerance method is Mirroring, tolerating n failures requires n+1 copies of the object and 2n+1 hosts (or fault domains). When the failure tolerance method is Erasure Coding and FTT=1, RAID-5 (3+1) is used and 4 hosts (or fault domains) are required; if FTT=2, RAID-6 (4+2) is used and 6 hosts (or fault domains) are required. The default value is 1; the maximum value is 3. See the Fault Domains section for information on protecting VMs from rack failures.

Disable object checksum
    vSAN uses an end-to-end checksum to validate on read that the data is the same as what was written. If checksum verification fails, data is read from an alternate copy and data integrity is restored by overwriting the incorrect data with the correct data. Checksum calculation and error correction are performed as background operations and are transparent to the virtual machine. The default setting for all objects in the cluster is No, which means checksum is enabled. As a best practice, keep software checksum enabled. Software checksum can be disabled if an application already provides a data-integrity mechanism. Disabling checksum is an immediate operation; re-enabling checksum requires a full data copy to apply the checksum, which can be resource- and time-intensive.

Object space reservation
    Specifies a percentage of the logical size of a storage object that is reserved when a VM is provisioned. The default value is 0%, which results in a thin-provisioned volume. The maximum is 100%, which results in a thick volume. The value should be set to either 0% or 100% when using RAID-5/6 in combination with deduplication and compression.

Number of disk stripes per object
    Establishes the minimum number of capacity devices used for striping a replica of a storage object. A value higher than 1 may result in better performance, but may consume more system resources. The default value is 1; the minimum is 1 and the maximum is 12. vSAN may decide that an object needs to be striped across multiple disks without any stripe-width policy requirement. While the reasons for this vary, it typically occurs when a virtual machine disk (VMDK) is too large to fit on a single physical drive. If a specific stripe width is required, it should not exceed the number of disks available to the cluster.

Flash read cache reservation
    Refers to flash capacity reserved as read cache for a virtual machine object and applies only to hybrid configurations. By default, vSAN dynamically allocates read cache to storage objects based on demand. While there is typically no need to change the default value of 0 for this parameter, a small increase in the read cache for a VM can sometimes significantly improve performance. Use this parameter with caution to avoid wasting resources or taking resources from other VMs. The maximum value is 100 percent.

Failure tolerance method (FTM)
    Specifies whether the data protection method is Mirroring or Erasure Coding. RAID-1 (Mirroring) provides better performance and consumes less memory and fewer network resources, but uses more disk space. RAID-5/6 (Erasure Coding) provides more usable capacity, but consumes more CPU and network resources.

IOPS limit for object
    Establishes Quality of Service (QoS) for an object by defining an upper limit on the number of IOPS that a VM/VMDK can perform. The default behavior is that all workloads are treated the same and IOs are not limited. Use this rule to apply limits to less important workloads so that they do not adversely impact more important workloads; it is often used to address the noisy-neighbor issue. See the section on configuring IOPS limits for objects for more information.

Force provisioning
    Allows an object to be provisioned even though there are not enough resources available to meet the policy. The default of No is appropriate for most production environments. If set to Yes, an object can be created even if there are not enough resources available to satisfy other policy rules; in this case, the object appears as Not Compliant.

Default storage policy

If a storage policy is not specified when a VM is created, the default storage policy for the datastore is applied. You can change the default policy for a datastore by selecting the datastore and, under Settings, clicking Edit and specifying a new storage policy (as shown in the following figure).

Figure 14. Datastore default storage policy

This changes the default for any new VMs created on that datastore, but does not affect the policy of any VMs created previously.
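You can also view the default vSAN policy values that a host applies to each object class from the ESXi shell. A minimal sketch, assuming shell (SSH) access to a node:

    esxcli vsan policy getdefault

The output lists the default policy string for each object class (such as vdisk, vmnamespace, and vmswap), including the hostFailuresToTolerate value that corresponds to the FTT rule described above.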

Defining a new VM storage policy

Use the vSphere Web Client to define a VM storage policy on the Create New VM Storage Policy screen.

Figure 15. vSAN storage policy

1. From the vSphere Web Client, navigate to Policies and Profiles > VM Storage Policies and click Create a new VM storage policy. Select a vCenter Server from the drop-down menu, type a name and a description for the storage policy, and click Next.
2. On the Rule-Set 1 window, define the first rule set. Select VSAN from the Rules based on data services drop-down menu. The page expands to show the capabilities reported by the vSAN datastore.
3. Add a rule and supply appropriate values. Make sure that the values you provide are within the range of values advertised by the storage capabilities of the vSAN datastore. Review the storage consumption model to understand how the specified rules impact the required capacity.
4. (Optional) Add tag-based capabilities.
5. (Optional) Add another rule set.
6. Click Next and review the list of datastores that match this policy. To be eligible, a datastore must satisfy at least one rule set and all rules within that set. Verify that the vSAN datastore meets the requirements set in the storage policy and that it appears on the list of compatible datastores.
7. Click Finish when done.

The new policy is added to the list and can be applied to a virtual machine and its virtual disks.

Assigning a storage policy to a VM

When a VM is created, the default storage policy for the selected datastore is assigned to the VM (as shown in the figure below).

Figure 16. New Virtual Machine dialog, Select VM Storage

Changing a storage policy for a VM

You can change the storage policy for a VM while the VM is online, without impacting availability. Select the VM, navigate to Manage > Policies, and click Edit VM Storage Policy. Select the policy to assign to the VM and click OK, as shown in the figure below.

Figure 17. Manage VM, Edit VM Storage Policy

While the new policy is being applied, the compliance status may show as Noncompliant until the storage object is reconfigured to match the new policy. Availability is not impacted while the storage object is being reconfigured. For example, if the original policy used an FTM of Mirroring and the new policy has an FTM of Erasure Coding, a new replica of the storage object is created using erasure coding before the original mirrored copy is deleted. Therefore, storage capacity consumption may increase temporarily while the storage is being reconfigured to comply with the new policy.

Renaming VxRail objects within vCenter

Default names for VxRail objects are assigned when the system is initialized. You can modify object names as needed to follow local naming conventions.

Renaming a VxRail datacenter

When the VxRail cluster is configured onsite, the vSphere data center is assigned a default name. Use the procedure below to change the name to conform to local naming conventions. See the SolVe Desktop procedure for VxRail, available on support.emc.com, for more detail.

1. Log in to the vSphere Web Client as administrator@vsphere.local.
2. Select Home > vCenter Inventory List > Datacenters.
3. Right-click the VxRail cluster's data center and select Rename.
4. Enter the new name of the data center and click OK.

Renaming a VM folder

When the VxRail cluster is configured onsite, the default VM folder is assigned a default name in the vCenter Server inventory. Use the procedure below to change the name to conform to local naming conventions. See the SolVe Desktop procedure for VxRail, available on support.emc.com, for more detail.

1. Log in to the vSphere Web Client as administrator@vsphere.local.
2. Select VMs and Templates.
3. Right-click the Discovered virtual machine folder (in an internal vCenter) and select Rename. In an external vCenter, the VM folder's name is VMware HCIA Folder.
4. Delete the existing name, type a new name for the virtual machine folder in the text box, and click OK.

Renaming a vSAN datastore

When the VxRail cluster is configured onsite, a default name is assigned to the vSAN datastore in the vCenter Server inventory. Use the procedure below to change the name to conform to local naming conventions. See the SolVe Desktop procedure for VxRail, available on support.emc.com, for more detail.

1. Log in to the vSphere Web Client as administrator@vsphere.local.
2. Select Home > Inventory > Datastores.
3. Right-click the datastore and select Rename.
4. Enter the new name of the datastore, click OK, and wait for the task to complete.

Renaming a VxRail vSphere Distributed Switch

When the VxRail cluster is configured onsite, a default name is assigned to the vSphere Distributed Switch in the vCenter Server inventory. Use the procedure below to change the name to conform to local naming conventions. See the SolVe Desktop procedure for VxRail, available on support.emc.com, for more detail.

1. Log in to the vSphere Web Client as administrator@vsphere.local.
2. In vCenter Server, navigate to Home > Networking.
3. Right-click VMWARE HCI Distributed Switchxxx and click Settings > Edit Settings.
4. Enter the new name in the General > Name field and click OK.
5. Verify that the VDS name was changed in the vCenter Server Web Client, or from an ESXi console by running the following command:

    esxcli network vswitch dvs vmware list

Renaming a VxRail dvportgroup

When the VxRail cluster is configured onsite, a default name is assigned to each dvportgroup in the vCenter Server inventory. Use the procedure below to change the name to conform to local naming conventions. See the SolVe Desktop procedure for VxRail, available on support.emc.com, for more detail.

NOTE: Changing the name of a dvportgroup from the vCenter Server Web Client does not disconnect the virtual machines connected to the dvportgroup. The updated information is pushed to the ESXi/ESX hosts connected to the dvportgroup without causing an outage.

1. Log in to the vSphere Web Client as administrator@vsphere.local.
2. In vCenter Server, navigate to Home > Networking.
3. Right-click VMWARE HCI Distributed Switchxxx > dvportgroup and click Edit Settings.
4. Enter the new name in the General > Name field.
5. Click OK.
6. Use the vCenter Server Web Client to verify that the dvportgroup name has changed.

Choosing a failure tolerance method (FTM)

To protect virtual machine storage objects on a vSAN datastore from drive failure, the VxRail Appliance supports both mirroring and erasure coding failure tolerance methods (FTM). Mirroring (RAID-1) provides better performance and consumes less memory and fewer network resources, but uses more disk space. Erasure coding (RAID-5/6) uses parity-based protection and provides more usable capacity, but consumes more CPU and network resources.

The FTM for a virtual machine storage object is configured as an SPBM rule and works with the failures to tolerate (FTT) rule. If FTM=Mirroring and FTT=1, two replicas of the data are maintained; if FTM=Mirroring and FTT=2, there are three replicas. If FTM=Erasure Coding and FTT=1, RAID-5-like data protection is used; if FTM=Erasure Coding and FTT=2, RAID-6-like data protection is used.

The appropriate failure tolerance method depends on the workload characteristics and the VxRail system configuration. Generally, heavy write-intensive workloads are better suited to mirroring, and read-intensive workloads work well with erasure coding. Your Dell EMC representative can help you determine the optimal failure tolerance method by modeling the characteristics of your workload against a specific VxRail system configuration. A VxRail system can have multiple SPBM rules, allowing the flexibility to configure different FTM rules for different workloads that run on the same system. The following sections review how the FTM and FTT rules impact the configuration of storage objects within vSAN.

Mirroring

Mirroring is supported with both hybrid and all-flash VxRail models. If FTM=Mirroring and FTT=1, two replicas of the data are maintained; if FTM=Mirroring and FTT=2, there are three replicas.

In addition, vSAN uses the concept of a witness. When determining whether a component remains online after a failure, more than 50% of the components that make up a storage object must be available. Witnesses are components that contain only metadata; their purpose is to serve as tiebreakers when determining whether a quorum of components is online in the cluster. If more than 50% of the components that make up a virtual machine's storage object are available after a failure, the object remains online. If 50% or less of the components of an object are available across all the nodes in a vSAN cluster, that object is no longer available. The witness prevents split-brain syndrome in a vSAN cluster. For example, with FTT=1 mirroring, an object consists of two replicas plus a witness (three components); if one host fails, two of the three components remain available (more than 50%), so the object stays online.

The figure below illustrates a four-node cluster and a virtual machine with FTM=Mirroring and FTT=1. Note the two replicas and the witness.

Figure 18. VM with two replicas and witness

Erasure coding

Erasure coding is an alternative failure tolerance method. Erasure coding provides up to 50 percent more usable capacity than RAID-1 mirroring. Erasure coding is supported on all-flash models only.

Erasure coding breaks up data into chunks and distributes them across the nodes in the vSAN cluster. It provides redundancy by using parity. Data blocks are grouped in sets of n, and for each set of n data blocks, a set of p parity blocks exists. Together, these sets of (n + p) blocks make up a stripe. If a drive containing a block fails, the surviving blocks in the stripe are sufficient to recover the data. In VxRail clusters, the data and parity blocks for a single stripe are placed on different ESXi hosts in the cluster, providing failure tolerance for each stripe. Stripes do not follow a one-to-one distribution model: it is not a situation where the set of n data blocks sits on one host and the parity set sits on another. Rather, the algorithm distributes the individual blocks of each stripe among the ESXi hosts in the cluster.

Erasure coding provides single-parity data protection (RAID-5) that can tolerate one failure (FTT=1) and double-parity data protection (RAID-6) that can tolerate two failures (FTT=2). The figures below illustrate the two implementations.

A single-parity stripe uses three data blocks and one parity block (3+1), and it requires a minimum of four hosts or four fault domains to ensure availability if one of the hosts or disks fails. It represents a 30 percent storage savings over RAID-1 mirroring. RAID-5 (FTT=1) requires a minimum of four nodes.

Figure 19. Storage object with FTM=Erasure Coding, FTT=1

Dual parity saves as much as 50 percent capacity over RAID-1. It uses four data blocks plus two parity blocks (4+2) and requires a minimum of six nodes (as shown below).

Figure 16: Storage object with FTM=Erasure Coding FTT=2

The figure below compares the usable capacity for the mirroring and erasure-coding fault tolerance methods. As the figure shows, erasure coding can increase usable capacity by up to 50 percent compared to mirroring.

Figure 17: Usable capacity savings for Erasure Coding

With erasure coding, backend activity increases in the event of a drive failure. During a rebuild operation, a single read from the VM requires multiple reads from disk and additional network traffic, since the surviving drives in a stripe must be read to calculate the data of the failed member. This additional IO is the primary reason why only All-Flash VxRail configurations support erasure coding. The rationale is that the speed of flash disks compensates for the additional overhead.

Note: With VxRail 4.5, the rebuild rate is configurable, and this activity can be throttled to minimize the impact on other workloads that run on the cluster. However, by throttling resynchronization, the time that the data is exposed in the event of another drive failure increases.
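To make the usable-capacity comparison above concrete, the following worked example shows the raw capacity consumed by a hypothetical 100GB storage object under each protection method (illustrative arithmetic only; actual consumption also depends on metadata and enabled data services):

   Mirroring, FTT=1 (RAID-1):       100GB x 2 replicas = 200GB raw
   Mirroring, FTT=2 (RAID-1):       100GB x 3 replicas = 300GB raw
   Erasure coding, FTT=1 (RAID-5):  100GB x (3+1)/3    = ~133GB raw
   Erasure coding, FTT=2 (RAID-6):  100GB x (4+2)/4    = 150GB raw

At the same FTT, RAID-5 consumes roughly one-third less raw capacity than two-way mirroring, and RAID-6 consumes half the raw capacity of three-way mirroring.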

FTM considerations

When a drive or node fails, and there is free capacity and enough available nodes, data is rebuilt to ensure compliance with the SPBM rules. If there are not enough resources, data is still available, but a subsequent failure could result in data unavailability. Therefore, the best practice is to always configure the cluster with enough nodes to allow data to be rebuilt.

The following table contrasts the requirements and recommendations for data protection, comparing mirroring and RAID-5/6 erasure coding for the different FTT values.

Table 2. Requirements/recommendations for data protection

   FTM                       FTT   Nodes required   Nodes recommended
   Mirroring (RAID-1)        1     3                4
   Mirroring (RAID-1)        2     5                6
   Mirroring (RAID-1)        3     7                8
   Erasure coding (RAID-5)   1     4                5
   Erasure coding (RAID-6)   2     6                7

Note the number of nodes required for compliance and the recommended number of nodes that allows data to be rebuilt and maintain compliance with the SPBM policy.
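As a quick command-line sanity check of protection layout and rebuild headroom, the Ruby vSphere Console (RVC, covered later in this guide) provides read-only commands such as the following. This is a sketch only; run the commands from within the cluster directory in RVC, where "." refers to the current cluster and <path-to-vm> is a placeholder for the RVC path of a VM:

   vsan.vm_object_info <path-to-vm>
   vsan.whatif_host_failures .

The first command displays the component layout (replicas, stripes, and witnesses) of a VM's storage objects; the second estimates whether the remaining capacity is sufficient to re-protect data after a host failure.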

Configuring IOPS limit for objects (QoS)

By default, vSAN allows virtual machines to perform IO operations at the rate requested by the application. The number of IOs is limited only by the system configuration. All workloads get the same IO access, without regard to the importance of the workload. As a result, the performance of more important workloads can be adversely impacted by the amount of resources consumed by less important workloads. This impact is sometimes referred to as noisy neighbors. An example is an analysis and reporting application that drives a heavy IO workload twice a day and, when it does, impacts an OLTP application. While the analysis and reporting application may be important to the business, it is not as time-sensitive as the customer-facing OLTP application. Limiting the IOPS available to the analysis and reporting application would eliminate its adverse impact on the OLTP application.

The vSAN SPBM IOPS limit for objects rule limits the impact of noisy neighbors on more important workloads. This rule establishes Quality of Service (QoS) for an object by defining an upper limit on the number of IOPS a VM/VMDK can perform. When an object exceeds its IOPS limit, IO operations to that object are throttled. The algorithms used by vSAN intelligently handle bursts of activity and allow an object to double its IOPS-limit rate for the first second after a period of inactivity. By default, the IOPS limit for an object is set to zero and IOPS limits are not enforced.

Configuring IOPS limit for a VM

To configure IOPS limits for an object, first create a new SPBM policy with the IOPS limit for object rule, or modify an existing SPBM policy to add the rule. Then assign the SPBM policy to the VM.

Procedure: Adding a new rule

The following procedure describes how to create a new SPBM policy with the IOPS limit for object rule by cloning the default storage policy and adding a new rule.

1. From the vSphere web client, navigate to Policies and Profiles > VM Storage Policies, select the Virtual SAN Default Storage Policy, right-click, and select Clone.
2. In the Clone VM Storage Policy dialog, assign the new policy a name and description.
3. In the Clone VM Storage Policy dialog, click Rule-Set 1 and Add Rule. Select the IOPS limit for object rule. The screenshot below is an example of this dialog.

4. Specify the value for the IOPS limit.
5. Click OK to save the new SPBM policy.

Procedure: Assigning the policy to a VM

The following procedure describes how to apply the IOPS limit for object rule to a VM by assigning a policy that has the rule configured. This can be completed while the VM is online.

1. Within vCenter, select the VM in the navigation tree in the Hosts and Clusters view.
2. On the Manage > Policies tab, click Edit VM Storage Policy.
3. Select the policy with the IOPS limit for object rule configured. The screenshot below is an example of this dialog.
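When choosing a limit value, keep in mind that vSAN normalizes IO to a 32KB block size when enforcing IOPS limits, so larger IOs count as multiple IOs. A quick worked example with assumed numbers:

   A 64KB read or write counts as two normalized 32KB IOs.
   With an IOPS limit of 1,000, a VMDK can sustain about 1,000 32KB IOPS,
   but only about 500 64KB IOPS, before throttling begins.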

vSAN fault domains

vSAN uses fault domains to configure tolerance for rack and site failures. By default, a node is considered a fault domain. vSAN spreads storage object components across fault domains; therefore, by default, vSAN spreads storage object components across nodes.

Consider an 8-node cluster spread across four server racks, with two nodes in each rack. If FTT is set to 1 and fault domains are not configured, vSAN might store both replicas of an object on hosts in the same rack. If so, applications are exposed to rack-level failure. By defining the nodes in each rack as a separate fault domain, the system can be protected from rack-level failures.

The figure below shows an 8-node cluster spread across four physical racks with four separate fault domains:

Fault domain 1 (FD1) contains esxi-01 & esxi-02
Fault domain 2 (FD2) contains esxi-03 & esxi-04
Fault domain 3 (FD3) contains esxi-05 & esxi-06
Fault domain 4 (FD4) contains esxi-07 & esxi-08

In this scenario, vSAN ensures that storage object components are stored on nodes in different racks.

Figure 18. vSAN fault domains

When fault domains are configured, vSAN applies the storage policy to the entire domain instead of to individual hosts, and adjusts the placement of storage objects to make them compliant with the storage policy.

The following summarizes the recommendations and considerations for fault domains with VxRail:

To allow for complete chassis failure, all nodes in the same VxRail chassis should belong to the same fault domain.
Node requirements vary based on the failures to tolerate (FTT) SPBM rule and the number of fault domains configured.
The best practice is to configure each fault domain with the same number of nodes to ensure that sufficient resources are available to provide full coverage in the event of a failure.

Configuring fault domains

1. Within vCenter, navigate to the Hosts and Clusters view.
2. Select the VxRail cluster and, on the Manage tab, select Fault Domains & Stretched Cluster. The screenshot below is an example of this dialog.
3. Click the green + to create a new fault domain.
4. Specify a name for the fault domain, select the nodes that belong to it, and click OK.
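Once configured, the fault domain membership of a node can be spot-checked from the ESXi shell. A minimal sketch (the command is available on vSAN 6.x hosts; output varies by release):

   esxcli vsan faultdomain get

The command reports the fault domain to which the host currently belongs.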

VxRail deduplication and compression

VxRail All-Flash configurations offer advanced data services, including deduplication and compression. For many environments, data deduplication and compression can significantly increase usable storage capacity. When configured on VxRail, vSAN deduplication and compression occur inline when data is de-staged from the cache tier to the capacity tier. Data is first deduplicated by removing redundant copies of blocks that contain exactly the same data, and then compressed before being written to disk.

While all VMs are unique, they often share some amount of common data. Rather than saving multiple copies of the same data, identical blocks are saved once, and references to the unique blocks are tracked using metadata maintained in the capacity tier. The figure below shows how data is deduplicated inline before it is written to the capacity drives on VxRail.

Figure 19. Inline data deduplication

The deduplication algorithm is applied at the disk-group level and results in a single copy of each unique 4KB block per disk group. While duplicated data blocks may exist across multiple disk groups, limiting the deduplication domain to a disk group avoids the need for a global lookup table. This minimizes network overhead and CPU utilization, making VxRail deduplication very efficient.

LZ4 compression is applied after the blocks are deduplicated and before they are written to SSD. If compression reduces a block to 2KB or less, the compressed version of the block is persistently saved on SSD. Otherwise, the full 4KB block is written to SSD.
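The following worked example illustrates how the 4KB deduplication and 2KB compression rules combine (assumed numbers, for illustration only):

   Suppose VMs logically write 100 4KB blocks (400KB), of which 60 blocks are unique after deduplication.
   If 40 of the 60 unique blocks compress to 2KB or less, the capacity consumed is:
   (20 x 4KB) + (40 x 2KB) = 160KB
   yielding a combined data reduction ratio of 400KB / 160KB = 2.5:1 for this data set.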

While almost all data sets benefit from deduplication, typical virtual-server workloads with highly redundant data, such as full-clone virtual desktops or homogeneous server operating systems, benefit most. Compression provides further data reduction. Text, bitmap, and program files are very compressible, and a 2:1 compression ratio is common. Data types that are already compressed, such as some graphics formats, video files, or encrypted files, yield little or no data reduction through compression. VMs that are encrypted using vSphere encryption benefit little from vSAN deduplication and compression.

Considerations

While the VxRail deduplication method is very efficient, some CPU resources are used to compute the segment fingerprints, or hash keys, and additional IO operations are needed to perform lookups on the segment index tables. vSAN computes the fingerprints and looks for duplicated segments only when the data is being de-staged from the cache to the capacity tier. Under normal operations, VM writes to the write buffer in the cache SSD should incur no latency impact. Environments that benefit most from deduplication are read-intensive environments with highly compressible data. Use the figure below to determine the value of deduplication for an application environment.

Figure 20. Value of deduplication for specific application environments

Consult with your Dell EMC or VMware VxRail specialist, who can model your workload against a specific system configuration to help you decide if the benefit of deduplication offsets the resource requirements for your application workload.

Enabling deduplication and compression

Deduplication and compression are disabled by default and are enabled together at the cluster level. While deduplication and compression can be enabled at any time, enabling them when the system is initially set up is recommended to avoid the overhead and potential performance impact of deduplicating and compressing existing data through post-processing.

Procedure

This is an online operation and does not require virtual machine migration or DRS. The time required for this operation depends on the number of hosts in the cluster and the amount of data. You can monitor the progress on the Tasks and Events tab. If the system will use deduplication and compression, the best practice is to enable them at the time the system is initially set up.

1. Navigate to the Virtual SAN host cluster in the vSphere Web Client and click the Configure tab.
2. Under vSAN, select General.
3. In the vSAN is turned ON pane, click the Edit button.
4. Configure deduplication and compression. The figure below shows the vSAN setting for enabling compression.

Figure 21. Deduplication and compression enabled

5. Set deduplication and compression to Enabled.
6. Click OK to save your configuration changes.

When deduplication and compression are enabled, vSAN changes the disk format on each disk group in the cluster. To accomplish this change, vSAN evacuates data from the disk group, removes the disk group, and recreates it with a new format that supports deduplication and compression. The best practice is to enable deduplication and compression immediately after the system is initialized. If they are enabled on a production system, it is recommended that this be done during periods of low activity to minimize the potential performance impact.
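Because enabling deduplication and compression triggers a rolling evacuation and reformat of each disk group, it can be useful to watch the resulting resynchronization activity while the change runs. In addition to the vCenter Tasks and Events tab, one option is the RVC resync dashboard, sketched below (run from the cluster directory in RVC, where "." refers to the current cluster):

   vsan.resync_dashboard .

The output lists the objects currently resynchronizing and the amount of data remaining to move.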

Disabling deduplication and compression

Deduplication and compression are typically enabled for the cluster at the time of install and remain enabled. If you need to disable these data services, consider the following:

When deduplication and compression are disabled, the size of the used capacity in the cluster expands based on the deduplication ratio. Before you disable deduplication and compression, verify that the cluster has enough capacity to handle the size of the expanded data.

When disabling deduplication and compression, vSAN changes the disk format on each disk group of the cluster. To accomplish this change, vSAN evacuates data from the disk group, removes the disk group, and recreates it without deduplication and compression. This adds workload overhead to the system. The time required for this operation depends on the number of hosts in the cluster and the amount of data. You can monitor the progress on the vCenter Tasks and Events tab.

Procedure

1. Navigate to the vSAN host cluster in the vSphere Web Client and click the Configure tab.
2. Under vSAN, select General.
3. In the vSAN is turned ON pane, click the Edit button.
4. Set the disk claiming mode to Manual.
5. Set deduplication and compression to Disabled.
6. Click OK to save your configuration changes.

Monitoring deduplication and compression

You can monitor the efficiency of deduplication and compression from the storage view within the vCenter web client. This view shows the total capacity used both before and after deduplication and compression, as well as a breakdown of how the used capacity is consumed.

An example of this view is shown below.

Figure 22. Monitoring datastore showing deduplication and compression efficiency
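To interpret the numbers in this view, a quick example with assumed values: if Used Before is 10TB and Used After is 5TB, the deduplication and compression ratio is 10 / 5 = 2x. The same arithmetic also sizes the risk of disabling the feature: that data would expand back to roughly 10TB, so approximately 5TB of additional free capacity must be available before deduplication and compression are disabled.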

vSAN encryption services

VxRail 4.5 offers native security with data-at-rest encryption built into vSAN. Data-at-rest encryption protects all objects on a vSAN cluster by encrypting the entire datastore. Encryption is performed in the hypervisor, does not require specialized drives or other hardware, and can be optionally enabled on new or existing VxRail 4.5 or later clusters. Encryption is completely transparent to the virtual machines and fully compatible with all vSAN and vSphere features. Because encryption is performed just before the data is written to disk, the space efficiency benefits of deduplication and compression are not impacted. The encryption process is efficient and takes advantage of the Advanced Encryption Standard New Instructions (AES-NI) enabled by default on all VxRail 4.5 nodes. Key management is accomplished using customer-provided, KMIP 1.1-compliant key management technologies.

The following summarizes VxRail 4.5 data encryption features:

Data-at-rest encryption at the datastore level, with both cache and capacity drives encrypted
AES-256 encryption and simplified key management
Fully compatible with VxRail data efficiency options, including deduplication/compression and all protection types
Fully compatible with vSphere features such as vMotion, HA, and DRS
No special hardware required, since encryption is software based
Efficient, with no performance impact in a properly sized cluster and an estimated 5-15% CPU overhead
Can be optionally enabled on new or existing VxRail 4.5 clusters
vSAN Enterprise licensing is required to enable encryption on a VxRail cluster

Considerations

VMware offers two encryption methods for VxRail. VMs can be encrypted using vSphere encryption, or the entire cluster can be encrypted using vSAN encryption. Only one encryption method can be used on a cluster. The appropriate method depends on protection concerns and the impact on deduplication and compression.

vSAN encryption provides protection for data at rest and is effective in addressing concerns about media theft. Because data is encrypted after deduplication and compression, it gets the full benefit of these data services. vSphere VM-level encryption protects data while it is in motion (over the wire) and is designed to provide protection from a rogue administrator. With VM-level encryption, the data is encrypted before being stored on the vSAN datastore. Encrypted data typically does not deduplicate or compress well, so VM-level encryption benefits little from vSAN deduplication and compression.

Encryption is only supported on VxRail clusters that use customer-provided vCenter Servers; VxRail Manager-deployed vCenters do not support encryption. A customer-provided, KMIP 1.1-compliant key management technology is also required. For a full list of supported key management services, visit https://www.vmware.com.

vSAN encryption is very efficient. In a properly sized cluster, there should be no performance impact. If you are considering whether to enable encryption on an existing VxRail Appliance, work with your Dell EMC representative, who can model the impact of encryption on CPU overhead (estimated to be only 5-15%) for your configuration.

Setting up vSAN encryption on a VxRail cluster

If a system will use the data-at-rest encryption capabilities of VxRail 4.5, it is a best practice to enable encryption immediately after the system is initialized. While encryption can be enabled at any time, and this can be done online, all existing data on the vSAN datastore must be reformatted before the data is protected. Enabling encryption at the time the system is initialized minimizes the overhead and the time it takes to protect the system.

Note: Encryption is only supported in VxRail environments that are configured with customer-deployed vCenter Servers.

Setting up vSAN encryption on a VxRail cluster involves two steps. First, configure a Key Management Service (KMS) and establish a domain of trust between the KMS that generates the keys, vCenter, and the ESXi hosts that encrypt the data. The domain of trust follows standard Public Key Infrastructure (PKI)-based management of digital certificates. For instructions on how to set up a key management server, see the vendor documentation. After establishing the domain of trust, enable encryption in vSphere, which begins the automated process of reformatting the disks and encrypting existing data.

Setting up a domain of trust

Use the following procedure to set up the domain of trust between the KMS and the vCenter Server.

1. Within the vCenter web client, select the vCenter instance, click Configure, select Key Management Servers, and click the green plus sign (+). Specify the name of the KMS cluster, the IP address, and a port to use.

An example of this dialog is shown below.

Figure 23. Configure KMS

The options used depend on the KMS and local security policies. In the example dialog below, Root CA Certificates are used.

Figure 24. Select certificate type

2. Verify that the KMS connection state is normal (as shown below).

Figure 25. Verify KMS connection state

Enabling encryption for the VxRail cluster

Use the following procedure to enable encryption for a VxRail cluster.

Prerequisites

The domain of trust is set up.

Procedure

1. Using the vSphere web client, navigate to the vSAN host cluster and click Configure.
2. Under vSAN, select General.
3. In the vSAN pane, click Edit.
4. On the Edit vSAN settings dialog, check Encryption and select a KMS cluster. Click OK.

The figure below is an example of this dialog.

Figure 26. Enabling encryption

Two options are available when enabling or disabling encryption:

Erase disks before use. Check this box to perform a soft erase of the disks before writing new data. This option can lengthen the time it takes to complete the encryption process.

Allow reduced redundancy. vSAN relaxes the data protection rules and allows a reduced number of replicas during the disk format change (DFC) process. VxRail clusters with three nodes may require reduced redundancy, since there may not be enough headroom to move data during the conversion process.

When encryption is turned on, the VxRail cluster performs a DFC. The DFC creates a new partition on the disk that holds the metadata information and prepares the disk to encrypt all write operations. The automated DFC process for a cluster is performed one disk group at a time. When encryption is enabled, vSAN detects that a disk has never been encrypted and initiates the DFC operation. Existing data is moved out of the disk group and then written back to the encrypted drives. To complete this operation, there must be spare capacity in the cluster.

Connectivity to key management server

The key management server (KMS) is responsible for generating and storing the primary key used to encrypt data. While continuous connectivity to the KMS is not required, connectivity to the KMS is required when:

Enabling or disabling encryption
Performing key management activities such as rekeying
Replacing an encrypted drive
Rebooting an ESXi host while encryption is enabled
Adding a node to the VxRail cluster while encryption is enabled

VxRail networking

Within a VxRail cluster, multiple networks serve different functions and traffic types. It is important to understand both the physical and logical networks that make up a VxRail environment and their impact on services and hosted applications. The physical switches that connect the VxRail Appliance to the network are configured by the customer prior to installing the VxRail Appliance. The network details are documented during preplanning and used when initializing the system at the time of install. As part of an automated procedure, the physical network configuration is verified, and the network interfaces and vSphere virtual distributed switch are configured following proven best practices. Changes to the network configuration post-installation should be performed with extreme caution.

Physical network

Physical network considerations for VxRail are no different from those of any enterprise IT infrastructure: availability, performance, and extensibility. Generally, VxRail Appliances are delivered ready to deploy and attach to any 10GbE network infrastructure. A 10GbE dual-switch, top-of-rack (ToR) network configuration is recommended for most environments. The topology should be designed to eliminate all single points of failure at the connection level, the uplink level, and within the switch itself. The figure below shows typical network connectivity using two switches for redundancy. Single-switch implementations are also supported.

Figure 27. VxRail physical network
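To confirm that each node's uplinks are connected at the expected speed after cabling to the ToR switches, the physical NIC status can be listed from the ESXi shell:

   esxcli network nic list

The output includes the link state, speed, and duplex of each vmnic; in a 10GbE configuration, both uplinks should report Up at 10000 Mbps.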

1GbE option

A 1GbE switch option is available for smaller, less demanding workloads. Since lower bandwidth can affect performance and scale, the 1GbE option is only supported for hybrid storage configurations and is limited to a maximum of eight nodes and single-socket CPUs. With 1GbE, a minimum of four NICs is required per node, increasing the number of switch ports required.

VxRail VLANs

A virtual distributed switch (VDS) connects the physical switch ports to the logical components in a VxRail Appliance. The VDS is configured as part of the system initialization, and port groups are created spanning all nodes in the cluster. Network traffic is isolated at the port-group level using switch-based VLAN technology and vSphere Network IO Control (NetIOC), with a recommended minimum of four VLANs for the four types of network traffic in a VxRail cluster:

Management. Management traffic is used for connecting to the VMware vCenter web client, VxRail Manager, and other management interfaces, and for communications between the management components and the ESXi nodes in the cluster. Either the default VLAN or a specific management VLAN is used for management traffic.

vSAN. Data access for read and write activity, as well as for optimization and data rebuild, is performed over the vSAN network. Low network latency is critical for this traffic, and a specific VLAN is required to isolate it.

vMotion. VMware vMotion allows virtual-machine mobility between nodes. A separate VLAN is used to isolate this traffic.

Virtual Machine. Users access virtual machines and the services they provide over the VM networks. At least one VM VLAN is configured when the system is initially configured, and others may be defined as required.

The tables below show how NetIOC is configured for VxRail. Do not modify these values, since they have been set for optimum availability and performance.

Table 3. VxRail traffic on 10GbE NICs

Table 4. VxRail traffic on 1GbE NICs

Enabling multicast

By default, a switch in a VxRail network floods multicast traffic in the broadcast domain or VLAN. To mitigate this, vSphere services use IPv4 multicast (IGMP Querying/Snooping) to prune multicast traffic so that only the nodes that need the traffic receive it, thereby improving performance. Receivers register for specific multicast traffic only, and topology-change notifications are handled. Without IGMP Querying/Snooping, multicast traffic is treated like a broadcast transmission, which forwards packets to all ports on the network. IPv6 multicast functions similarly, sending multicast traffic only to multicast members. IPv6 multicast needs to be enabled on all ports used by the VxRail nodes, and IPv4 multicast is required for vSAN. For larger environments that span multiple switches, it is important that IPv4 and IPv6 multicast traffic is passed between them. Beginning with VxRail 4.5 and vSAN 6.6, vSAN uses unicast rather than multicast.

Virtual machine and optional data services networks

When the system is installed and initialized, a minimum of one VLAN is configured for guest virtual machines. Additional VLANs can be defined as required, and may be needed to support optional data services such as RP4VM and CloudArray. Consult the product documentation for details.

Optional NIC configurations

Some VxRail models may be configured with optional additional PCI NIC cards. While these are not managed by VxRail Manager, you can configure them using vSphere. These optional NICs are used exclusively for VM and application traffic; the kernel ports used by vSAN should not be configured to use these NIC ports.

Changing network configuration

During the initial install and configuration, VLAN IDs and IP addresses are configured based on the network planning checklist. Consult with Dell EMC Professional Services if you need to change the VLAN IDs or IP addresses of the ESXi hosts and data services.
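Before and after any such change, the vSAN network configuration on a node, including the VMkernel interface carrying vSAN traffic and, on releases prior to vSAN 6.6, the multicast group addresses, can be verified from the ESXi shell:

   esxcli vsan network list

The output shows the vmk interface used for vSAN along with its traffic parameters.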

Migrating workloads onto VxRail

The best strategy for moving compute and storage workloads onto VxRail depends on the environment and workload availability requirements. Different workloads often require different migration strategies, and the same strategies can be applied when migrating workloads between VxRail environments for workload redistribution.

Considerations

When considering migration strategies, data and storage are often the first topics that come to mind. There are, however, other considerations as well. If the workload can be taken offline for a period of time (cold migration), a simple shutdown, backup, and restore strategy works fine. If minimal downtime is required (warm migration), use tools like RecoverPoint for VM to replicate the workload and then perform a minimally disruptive switchover. If the workload cannot be taken offline (hot migration), use vSphere vMotion to keep the workload online during the migration.

While this document does not contain an exhaustive list of considerations, take note of the following:

Physical and logical network connectivity. VLANs may need to be configured on the physical switches connecting the target VxRail, and port groups for the application environment may need to be configured on the virtual distributed switch (VDS) in vCenter.

Access to network services and related applications. Consider all the services used by the application virtual machines, including DNS and NTP, and how these may change in the new environment. If there are dependencies on other applications or services, consider how the migration impacts these as well.

Snapshots. Snapshots will need to be configured on the target VxRail. If snapshots are used on the source, you must reconsolidate them and make the snapshot data persistent.

Backup processes and procedures. VxRail supports most backup technologies, but some reconfiguration may be necessary after the migration. As part of migration planning, reconsider and update current strategies.

Using vSphere vMotion

With vSphere Storage vMotion, a virtual machine and all its associated disk files can be moved from one datastore to another while the VM is up and running, so moving workloads onto a VxRail environment can be done while the workload is online. Consider the following vMotion requirements when planning your migration strategy:

Both the source and target vSphere clusters must be managed by the same vCenter instance. Only VxRail clusters with external vCenter Servers can leverage this approach.

The storage for the source cluster must be accessible by the hosts in the VxRail cluster. Because VxRail nodes do not support Fibre Channel connectivity, only iSCSI or NFS can be used.

Enhanced vMotion Compatibility (EVC) is enabled on the VxRail cluster by default and allows VMs developed for different generations of processors to be migrated onto the VxRail cluster. Check the VMware documentation for more information.

The process for migrating the virtual machine and its storage using vSphere vMotion is simple: select one or more VMs in the vSphere web client, right-click the selection, and click Migrate. The figure below shows the vMotion dialog. When moving an application onto a VxRail, select Change both compute resources and storage.

Figure 28. Migration using vMotion: Change both compute resources and storage

Additional dialogs are displayed as the target cluster is selected and verified. vSphere copies the files (VMX, NVRAM, VMDKs, and so on) from the source storage to the VxRail vSAN datastore. Potentially large amounts of data need to be copied; how long the migration takes is determined by the size of the dataset and the speed of the storage. In most cases, it is best to perform the migration a few VMs at a time. Using Storage vMotion does not require any downtime, and the virtual machines stay online as they are being migrated.

VMware vSphere Replication

Using host-based replication does not require the vSAN cluster to have access to the existing storage, making this a good strategy when the source vSphere environment uses Fibre Channel SAN storage. For host-based replication, vSphere Replication can be deployed on the target VxRail cluster as a virtual appliance, and replication is configured on the source VMs. One or more VMs can be configured and replicated together. Once the data has been replicated to the target VxRail cluster, vSphere Replication is used to recover the VMs on the VxRail cluster. This is considered a warm migration, since the recovery requires restarting the VM on the VxRail cluster.

This can be scheduled during a maintenance window.

Dell EMC RecoverPoint for VM

RecoverPoint for VM can be used to perform workload migration onto a VxRail cluster. RecoverPoint for Virtual Machines is included with VxRail and can be downloaded directly from the VxRail Manager Marketplace. Migrating VMs using RecoverPoint for VM is similar to using vSphere Replication: it is a warm migration where, after replication is set up and the data has been replicated, the VM and its data are recovered on the VxRail cluster. The source and target do not have to be part of the same vCenter, and the target does not need access to the source datastore. An example of a RecoverPoint for VM environment is shown below.

Figure 29. Migration from source vSphere environment and recovery on VxRail

Once migration is complete, you can continue to use RecoverPoint for VM as part of your disaster recovery strategy. RecoverPoint for VM supports flexible configuration options, including local and remote protection, and asynchronous and continuous replication.

Connecting iSCSI storage to VxRail

While VxRail does not support Fibre Channel connectivity, iSCSI can be used to connect external block storage to VxRail and for mobility between VxRail environments. Data on iSCSI storage is easily moved into the VxRail vSAN environment or between VxRail environments. iSCSI storage can be provided by Dell EMC Unity or any supported iSCSI array, and Dell EMC CloudArray can be used to present iSCSI storage to both VxRail and external clients.

The figure below shows a VxRail environment that includes iSCSI storage in addition to the vSAN datastore.

Figure 30. iSCSI provides data mobility into and between VxRail environments

iSCSI is a standard part of a vSphere environment. A software adapter using the NIC on an ESXi host is configured as an initiator, and targets on an external storage system present LUNs to the initiators. The external LUNs are typically used as VMFS datastores. iSCSI configuration is performed using the vSphere web client.

Procedure

1. Create a port group on the virtual distributed switch.
2. Create a VMkernel network adapter, associate it with the port group, and assign an IP address.
3. From the vCenter Manage Storage Adapters view, use the Add iSCSI Software Adapter dialog to create the software adapter. This step binds the iSCSI software adapter to the VMkernel adapter.

Once this is complete, iSCSI targets and LUNs can be discovered, used to create new datastores, and mapped to the hosts in the cluster. Refer to the VMware documentation for more details.
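For reference, the equivalent software-iSCSI setup can also be scripted per host with esxcli. This is a minimal sketch only; the adapter name (vmhba64), VMkernel port (vmk3), and target address are placeholders that will differ in your environment:

   esxcli iscsi software set --enabled=true
   esxcli iscsi networkportal add --adapter=vmhba64 --nic=vmk3
   esxcli iscsi adapter discovery sendtarget add --adapter=vmhba64 --address=192.168.10.50:3260
   esxcli storage core adapter rescan --adapter=vmhba64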

Connecting NFS storage to VxRail

NFS can be configured on VxRail to provide connectivity to external storage. NFS presents file-level storage seen as a datastore, whereas iSCSI presents block-level storage seen as a LUN. The following figure shows an NFS file system that has been exported from a network-attached server and mounted by the ESXi nodes in a VxRail environment. This enables data mobility into and between VxRail environments, as well as providing additional storage capacity.

Figure 31. NFS provides data mobility into and between VxRail environments

Using vMotion, storage objects can be easily moved between the NFS file system and the vSAN datastore.

Procedure

NFS is a standard vSphere feature and is configured using the vCenter web client.

1. In the Hosts and Clusters view, under Related Objects, open the New Datastore dialog and select NFS as the datastore type.
2. Specify the NFS version, the name of the datastore, the IP address or hostname of the NFS server that exported the file system, and the hosts that will mount it.

The NFS file system will appear like the vSAN datastore.

Using Dell EMC CloudArray

Dell EMC CloudArray is deployed on a VxRail Appliance as a virtual appliance directly from the VxRail Manager Marketplace. A 1TB starter license is included with VxRail. CloudArray uses local vSAN storage as a buffer to eliminate the latency normally associated with cloud storage. It presents either iSCSI LUNs or NFS/CIFS NAS storage, either to external clients or back to the VxRail cluster for use as a datastore. While CloudArray is normally backed by public or private cloud storage, this is not necessary: CloudArray can be used simply to present NFS or iSCSI storage without connecting to a cloud. For small environments, this can provide an easy means to migrate storage objects into a VxRail environment. For more information on using Dell EMC CloudArray with VxRail, see the whitepaper available on https://www.emc.com.

Conclusion

Different options are available for migrating virtual machines and their associated data onto a VxRail cluster. Options include backup and restore, replication where the VM is replicated and quickly restarted, and online migration using vSphere vMotion. Keep in mind that there may be considerations other than the VMs and data that need to be part of the planning process.

Resource utilization and performance monitoring

Capacity management for VxRail involves monitoring all system resources, including compute, memory, and storage utilization. Proactive monitoring enables you to better understand the system's current workload and assess its ability to handle additional workloads.

Compute and memory capacity management involves monitoring utilization. Storage capacity planning has multiple dimensions: capacity is both the ability to persistently store GB/TB of data and the system's ability to process the IO workload. The IO workload is measured by IO operations per second (IOPS), throughput (MB/sec), and latency (response time) measured in milliseconds.

The capacity and workload capabilities of a VxRail system are directly related to the number of nodes and the configuration of each node. With the range of models and the flexible configuration options available, VxRail systems can meet the requirements of a wide range of workloads. Compute capacity for a system is determined by the number of processors, the number of cores in each processor, and processor speed, with available memory sizes ranging from 64GB to 1536GB per node. IO capacity is related to the number of drives, the drive technology, the data protection techniques used, and the data services that are enabled.

The workload characteristics also significantly impact capacity and system capabilities. IOPS and throughput directly relate to IO block size and the size of the active working data set, and application concurrency impacts memory and compute efficiency. When a VxRail system is initially sized, all requirements, including the expected workload, are modeled to determine an optimal configuration. With the scale-out design, as requirements change, resources can be added to the system by either adding drives to existing nodes or adding nodes to the cluster. This provides the flexibility to configure only the resources needed in the near future, with the confidence that the system can seamlessly expand as requirements grow.

Monitoring system capacity, workload activity, and performance

Workload and performance monitoring is best performed as an ongoing process rather than as a task undertaken only when there is a problem. Systems are seldom configured exactly the same, and no two systems have the same workloads; therefore, no two systems perform exactly the same. Understanding what is normal for a specific environment makes it easier to recognize when there is a risk of not meeting Service Level Agreements (SLAs), and provides a baseline when investigating potential issues.

The VxRail Appliance provides the following monitoring tools:

vSphere vCenter includes fully integrated performance and workload monitoring services. Using the vSphere web client, performance charts for system resources, including CPU, memory, storage, and more, provide detailed information on workload activity and capacity consumption.

The vSphere web client also reports capacity of the vSAN datastore, deduplication and compression efficiency, and a breakdown of capacity usage.

VxRail Manager includes dashboard-level appliance monitoring.

Optional tools such as vRealize Operations can also be used to monitor a VxRail environment.

VxRail Manager Logical Health view

Monitoring begins with understanding the overall health of the system. The Health > Logical view within VxRail Manager provides a high-level view of system capacity and workload activity for each node individually or for the entire cluster, as shown in the figure below.

Figure 32. VxRail Manager Health > Logical view showing cluster-level capacity consumption

In this view, resource consumption is color coded. Alerts are generated when CPU and memory consumption exceeds 70%. For capacity planning purposes, observe both the cluster totals and each individual node. If consumption for individual nodes exceeds 70% but the cluster average is below 70%, using vMotion and/or DRS may help rebalance the workload across the nodes in the cluster.

The Logical view displays total storage capacity and used capacity. The first consideration is to have enough available capacity for near-term application requirements. As a general practice, maintaining at least 20% free capacity provides optimal performance. When disk capacity utilization exceeds 80% on any capacity drive, vSAN invokes an automatic rebalancing process. This increases the backend workload, which could potentially impact application performance. Maintaining at least 20% available capacity eliminates the need for this rebalancing. This additional capacity is sometimes referred to as slack space.

It is also important to maintain enough reserve capacity so that full workload availability is maintained while handling planned and unplanned disk or node events. Consider a four-node cluster: if one node were taken down for maintenance, the cluster would lose 25% of its capacity.

If storage utilization were at 80%, there would not be enough available capacity to maintain compliance with the failure tolerance method (FTM) SPBM rules. In addition to the 20% reserve capacity for vSAN slack space, a best practice is to maintain at least one node's worth of capacity for high availability during planned and unplanned events.

Monitoring storage capacity using the vCenter web client

The vCenter web client displays capacity details about the vSAN datastore, including deduplication and compression space efficiencies if enabled. The figure below shows the vCenter Storage view for the vSAN datastore.

Figure 33. vCenter Storage, Monitor, Virtual SAN

Note the deduplication and compression savings and ratio. Deduplication and compression savings vary, but most datasets generally experience between 1.5:1 and 2:1 capacity savings.

For longer-term capacity planning and analysis, the performance view below shows storage consumption trends. The dropdown allows you to select a time range of up to one year.

Figure 34. vCenter Storage, Monitor, Performance

Understanding consumption trends helps predict when more storage capacity will be required. You can add capacity to a VxRail cluster either by adding drives to existing nodes (if drive slots are available) or by adding nodes to the cluster.

Using native vSphere performance monitoring

vCenter includes monitoring of CPU, memory, and vSAN IO activity. You can view performance statistics directly from the vSphere web client Monitor > Performance view when a cluster, host, or VM is selected in the vCenter Server inventory. The specific metrics available vary depending on the object being monitored.

Viewing CPU and memory performance metrics

Compute and memory are critical to application performance. Capacity management balances the risk of degraded performance caused by insufficient resources against the inefficiency of underutilizing available compute and memory. You can view memory and CPU consumption at the cluster, host, and virtual machine levels. Use cluster-level metrics to understand overall utilization for capacity planning; this view helps you understand normal utilization and identify trends that can help determine when it is appropriate to add resources. Use host-level metrics to determine workload balance and for more detailed troubleshooting. Use VM-level information to understand the characteristics of an application workload.

The figure below is an example of the CPU and memory utilization for a cluster. In this example, the Overview level of detail is displayed; the Advanced view shows a more granular view of utilization.

Figure 35. vCenter Cluster, Monitor, Performance Overview

CPU consumption is reported in MHz. The total MHz available is calculated by multiplying the number of CPU cores by the speed of the processors; for example, a node with two 10-core 2,400MHz processors provides 2 x 10 x 2,400 = 48,000 MHz of compute capacity. Consumption is often bursty. As a general rule, CPU utilization should be under 70% to handle workload spikes. When CPU requirements exceed available resources, application performance may be impacted. Monitor CPU consumption at both the cluster and host levels. Scenarios where host-level CPU utilization is high but overall cluster-level utilization is low may benefit from using vMotion and/or DRS to rebalance the compute workload.

Memory consumption varies. Advanced vSphere ESXi memory management makes it possible to assign more memory to VMs than is physically available in the hosts. When a VM is configured, its memory size is specified; the full memory space, however, is not actually allocated. The ESXi hypervisor allocates only the amount of memory that is used, and a VM does not normally need its full allocation of memory at all times. For example, a VM allocated 4GB might only need the full 4GB of memory for 15 minutes a day; otherwise, it may only use 0.5GB. The ESXi hypervisor allocates 0.5GB of memory to the VM, increases it to 4GB only when needed, and then reclaims memory afterward for use elsewhere. In addition to memory allocation on demand, vSphere uses other memory-management techniques, including page sharing, memory ballooning, and compression. Memory capacity management balances the risk of running out of memory, and the potential performance impact, against the inefficiency of underutilizing the configured memory. Because of the bursty nature of memory utilization, memory utilization should average less than 70%.

More details about resource consumption are available in the Advanced level view. In the navigation tree, select the object to monitor, the chart options, and the view of interest. The Chart Options dialog allows you to specify the timespan and other options. Note that a shorter timespan exposes details that are lost when the data is averaged over longer timespans. The example below shows the CPU utilization for the selected ESXi host in real time.

Figure 36. vCenter ESXi host, Monitor, Performance Advanced

Enabling vSAN performance data collection

The vSAN performance data collection service is disabled by default and must be enabled before vSAN IO workload activity can be displayed. When enabled, the service collects and maintains vSAN performance statistics for both real-time and historical analysis. Enable the vSAN performance data collection service from the cluster view by selecting Configure -> Health and Performance. If the service is turned off, click Edit Setting to enable it.

Historical performance data is stored on the vSAN datastore as a storage object and, like any other storage object, it has a storage policy, which must be specified when enabling the Virtual SAN Performance Service.

Note: Turning off the Performance Service erases all existing performance data.

In the figure below, the vSAN Performance Service is turned on and uses the Default Storage Policy for the repository that contains the performance data.

Figure 37: Enabling Virtual SAN Performance Service for the vSAN cluster

Once the service has been enabled, vSAN performance data is collected. You can display graphs in the Monitor > Performance view when a cluster, host, or VM is selected in the vCenter Server inventory.

Viewing vSAN performance metrics

vSAN performance metrics are available at the cluster, host, or virtual machine level. Activity is shown from a front-end or backend perspective. Virtual SAN - Virtual Machine Consumption shows the front-end IO activity of the virtual machines. Virtual SAN - Backend shows activity from vSAN to the physical disks. To illustrate the two different views, consider a virtual machine with an SPBM failure tolerance method of mirroring (RAID-1) that performs 200 writes per second. The front-end view shows 200 write IOPS, while the backend view displays 400 write IOPS because, with mirroring, there are two replicas of the data.

A top-down approach is typically used when monitoring and analyzing systems, starting at the cluster level and working down through the ESXi hosts to the individual virtual machines. Use cluster-level metrics on the front-end to monitor overall system health and for capacity management and planning; these metrics help you understand the normal workload of the system and identify trends that can help determine when it is appropriate to add resources. Host-level metrics include additional data at the disk group and physical disk level; these metrics are useful in understanding workload balance and for more detailed troubleshooting. VM-level information is used to understand the normal behavior of an application and for troubleshooting.

The figure below shows vSAN cluster-level metrics for front-end activity.

Figure 38. vSAN cluster, Monitor -> Performance -> Virtual SAN Virtual Machine Consumption

The specific IOPS and throughput values are indicators of the system workload and provide little insight by themselves. Instead, monitor values over time to understand what is normal for a system and to identify trends that may indicate a need for additional resources.

Understanding vSAN performance metrics

The vSAN Performance Service has an extensive set of metrics that provide top-down insight from the cluster and host level down to individual virtual machines. See https://kb.vmware.com/ for a complete list of metrics. The table below describes key Virtual SAN - Virtual Machine Consumption front-end metrics and provides guidelines on how to interpret them. These metrics show IO from the client side of vSAN and represent reads and writes from a VM perspective. For capacity planning, monitor these metrics at the cluster and host level to understand normal workloads and to detect abnormalities.

Table 5. vSAN cluster-level front-end workload metrics

IOPS
What it means: A measure of the input/output operations per second consumed by all vSAN clients. Read and write activity is maintained and reported separately. The size of IO operations varies from a few bytes to a few megabytes.
How to interpret the data: IOPS for a system is a function of the workload characteristics and the system configuration, including the number of nodes, number of drives, data protection type, and other data services. High or low values are neither good nor bad, but indicate relative system utilization. Use this metric to understand normal workload activity and to identify when workloads deviate from normal.

Throughput
What it means: A measure of the data rate, calculated as block size times IOPS. Throughput and bandwidth are often used interchangeably. Read and write activity is maintained and reported separately.
How to interpret the data: Throughput for a system is a function of IOPS and block size. High or low values are neither good nor bad. Use this metric, along with IOPS, as an indicator of relative system utilization.

Latency
What it means: A measure of how long an IO operation takes to complete. Read and write latency is maintained and reported separately.
How to interpret the data: Latency is a function of the IO size, the cache hit rate, how busy the system is, and the system configuration. Application latency requirements vary. Use this metric to understand what is normal for a workload. Peaks of excessive latency may indicate bursty workloads. Consistently high latency requires further investigation and may indicate that the system needs additional resources.

Congestion
What it means: vSAN congestion occurs when the IO rate on the backend (lower layers) cannot keep up with the IO rate on the front-end. For more information on congestion, see this KB article: https://kb.vmware.com.
How to interpret the data: Sustained congestion is not usual; in most cases, it should be near zero. It is possible to see congestion during bursts of workload activity, and this can impact response time. If the system consistently shows high levels of congestion, further analysis is required, and it may indicate that the system needs additional resources.

Outstanding IO
What it means: When a virtual machine requests a read or write operation, the request is sent to the storage device. Until the request completes, it is considered an outstanding IO. Analogous to queuing.
How to interpret the data: While vSAN is designed to handle some number of outstanding IOs, outstanding IOs impact response times.

The table below shows key Virtual SAN - Backend metrics and provides guidelines on how to interpret them. These metrics show IO from vSAN to the disk groups and physical disks. The backend workload includes the overhead associated with data protection and the added workload of recovery and resynchronization. When viewed at the cluster level, these metrics can be used similarly to the front-end metrics for understanding the normal workload. At the host level, more granular information at the disk group and drive level is available that can be used for troubleshooting and to identify workload imbalance. For a complete list of metrics, see https://kb.vmware.com/.

Table 6. vSAN cluster-level back-end workload metrics

IOPS
What it means: The number of input/output operations per second. Separate read and write metrics are maintained. Additional metrics include Recovery Write IOPS and Resync Read IOPS (vSAN 6.6). Recovery writes occur during the resync of components that were impacted by a failure. Resync reads are the result of recovery operations or maintenance such as a policy change, maintenance mode/evacuation, rebalancing, and so on.
How to interpret the data: IOPS for a system is a function of the workload characteristics and the system configuration, including the number of nodes, number of drives, data protection type, and other data services. High or low values are neither good nor bad, but rather an indicator of the relative busyness of the system. Recovery writes require further investigation. Resync reads may be the result of normal maintenance operations and do not necessarily indicate a problem.

Throughput
What it means: A measure of the data rate, calculated as block size times IOPS. Separate read and write throughput metrics are maintained. Additional metrics include Recovery Write and Resync Read Throughput (vSAN 6.6). Recovery writes occur during the resync of components impacted by a failure; resync reads are the result of recovery operations or maintenance such as a policy change, maintenance mode/evacuation, rebalancing, and so on.
How to interpret the data: Throughput for a system is a function of IOPS and block size. High or low values are neither good nor bad. Use this metric, along with IOPS, as an indicator of the relative busyness of the system. Recovery writes require further investigation. Resync reads may be the result of normal maintenance operations and are not necessarily an indication of a problem.

Latency
What it means: How long it takes an IO operation to complete. Separate read and write latency metrics are maintained. Additional latency metrics include Recovery Write and Resync Read Latency (vSAN 6.6). Recovery writes occur during the resync of components impacted by a failure; resync reads are the result of recovery operations or maintenance such as a policy change, maintenance mode/evacuation, rebalancing, and so on.
How to interpret the data: Latency on the backend is a function of the IO size, the drive technology, and how busy the system is. Use this metric to understand what a normal workload looks like.

Congestion
What it means: vSAN congestion occurs when the IO rate on the lower layers (backend) cannot keep up with the IO rate on the upper layers. See the KB article at https://kb.vmware.com for more information.
How to interpret the data: Sustained congestion is not usual and typically should be near zero. Congestion may occur during bursts of workload activity and can impact response time. If the system consistently shows high levels of congestion, further analysis is required, and it may indicate that the system needs additional resources.

Outstanding IO
What it means: When a virtual machine requests a read or write operation, the request is sent to the storage device. Until this request is complete, it is considered an outstanding IO. Analogous to queuing.
How to interpret the data: While vSAN is designed to handle some number of outstanding IOs, outstanding IOs impact response times.

The tables above explain cluster-level metrics. Host-level metrics are more granular and include details down to the disk group and physical disk level, with more metrics available for analysis.

Advanced workload and performance analysis

Advanced performance analysis and troubleshooting may require more granular data. VMware Virtual SAN Observer, part of the Ruby vSphere Console (RVC), provides a command-line interface for capturing detailed performance statistics, including IOPS and latency at different layers in vSAN, read cache hits and misses, outstanding IOs, and congestion. This performance data can be accessed from a web browser or captured and sent to Dell EMC Support for advanced analysis and troubleshooting.

Creating a vSAN Observer performance statistics bundle

When working with Dell EMC Support, you may be asked for a vSAN Observer performance statistics bundle for offline analysis. To create a performance bundle, first log in to RVC and then run the command to create a bundle for a specific timespan. The following steps outline this procedure.

1. Open an SSH session and log in to the vCenter Server as root. If SSH is not enabled, from the vSphere web client go to Administration -> System Configuration -> Nodes, select the vCenter Server, and enable SSH in the Access settings.
2. Connect to the Ruby vSphere Console (RVC) as the Administrator user.

For a VxRail Manager deployed vcenter, use the following command: rvc administrator@vsphere.local@localhost

For a customer-deployed vcenter that runs on Windows, use the following command: %PROGRAMFILES%\VMware\vCenter\rvc\rvc.bat

3. Enter the user password.

4. Navigate to the vcenter directory using the following command: cd localhost

5. Navigate to the vsan Datacenter using the following command: cd <VxRail-Datacenter> (Note: replace <VxRail-Datacenter> with the name of your VxRail Datacenter.)

6. Navigate to computers using the following command: cd computers

7. Navigate to the VxRail Cluster using the following command: cd <VxRail-vSAN-Cluster-name> (Note: replace <VxRail-vSAN-Cluster-name> with the name of your VxRail Cluster.)

8. Run the following command to enable vsan Observer and create a performance bundle: vsan.observer . --generate-html-bundle /tmp --interval 60 --max-runtime 2

The output bundle is placed in /tmp unless another location is specified. The default interval is 60 seconds unless otherwise specified. The default maximum runtime is 2 hours unless otherwise specified.

9. After the collection completes, send the tar.gz file generated by vsan Observer in the /tmp directory to Dell EMC Support.

The analysis of the data collected by vsan Observer is outside the scope of this document. For further information on using vsan Observer, see https://kb.vmware.com/2064240 and the VMware Virtual SAN Diagnostics and Troubleshooting Reference.
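For convenience, a complete session might look similar to the following. This is a minimal sketch only: the placeholders match those used in the steps above, the annotations after # are not part of the commands, and the vsan.observer option syntax should be confirmed against your installed version with the RVC help command (help vsan.observer).

    ssh root@<vcenter-server-address>            # step 1: open an SSH session to the vcenter Server
    rvc administrator@vsphere.local@localhost    # step 2: connect to RVC (VxRail Manager deployed vcenter)
    cd localhost                                 # step 4: enter the vcenter directory
    cd <VxRail-Datacenter>                       # step 5: your VxRail Datacenter name
    cd computers                                 # step 6
    cd <VxRail-vSAN-Cluster-name>                # step 7: your VxRail Cluster name
    vsan.observer . --generate-html-bundle /tmp --interval 60 --max-runtime 2   # step 8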

VxRail software upgrades

All software updates to a VxRail Appliance should be applied as a bundle using VxRail Manager. Software components should not be updated individually using other tools such as VMware Update Manager unless directed to do so by Dell EMC Support. VxRail Manager and Dell EMC support are the sole sources for version control and cluster software updates.

Updates are developed by Dell EMC and VMware and tested as a bundle. The bundle may include updates to one or more components, including VxRail Manager, the VxRail Manager deployed VCSA and PSC, the vsphere ESXi hypervisor, vsan, firmware, and other software. The actual components that make up a bundle vary and could be a single software component for a bug fix or a collection of components for a minor or major code upgrade. Testing and applying updates as complete bundles ensures version compatibility and reduces risk.

Updates to both VxRail and VMware software components are applied across all nodes in a cluster using VxRail Manager. VxRail clusters that meet the minimal configuration requirements use a fully automated and non-disruptive process that is initiated and executed entirely from VxRail Manager. The figure below summarizes the overall software upgrade workflow.

Figure 39. High-level software upgrade workflow

VxRail upgrade bundles can either be downloaded from Dell EMC support or downloaded from the internet directly from VxRail Manager. The upgrade process first performs a readiness check to verify that the bundle is complete and compatible with the current running versions, and to ensure the system is in a healthy state before proceeding with the upgrade. When the upgrade completes, VxRail Manager performs post-checks to validate that the upgrade was performed successfully.

Some updates may require taking a node offline to complete. These are performed as rolling updates, one node at a time. During these upgrades, the node is put into maintenance mode and workloads are evacuated to other nodes in the cluster. This is accomplished using vsphere DRS and vmotion.

NOTE: Fully automated updates that require evacuation of workloads require a vsphere Enterprise Plus software license and a minimum of four nodes. Sufficient reserve capacity must be available on the other nodes to host the workloads during the rolling upgrade. Environments with fewer than four nodes or running vsphere Standard require assistance from Dell EMC support to perform an upgrade.

Depending on the type of upgrade, it may be a two-step process where VxRail Manager is upgraded first, followed by the upgrade of the other components. Before performing any software upgrades, consult the VxRail Appliance Software Release Notes. The SolVe Desktop includes detailed instructions for specific VxRail models and installed and target software versions.
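Although VxRail Manager runs its own readiness checks, you may wish to spot-check cluster health yourself before starting an upgrade window. The following is a minimal sketch using the RVC vsan health commands, assuming an RVC session opened as described in the vsan Observer procedure earlier; command availability depends on the vcenter and vsan versions installed.

    cd localhost/<VxRail-Datacenter>/computers
    vsan.health.health_summary <VxRail-vSAN-Cluster-name>   # overall vsan health test results
    vsan.resync_dashboard <VxRail-vSAN-Cluster-name>        # verify no component resyncs are in flight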

System scale-out upgrades

The VxRail Appliance is designed to scale up system capabilities as workload requirements grow. You can add NICs, memory, and storage capacity to existing nodes, or add nodes to the cluster. Adding storage and nodes to an existing cluster are customer procedures, with Dell EMC services available if needed. Other upgrades, such as adding memory and GPUs, are performed by Dell EMC Customer Service.

Adding a node to an existing cluster

New nodes are shipped from the factory with ESXi and other VxRail software pre-installed. Dell EMC Professional Services installs the nodes in the rack, cables them to the network switches, and connects power.

Procedure

1. Power on the new appliance. Once the appliance is powered on, it broadcasts its availability, and VxRail Manager displays the available node in the Dashboard view. The following figure shows an example of a newly detected node ready to be added to the cluster.

Figure 40. VxRail Manager Dashboard view showing new node ready to be added to cluster

2. Click Add Node to add the new nodes to the cluster. Prior to the VxRail 4.5 release, nodes were added one at a time. Beginning with VxRail 4.5, up to six new nodes can be selected and added in a single operation.

3. Enter the vcenter credentials. The figure below is an example screenshot of this prompt. Administrator credentials, or the credentials of a user with the necessary permissions to add a node to the cluster, are required.

Figure 41. VxRail Manager Cluster Expansion Wizard vsphere User Credentials

4. Allocate IP addresses to the cluster. Each node requires one IP address for each required network: ESXi host management, vsan, and vmotion. IP address pools are created when the cluster is first initialized and can include additional IP addresses to handle future expansion. If the pools do not have sufficient available IP addresses, more can be added in the dialog shown in the figure below.

Figure 42. VxRail Manager Cluster Expansion dialog Allocate new IP addresses

5. Enter the ESXi root and management account passwords.

Credentials can be shared across all ESXi hosts in the cluster, or each host can have its own unique credentials. This dialog shows the host names and IP addresses that will be assigned to the new nodes. An example of this dialog is shown below.

Figure 43. VxRail Manager Cluster Expansion dialog Host Details

DNS look-up records must be configured with the IP addresses and hostnames before continuing; a quick way to verify this is shown after this procedure. Confirm that the DNS records have been configured by clicking the checkbox at the bottom of the dialog.

6. Click Validate to continue.

7. If the validation succeeds, click Expand Cluster to add the new nodes. Progress is displayed in the VxRail Manager Dashboard view. When complete, the new nodes can be seen in the Cluster Health view.

8. Verify that the new nodes were added to the VxRail cluster in the vcenter Hosts and Clusters view.

Note: Any additional network configuration that was not part of the initial VxRail Appliance configuration must be manually added to the new node.
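Before clicking Validate, forward and reverse DNS resolution for each new node can be spot-checked from any workstation that uses the same DNS server. This is a minimal sketch; the hostname and IP address shown are placeholders for the values assigned to your new nodes.

    nslookup <new-node-hostname>      # forward lookup should return the node's management IP address
    nslookup <new-node-ip-address>    # reverse lookup should return the node's hostname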

Drive expansion procedure

NOTE: Both SSD and HDD disk drives are fragile and cannot tolerate rough handling. There is the potential of damaging a drive when electrostatic charges that accumulate on your body discharge through the circuits. Do not attempt any replacement procedures unless you understand and follow proper ESD handling procedures. If in doubt, have Dell EMC field personnel perform the procedure.

A VxRail Appliance can be scaled up by adding disks to a node if there are available drive slots. Take note of the restrictions listed below, and consult your Dell EMC systems engineer for specific details.

Only drives purchased from Dell EMC specifically for VxRail can be added to a VxRail Appliance. Other drive types are not supported and cannot be added to a VxRail cluster.

Drive placement in the VxRail Appliance is important. SSD drives to be used for cache must be installed in specific slots. These drive slot locations vary for different VxRail models.

Disk group configuration rules must be followed. For example, while there may be available slots within a node, it may not be possible to add additional drives to the disk group. A disk group includes one cache drive and a minimum of one capacity drive. The maximum number of capacity drives per disk group varies for different VxRail models. The specific SSD drive types used for caching and capacity are different.

While nodes within a VxRail cluster may have different drive configurations, a consistent drive configuration across all nodes in a VxRail cluster is recommended.

Procedure

The following procedure lists the high-level steps for adding drives to an existing VxRail node. For more detailed procedures for specific hardware configurations, see the procedures generated from the SolVe Desktop available from Dell EMC support https://support.emc.com/.

1. Within VxRail Manager, open the Health, Physical view for the node where you will be adding the drive and select the back view.

2. In the Node Information box, click Add Disks.

An example is shown below.

Figure 44. VxRail Manager Health -> Physical Add Disk

VxRail Manager attempts to identify any drive that has been added to the system but not yet configured, and reports the drive information. If you have not physically added any drives, you are asked whether the drive is a capacity or a cache drive, as well as for the slot number. When adding cache drives, you also create a new disk group and must add at least one capacity drive at the same time.

3. Unpack the new drives. Carefully open the shipping carton and remove the drives from the foam carrier, one drive at a time. Open the anti-static bag containing the drives, remove the drive, and place the drive on top of the bag until you are ready to insert it.

4. Remove the bezel from the server. If the bezel has a key lock, unlock the bezel with the key provided. Press the two tabs on either side of the bezel to release it from its latches and pull the bezel off the latches.

5. Install the new disks in the suggested slots.

6. In the VxRail Manager Add Disk dialog, click Continue. The newly added disk is discovered and details about the drive are displayed. This discovery process may take several minutes.

7. If the information is consistent with the disk you inserted, click Continue. VxRail Manager executes pre-checks on the new disks.

8. When the pre-checks complete, click Continue. The figure below is an example of this dialog.

Figure 45. VxRail Manager Health -> Physical Add Disk

The new disk is configured on the node and added to the VxRail vsan cluster. This process can take a few minutes to complete. When the disk addition completes successfully, the dialog reports that the new disk(s) have been added successfully.

9. Click Close.

10. In the VxRail Manager HEALTH > Logical tab, select the node and scroll down to the ESXi node. Verify that the host is now reporting the new disks and that there are no errors. The figure below is an example of this dialog.

Figure 46. VxRail Manager Health -> Logical

11. Log in to vcenter and perform a vsan health check. Verify that there are no errors.
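If desired, the new drive can also be verified from the ESXi command line on the host where it was added. This is a minimal sketch, assuming SSH is enabled on the host; the exact output format varies by ESXi version.

    ssh root@<esxi-host-address>      # the host where the drive was added
    vdq -q                            # shows each device and whether it is in use by (or eligible for) vsan
    esxcli vsan storage list          # lists the disks claimed by vsan, including disk group membership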

Replacing failed components

VxRail Appliances are designed to be highly resilient. All models are configured with redundant power and cooling, and use vsan data protection rules to eliminate single points of failure and keep the system operational. Disk drives and power and cooling components are hot-pluggable and customer replaceable using automated procedures initiated from VxRail Manager. Other components, including memory, processors, and even full nodes, may be replaced by Dell EMC field personnel while keeping the cluster fully operational.

NOTE: Do not remove a faulty FRU until you have a replacement available. To receive a replacement component, you must open a service request with Dell EMC Support.

NOTE: When handling a FRU during replacement, there is the potential of inadvertently damaging the sensitive electronics when an electrostatic charge that accumulates on your body discharges through the circuits. Do not attempt any replacement procedures unless you understand and follow proper ESD handling procedures. If in doubt, have Dell EMC field personnel perform the procedure.

Identifying failed components

In VxRail Manager, identify failed components on the Physical tab within the HEALTH view. You initiate replacement actions from within VxRail Manager as well. The figure below is an example of the Physical tab within the HEALTH view of VxRail Manager. A failed component is identified with a Critical error.

Figure 47. VxRail Health, Physical view

This example shows a drive with a status of FAILED.

Replacing a capacity drive

VxRail systems are designed for maximum availability. Virtual machines in a VxRail cluster configured with a Failures to Tolerate SPBM rule of at least 1 tolerate a single drive failure without data unavailability or loss.

vsan differentiates between a drive failing and a fully operational drive being removed from the system. When a drive fails, vsan assumes that it will not return to operation, and the storage components on the failed drive are rebuilt across the surviving drives in the cluster (if there are sufficient resources). While the data is being rebuilt, the component status in vcenter appears as degraded.

When a fully operational drive is removed from the system, vsan assumes that the drive will return to the system, and no remediation action is taken immediately. In vcenter, the component status appears as absent. If the drive is not replaced after 60 minutes (by default), a rebuild operation is initiated to make the storage component compliant with the storage policy (if there are sufficient resources available in the cluster).

Procedure

The following procedure lists the high-level steps for replacing a capacity drive. For more detailed procedures for specific hardware configurations, see the procedures generated from the SolVe Desktop available from Dell EMC support https://support.emc.com/.

1. On the VxRail Manager Health page, Physical view, click the failed drive to display the disk information.

2. Click HARDWARE REPLACEMENT to initiate the drive replacement. The automated procedure verifies the system and prepares for drive replacement.

3. Remove the bezel from the VxRail appliance. If the bezel has a key lock, unlock the bezel with the key provided. Press the two tabs on either side of the bezel to release the bezel from its latches and pull the bezel off the latches.

4. Identify the drive to be replaced by its blinking locator LED. Remove the failed drive by pressing the drive release button, pulling the drive release lever completely open, and then sliding the drive and carrier out of the system.

5. Remove the failed drive from the drive carrier by removing the four small screws on the side. Install the replacement drive into the carrier and replace the four screws.

6. Install the replacement drive and carrier back into the system by carefully inserting it into the slot in the appliance until it is fully seated. Close the handle to lock it into place.

7. Follow the prompts in the VxRail Manager drive replacement procedure. The drive is verified and added back into the vsan cluster.

8. Reinstall the bezel by pushing the ends of the bezel onto the latch brackets until it snaps into place. If the bezel has a key lock, lock the bezel with the provided key and store the key in a secure place.

9. On the VxRail Manager Health page, Physical view, verify that the drive shows a Healthy status. Refresh the page if needed.

10. Log in to vcenter and perform a vsan health check.
a. In the navigation panel, select the vsan cluster.
b. In the Monitor tab, select Virtual SAN > Health.
c. Click Retest.
d. Verify that there are no errors.

Replacing a cache drive

A VxRail cluster configured with a Failures to Tolerate SPBM rule of at least 1 tolerates the failure of an SSD cache drive without data unavailability or loss. When a cache drive fails, the entire disk group that contains the failed cache drive goes offline, and the storage components in the disk group are automatically rebuilt on other disk groups in the cluster.

Procedure

The following procedure lists the high-level steps for replacing a cache drive. For more detailed procedures for specific hardware configurations, see the procedures generated from the SolVe Desktop available from Dell EMC support https://support.emc.com/.

1. On the VxRail Manager Health, Physical view, identify the failed drive and click it to display the disk information.

2. Click HARDWARE REPLACEMENT to initiate the drive replacement. The automated procedure performs the necessary steps to verify the system and prepare for drive replacement.

3. Remove the bezel from the VxRail appliance. If the bezel has a key lock, unlock the bezel with the provided key. Press the two tabs on either side of the bezel to release the bezel from its latches and pull the bezel off the latches.

4. Identify the drive to be replaced by its blinking locator LED. Remove the failed drive by pressing the drive release button, pulling the drive release lever completely open, and then sliding the drive and carrier out of the system.

5. Remove the four small screws on the side of the failed drive and remove it from the drive carrier.

6. Install the replacement drive into the carrier and replace the four screws.

7. Install the replacement drive and carrier back into the system by carefully inserting it into the slot in the appliance until it is fully seated. Close the handle to lock it into place.

8. Follow the prompts in the VxRail Manager drive replacement procedure. The drive is verified and added back into the vsan cluster.

9. Reinstall the bezel by pushing the ends of the bezel onto the latch brackets until it snaps into place.

If the bezel has a key lock, lock the bezel with the provided key and store the key in a secure place.

10. On the VxRail Manager Health page, Physical view, verify that the drive shows a Healthy status. Refresh the page if necessary.

11. Log in to vcenter and perform a vsan health check.
a. In the navigation panel, select the vsan cluster.
b. In the Monitor tab, select Virtual SAN > Health.
c. Click Retest.
d. Verify that there are no errors.

Replacing a power supply

A VxRail Appliance has redundant hot-swappable power supplies. You can remove a faulty Power Supply Unit (PSU) and insert a new one without having to shut down the node.

Procedure

The following procedure lists the high-level steps for replacing a power supply. For more detailed procedures for specific hardware configurations, see the procedures generated from the SolVe Desktop available from Dell EMC support https://support.emc.com/.

1. On the VxRail Manager Health, Physical view, identify the failed component.

2. Disconnect the power cable from the power source and from the PSU.

3. Press the release latch and slide the power supply unit out of the chassis.

4. Slide the new power supply unit into the chassis until it is fully seated and the release latch snaps into place.

5. Connect the power cable to the power supply unit and plug the cable into a power outlet. The system takes approximately 15 seconds to recognize the power supply unit and determine its status. The power supply status LED turns green, signifying that it is working properly.

6. On the VxRail Manager Health page, Physical view, verify that the error is resolved and the PSU shows a Healthy status. Refresh the page if needed.
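As a closing check after any component replacement, the vsan health tests in step 11 above can also be run from the ESXi command line. This is a minimal sketch, assuming the esxcli vsan health namespace is present in your installed ESXi version; it also shows how to read the default 60-minute rebuild delay referenced in the capacity drive section. Do not change that setting unless directed by Dell EMC Support.

    ssh root@<esxi-host-address>              # any host in the cluster
    esxcli vsan health cluster list           # summary of vsan health test results for the cluster
    esxcfg-advcfg -g /VSAN/ClomRepairDelay    # minutes vsan waits for an absent drive before rebuilding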