Service Edge Virtualization - Hardware Considerations for Optimum Performance

Executive Summary

This whitepaper provides a high-level overview of Intel-based server hardware components and their impact on virtualization. It is intended as a guide to help software developers and system integrators choose open-standard networking platforms for service edge applications.

Choosing the right hardware

Over the past several years, the focus of server virtualization has increasingly shifted to the networking space. Both wireless and wireline networking device vendors have created virtual versions of their dedicated hardware devices, which has also opened opportunities for many independent software vendors (ISVs) and new entrants to deliver networking services in innovative ways. This is made possible by advances in generic x86 Intel architecture-based hardware platform designs. Features such as Intel virtualization technologies (VT-x, VT-d, VT-c), Intel TXT, AES-NI, SR-IOV, PCI Express pass-through, DPDK and QuickAssist make it possible to run many workloads that previously required specialized ASICs and network processor units (NPUs) on traditional off-the-shelf Intel-based servers. As a result, a growing number of workloads can now take advantage of Intel virtualization technologies and provide even lower cost alternatives to traditional proprietary networking gear.

However, virtualization brings its own problems. One key challenge is mapping virtualized workloads to hardware, i.e. CPU, memory, I/O and storage. Most software vendors use a dual-socket server as the reference platform for their application. In some cases, however, a standard dual-socket server is overkill for a specific application or workload, does not offer the most appropriate configuration, or is simply not optimized for network function scale-up or scale-out. Customers are then left to determine how best to optimize hardware and software to match their desired workloads and throughput. Every component that makes up a system must be considered, including motherboard design, BIOS, IPMI, CPU, integrated Intel VT features, memory mapping and memory type, PCIe lane mapping, integrated I/O and storage.

Wireless and wireline networks have already taken great advantage of virtualization. In the Radio Access Network (RAN) in particular, virtualization has enabled service providers to replace services such as the Evolved Packet Core (EPC), consisting of MME, SGW, PDN GW and PCRF functions, or Deep Packet Inspection (DPI) in the Gi-LAN, with Virtual Network Functions (VNFs) running on standard servers. In wireline data networks, virtualization has caused even greater disruption; many services that used to run on dedicated equipment at customer premises are now replaced by a single box that consolidates them, and in some cases customers can move those services into the cloud.

Hardware considerations for NFV

CPU to PCIe lane mapping

With regard to non-uniform memory access (NUMA), all modern multi-socket x86 architectures support NUMA in hardware. A Virtual Machine (VM) should be deployed within a single NUMA node; workload performance degrades whenever the workload has to cross NUMA node boundaries. NUMA node memory affinity is typically handled by operating systems and hypervisors, but some applications may require it to be configured manually.

Memory bank population

In a multiprocessor system it is important to pay attention to how memory DIMMs are populated. A balanced configuration can be optimized for density and/or performance. It is relatively easy to work out how to configure a server for maximum memory capacity from the number of DIMM slots available and the type of memory supported, but it is not always clear which factors matter for performance optimization. Several factors need to be considered; these are enumerated below.

DIMM speed: The faster the memory, the less time memory requests have to wait before they can be executed.

Rank: The optimal rank count depends on the application. More ranks can help applications parallelize memory requests, but for applications that need low latency, more ranks can result in lower performance.

CAS latency: This relates to DRAM response time; essentially the number of clock cycles between the memory controller issuing a column access (CAS) and the data becoming available. It goes hand in hand with DIMM speed. When selecting DIMMs, pay attention to CAS latency as well as DIMM speed, and to the resulting absolute memory latency; for example, a DDR4-2400 DIMM with CL17 responds in roughly 17 cycles / 1200 MHz, or about 14 ns.

ECC or non-ECC? For networking and mission-critical services, ECC memory is an essential feature for server reliability. ECC support is mandatory in most data center, cloud and enterprise IT infrastructures, as it is a crucial element in delivering acceptable service levels to enterprise customers.
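Before moving on to memory module types, the sketch below illustrates the NUMA affinity point made at the start of this section. It is a minimal example, assuming a Linux host with libnuma available (compile with -lnuma); a data-plane process can use the same calls to keep its packet buffers on the NUMA node that owns its cores rather than on a remote node:

```c
#define _GNU_SOURCE
#include <numa.h>        /* libnuma: numa_available, numa_node_of_cpu, numa_alloc_onnode */
#include <sched.h>       /* sched_getcpu */
#include <stdio.h>
#include <stdlib.h>

#define BUF_SZ (64UL * 1024 * 1024)   /* example buffer size, purely illustrative */

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA policy not supported on this host\n");
        return EXIT_FAILURE;
    }

    /* Find the NUMA node that owns the CPU this thread is running on. */
    int cpu  = sched_getcpu();
    int node = numa_node_of_cpu(cpu);

    /* Allocate the buffer from that node's local memory so the data path
     * never has to cross a socket interconnect to reach it. */
    void *buf = numa_alloc_onnode(BUF_SZ, node);
    if (buf == NULL) {
        perror("numa_alloc_onnode");
        return EXIT_FAILURE;
    }

    printf("cpu %d -> NUMA node %d, %lu MiB allocated node-locally\n",
           cpu, node, BUF_SZ >> 20);

    numa_free(buf, BUF_SZ);
    return EXIT_SUCCESS;
}
```

Hypervisors expose equivalent controls (for example NUMA placement policies in libvirt or OpenStack flavors), so the same locality can be enforced for entire VMs rather than individual buffers.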

Memory type

UDIMMs: UDIMMs are not a good option for applications that require low latency. The memory controller accesses the memory on the DIMM by interfacing with each DRAM individually, resulting in lower performance and capacity.

RDIMMs: In the RDIMM case, the memory controller interfaces with a register on the DIMM, and the register then communicates with the actual DRAM. This adds an extra clock cycle but allows for better latency and density than UDIMMs.

LRDIMMs: For LRDIMMs, the register is replaced by an isolation memory buffer (iMB, by Inphi). The memory buffer re-drives all of the data, command, address and clock signals from the host memory controller and provides them to the multiple ranks of DRAM. LRDIMMs support rank multiplication, where multiple physical DRAM ranks appear to the host controller as a single logical rank of larger size. However, the relatively high cost of LRDIMMs is a major limitation.

Intel Virtualization Technologies (VT)

Intel VT provides hardware assists for CPU virtualization, memory virtualization, I/O virtualization, Intel graphics virtualization, and virtualization of security and network functions.

Intel VT-x: hardware assist for virtualization

VT-x, previously codenamed Vanderpool, provides CPU and chipset features to enable virtualization. The following is a brief explanation of the key CPU and memory virtualization features.

Intel FlexPriority: Improves virtual machine access to the Task Priority Register (TPR) and eliminates most of the overhead that VMs incur on guest TPR accesses. It is designed to accelerate VMM interrupt handling and thereby improve overall performance.

Intel FlexMigration: Allows the Virtual Machine Monitor (VMM) to abstract the available processor features and report a consistent feature set to the guest OS (VM). Thanks to Intel FlexMigration, a VM can migrate seamlessly from one generation of Intel processor to the next, or even across different families of Intel processors.

CPUID virtualization: Allows a VMM to virtualize the CPUID instruction, so VMs can be migrated to machines or processors that support different features.

Virtual Processor IDs (VPID): Lets the VMM tag address translations with an identifier per virtual processor, so the translation look-aside buffer (TLB) does not have to be flushed on VM entry or exit. Performance benefits are workload dependent.

Guest preemption timer: With this feature the VMM can preempt execution of a guest OS. It is a programmable timer that can be set to cause a VM exit when it expires, and it allows the VMM to add quality of service (QoS) features in specific implementations.

Descriptor-table exiting: Allows a VMM to protect a guest from internal attack; a useful hook for VM security agents to enhance security features.

Pause-Loop Exiting (PLE): Ensures that no single vCPU running a VM locks up or burns its time slice spinning on a lock. Excessive pause loops are detected and trigger a VM exit, allowing the VMM to intervene.

Extended Page Tables (EPT): Allows virtualization of physical memory. When EPT is in use, guest-physical addresses are not used to access memory directly; instead they are translated through a set of EPT paging structures to produce the host-physical addresses that actually access memory.

Intel VT for Directed I/O (VT-d)

This feature is implemented in the chipset. When Intel VT-d is enabled, the guest OS can either use the traditional approach or, as needed, use pass-through devices. In pass-through mode, the PCI device is not claimed by the hypervisor; it is assigned directly to a VM, which then sees the physical PCI device. Intel VT for Directed I/O provides VMM software with the following capabilities:

- I/O device assignment: flexibly assigning I/O devices to VMs and extending the protection and isolation properties of VMs to I/O operations.
- DMA remapping: address translation for Direct Memory Accesses (DMA) from devices.
- Interrupt remapping: isolation and routing of interrupts from devices and external interrupt controllers to the appropriate VMs.
- Interrupt posting: direct delivery of virtual interrupts from devices and external interrupt controllers to virtual processors.
- Reliability: recording and reporting of DMA and interrupt errors, which might otherwise corrupt memory or impact VM isolation, to system software.
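As a quick host-side sanity check for the VT-x and VT-d capabilities described above, the following minimal sketch (assuming a Linux host, GCC or Clang, and a hypothetical PCI address for the NIC to be passed through) reads the VMX bit from CPUID and looks up the device's IOMMU group, which only exists when the IOMMU/VT-d is active:

```c
#include <cpuid.h>      /* __get_cpuid (GCC/Clang helper) */
#include <stdio.h>
#include <string.h>
#include <unistd.h>     /* readlink */

/* Hypothetical PCI address of the NIC intended for pass-through. */
#define NIC_BDF "0000:03:00.0"

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1, ECX bit 5 = VMX (Intel VT-x). */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        printf("VT-x (VMX): %s\n",
               (ecx & (1u << 5)) ? "supported" : "not supported");

    /* With VT-d enabled (e.g. intel_iommu=on), every PCI device belongs to an
     * IOMMU group; the group defines what must be passed through together. */
    char path[128], target[256];
    snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/iommu_group", NIC_BDF);

    ssize_t n = readlink(path, target, sizeof(target) - 1);
    if (n > 0) {
        target[n] = '\0';
        const char *slash = strrchr(target, '/');
        printf("%s is in IOMMU group %s\n", NIC_BDF, slash ? slash + 1 : target);
    } else {
        printf("%s: no IOMMU group (is VT-d enabled in BIOS and kernel?)\n", NIC_BDF);
    }
    return 0;
}
```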

Intel Virtualization Technology for Connectivity (Intel VT-c)

Intel Virtualization Technology for Connectivity (Intel VT-c) enables lower CPU utilization, reduced system latency, and improved networking throughput. VT-c is a key feature of many server-class Intel network controllers and consists of the following main virtualization technologies.

Virtual Machine Device Queues (VMDQ): VMDQ offloads the packet-sorting burden from the VMM to the network controller to accelerate network I/O throughput. It is a hardware-assist feature of Intel networking silicon that improves VM networking performance and lowers VMM CPU utilization.

Single-Root I/O Virtualization (SR-IOV): SR-IOV is part of a PCI Special Interest Group (PCI-SIG) specification. It provides a standard method for sharing a single PCIe device concurrently among multiple VMs. More interestingly, it allows multiple virtual functions to be created from a single PCIe physical function. Each virtual function gets a slice of the physical function and can be allocated to a VM using standard interfaces.

Intel TXT

Intel Trusted Execution Technology (Intel TXT) provides a hardware-based security foundation on which to build and maintain a chain of trust that protects the platform from software-based attacks. In virtualization, Intel TXT can help prevent the spread of infected VMs from one machine to another: in the case of a compromised VM, Intel TXT can block migration of that VM to another machine by creating "trusted pools", thus limiting VM migration.

Intel AES-NI

AES-NI is a set of six new instructions that accelerate the Rijndael (AES) algorithm in hardware. For networking applications, this feature encrypts and decrypts messages and data while consuming far fewer CPU cycles. All major hypervisors support AES-NI and allow its use in guest OSs.

Intel DDIO

Intel Data Direct I/O Technology allows Intel network adapters to talk directly to the processor cache, accessing the cache instead of going through main memory first. Intel DDIO is available on all Intel Xeon E5 processors and has no hardware, OS or hypervisor dependency. Application developers and system integrators that want to take advantage of this feature need to make sure that I/O devices and add-in cards, e.g. Ethernet adapters and RAID controllers, are built with DDIO technology. For many NFV use cases DDIO can greatly improve performance, as data from the NIC is available for processing and service chaining sooner.

Intel CMT/CAT

Intel Cache Monitoring Technology / Cache Allocation Technology (CMT/CAT) is available on all Intel Xeon processor D and Intel Xeon processor E5-2600 v4 (and some v3) SKUs. CMT/CAT allows hypervisors to determine cache utilization per application, providing a level playing field for multiple VNFs or VMs running on the same processor. CMT can be used to monitor cache occupancy per VNF or VM, whereas CAT allows control of the CPU's shared last-level cache (LLC). With CMT and CAT, applications can implement QoS by metering and monitoring VNFs or VMs, e.g. taking action if a VNF is using less than, or more than, a given percentage of the LLC. This technology also provides a way to detect noisy or aggressive VNFs or VMs and to implement alarms and actions.
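To make the SR-IOV mechanism described above concrete, here is a hedged sketch (assuming a Linux host, root privileges and a hypothetical PCI address for an SR-IOV-capable NIC) that creates virtual functions by writing to the standard sriov_numvfs sysfs attribute:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical PCI address of an SR-IOV capable physical function (PF). */
#define PF_BDF  "0000:03:00.0"
#define NUM_VFS 4   /* must not exceed the device's sriov_totalvfs value */

int main(void)
{
    char path[128];
    snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/sriov_numvfs", PF_BDF);

    /* Writing N creates N virtual functions; if VFs already exist, 0 must be
     * written first before the count can be changed. */
    FILE *f = fopen(path, "w");
    if (f == NULL) {
        perror(path);
        return EXIT_FAILURE;
    }
    fprintf(f, "%d\n", NUM_VFS);
    if (fclose(f) != 0) {
        perror("enabling VFs failed");
        return EXIT_FAILURE;
    }

    printf("Requested %d VFs on %s; new PCI functions should now appear\n",
           NUM_VFS, PF_BDF);
    return EXIT_SUCCESS;
}
```

Each resulting VF shows up as its own PCI device that can be bound to vfio-pci and assigned directly to a VNF or VM, or attached through a vswitch, depending on the deployment model.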

DPDK

The Data Plane Development Kit (DPDK) is probably the most important piece of software for turning a generic Intel processor into a packet-processing engine. DPDK is a must-have for any networking application that requires 10 Gbps or higher line-rate throughput. In virtualized environments, DPDK can be used to accelerate virtual switches (vSwitches) as well as applications and VNFs. DPDK has very strong open source community support via http://dpdk.org, http://openvswitch.org and https://fd.io. For SDN and NFV deployments, DPDK-accelerated Open vSwitch (OVS-DPDK) is a key component for applications that require line-rate performance. The Fast Data project (FD.io), which leverages Vector Packet Processing (VPP) technology from Cisco, takes this a step further with virtual switch and virtual router functionality built on DPDK. This is good news for application developers building high-bandwidth, low-latency, line-rate applications.

Intel QuickAssist Technology

Intel QuickAssist Technology (QAT) provides hardware offload for crypto and compression functions. It is typically implemented in an onboard chipset or an add-in PCIe adapter card, and some Intel system-on-chip (SoC) processors, including Intel Xeon based SoCs, have embedded QAT support. The current generation of Intel QAT supports SR-IOV with up to 32 virtual functions. Each virtual function can be assigned to an individual VNF or VM, bypassing the hypervisor and virtual switch and delivering line-rate performance for crypto and compression applications.
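As a minimal sketch of how a DPDK application like those described above bootstraps itself (assuming DPDK is installed with its pkg-config metadata, hugepages are reserved, and the target ports are bound to a DPDK-compatible driver), the snippet below initializes the Environment Abstraction Layer and reports what it found:

```c
#include <stdio.h>
#include <rte_eal.h>      /* rte_eal_init, rte_eal_cleanup */
#include <rte_lcore.h>    /* rte_lcore_count */
#include <rte_ethdev.h>   /* rte_eth_dev_count_avail */

int main(int argc, char **argv)
{
    /* Parse EAL options (core mask, hugepage directory, PCI devices, ...)
     * and set up hugepage memory, lcores and the probed poll-mode drivers. */
    if (rte_eal_init(argc, argv) < 0) {
        fprintf(stderr, "Cannot initialize the DPDK EAL\n");
        return 1;
    }

    printf("EAL up: %u lcores, %u DPDK-managed ports\n",
           rte_lcore_count(), (unsigned)rte_eth_dev_count_avail());

    /* A real VNF or vswitch would now create mempools, configure rx/tx
     * queues with rte_eth_dev_configure() and enter its polling loop. */

    rte_eal_cleanup();
    return 0;
}
```

Such a program is typically built with the flags reported by pkg-config --cflags --libs libdpdk and launched with EAL arguments such as a core mask and hugepage directory.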

Advantech platforms designed for network service edge virtualization

Advantech understands the need for hardware designs that are optimized for network-specific workloads. Customers planning to deploy SDN and NFV applications can rely on Advantech for application-ready platforms that leverage the features of Intel processors, chipsets and LAN controllers. Each platform is designed around the following key ingredients:

- Latest generation Intel Xeon processors
- Intel VT-x, VT-d, VT-c, AES-NI, TXT, CMT/CAT, DDIO
- Balanced DDR4 memory
- Balanced PCIe lane mapping and I/O pinning
- PCIe I/O with SR-IOV
- Intel QuickAssist Technology
- DPDK
- Next-generation CPU support

Figure 1: Platform Selection Guide

FWA-3260

- Intel Xeon processor D system-on-chip, up to 16 cores with 1.5 MB last-level cache per core
- 4 x DDR4 ECC UDIMMs/RDIMMs, up to 2400 MHz and up to 128 GB
- 4 server-class GbE ports implemented by an Intel i350 Ethernet controller with advanced LAN bypass
- 2 GbE management ports
- 2 x 10GbE SFP+ ports
- 2 Network Mezzanine Card (NMC) bays for PCIe Gen 3 based port expansion with 1GbE, 10GbE and 40GbE ports
- 1 x PCIe x8 full-height/half-length add-on card
- Two 2.5" SATA HDDs/SSDs and two M.2 SSD sockets
- IPMI 2.0 compliant remote management (optional)

FWA-5020

- Single or dual Intel Xeon E5-2600 v4 processor(s), up to 145 W TDP
- DDR4 2400 MHz ECC memory, up to 512 GB (CPU SKU dependent)
- 4 x GbE with LAN bypass, 2 x GbE for management (SKU dependent)
- 2 x 10GbE SFP+ NICs (SKU dependent)
- Up to 4 x NMC (Network Mezzanine Card) slots for a wide range of GbE, 10GbE and 40GbE NMCs with or without advanced LAN bypass
- 2 x 2.5" SATA HDDs/SSDs
- IPMI 2.0 compliant remote management
- Advanced platform reliability and serviceability
- 2 x internal CLC PCIe cards (dual DH8955) supported (SKU dependent)

FWA-6520

- 2 x Intel Xeon E5-2600 v3/v4 processors, up to 145 W TDP
- DDR4 1866/2133 ECC memory, up to 512 GB
- PCIe Gen 3 support
- Up to 8 x NMC (Network Mezzanine Card) slots for a wide range of GbE, 10GbE and 40GbE NMCs with or without advanced LAN bypass
- 2 x PCIe x16 slots supporting FH/HL add-on cards
- 2 x 2.5" removable external SATA HDDs/SSDs
- IPMI 2.0 compliant remote management
- Advanced platform reliability and serviceability