The QLogic 8200 Series is the Adapter of Choice for Converged Data Centers

QLogic 10GbE Converged Network Adapter Outperforms Alternatives in Dell 12G Servers

The QLogic 8200 Series Converged Network Adapter outperforms the alternative adapter in throughput per CPU utilization for both networking and converged traffic while using only a fraction of the host resources, leaving plenty of room for data center servers to efficiently scale applications and virtual servers.

KEY FINDINGS

The QLogic 8200 Series 10Gb Converged Network Adapter offers greater efficiency in scaling (including maximum throughput and overall CPU efficiency) compared to alternative Converged Network Adapters. The QLogic adapter better supports the additional demands of a converged, virtualized data center. Our results demonstrate the following:

- Bandwidth: The QLogic adapter can achieve near line rate (38Gbps) of bidirectional throughput with low (a little more than 10 percent) CPU utilization.
- Networking: The QLogic adapter offers up to 49 percent higher networking throughput efficiency while requiring up to 31 percent less CPU resources compared to the alternative adapter.
- Teaming: In a link aggregation configuration, QLogic's adapter delivers up to 27 percent, or 3.9Gbps, higher throughput as workloads increase, as well as up to 49 percent higher throughput efficiency (Mbps/CPU utilization) compared to the alternative adapter.
- FCoE: For real-world applications (typically 4KB/8KB block sizes), QLogic's adapters outperform the alternative adapter by up to 53.9 percent.

EXECUTIVE SUMMARY

Leading the data center charge in I/O consolidation is Fibre Channel over Ethernet (FCoE). FCoE promises to unify data and storage traffic onto a single wire. As FCoE helps to enable data center consolidation, CPU efficiency related to I/O is emerging as a key factor in maximizing network consolidation. This is true for a couple of reasons:

- Consolidation creates denser server environments, which in turn drive higher throughput requirements for business-critical applications.
- By lowering CPU requirements to process I/O, virtualization ratios can be maximized to consolidate applications on fewer servers, thereby achieving a low total cost of ownership (TCO).

High throughput with low CPU utilization is the key to scaling within next-generation data centers. A Converged Network Adapter can easily accomplish this by offloading I/O processing for IP, iSCSI, and FCoE from the host CPU. At the time this paper was first published, the QLogic 8200 Series Converged Network Adapter was the only Converged Network Adapter to offload Fibre Channel over Ethernet (FCoE), iSCSI, and IP (Ethernet) traffic concurrently.

TEST CONFIGURATIONS AND PROCEDURES

The testing discussed in this paper analyzes and compares QLogic and competitive Converged Network Adapters across a representative selection of Ethernet and FCoE simulations of real-world workloads.

Ethernet Configuration

The industry-leading IxChariot test tool was used to simulate real-world applications and to evaluate each adapter's performance under realistic load conditions. IxChariot was chosen for its ability to accurately assess the performance characteristics of an adapter running within a data center network. Chariot measurements were taken to determine the total throughput (Mbps) across increasing workloads (threads) and the percentage of CPU required. Throughput was then averaged across each thread count to estimate the CPU efficiency (Mbps/CPU). CPU efficiency provides an excellent estimate of the overhead imposed on the processor by running significant networking traffic and handling network interrupts. During these tests, the workloads were increased by adding threads to reach a maximum of 20. As a result, these tests provide significant insight into each adapter's potential networking performance and the remaining headroom to efficiently scale. Appendix A provides additional information regarding the test configuration (such as servers and driver versions).

The testing demonstrated the QLogic adapter's performance advantages, which provide significantly greater CPU efficiency and an enhanced ability to scale in a virtualized environment compared to the alternative adapter.

Ethernet Test Procedure

1. A QLogic dual-port PCIe 2.0 to 10Gb Ethernet Converged Network Adapter was installed on the test server using the latest released driver.
2. IxChariot was set up with two End Point (EP) agents installed as remote agents, with an Ethernet switch between them. Each remote agent was set up to create and measure network traffic.
3. An IxChariot console was then attached to each EP to indicate what to transmit, when to transmit, and what data to collect. One End Point was designated as a client and the other as a server.
   - Dual-port NIC performance testing: each port on the server was configured and connected to two separate client servers.
   - Teaming/failover testing: the dual-port QLogic adapter within the server was configured for teaming, with port 0 as the active primary port and port 1 as the secondary failover port.
4. The type of data transmission to model was selected. The type chosen was the default high-performance throughput script.
5. The testing was run at standard MTU size with multiple threads. The NIC dual-port bi-directional scenario utilized both ports simultaneously to measure the total throughput capabilities and the CPU efficiency of a dual-port adapter. The failover scenario used bi-directional traffic across one active port while the second port was in active standby mode.
6. To simulate the failover, the cable was pulled from port 0 on the server during data transmission while the test script was running.
7. These steps were repeated for the other Converged Network Adapter.

FCoE Configuration

The IOmeter I/O test tool was used to benchmark the QLogic 8200 Series 10Gb Converged Network Adapter against the alternative. The tool was chosen because of its ability to simulate real-world applications and to evaluate adapter and system performance under realistic storage workload conditions.
The test accurately assesses the performance characteristics of an adapter running on a converged network. IOmeter measurements were taken for the total number of transactions (IOPS), the total throughput (MBps) across real-world applications (block sizes of 4KB and 8KB), and the percentage of CPU required to drive the overall transactions. IOPS were then normalized by the CPU usage over the testing period to calculate the CPU efficiency (IOPS/CPU). A higher number indicates more efficient use of CPU resources for I/O. Appendix A provides additional information regarding the test configuration (such as servers and driver versions).

Figure 1. Networking Test Configuration
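The efficiency metrics used throughout this paper are simple ratios of delivered throughput to the host CPU consumed while delivering it: Mbps per percent of CPU utilization for the networking tests, and IOPS per percent of CPU utilization for the storage tests. The short Python sketch below illustrates that arithmetic; the adapter samples, thread counts, and values are hypothetical placeholders, not the measured data from this evaluation.

```python
# Illustrative only: hypothetical measurements, not the results reported in this paper.
# CPU efficiency as defined in the text:
#   networking: Mbps of throughput per percent of CPU utilization
#   storage   : IOPS per percent of CPU utilization

def cpu_efficiency(throughput, cpu_percent):
    """Return throughput normalized by CPU utilization (e.g., Mbps/CPU% or IOPS/CPU%)."""
    return throughput / cpu_percent

# Hypothetical per-thread-count samples for one adapter: (threads, Mbps, CPU %)
ethernet_samples = [(2, 9400, 4.1), (8, 18600, 7.9), (20, 36500, 14.2)]

for threads, mbps, cpu in ethernet_samples:
    print(f"{threads:>2} threads: {cpu_efficiency(mbps, cpu):,.0f} Mbps per CPU%")

# Hypothetical FCoE sample: 8KB sequential read at queue depth 32
iops, cpu = 210_000, 12.5
print(f"8KB Seq RD: {cpu_efficiency(iops, cpu):,.0f} IOPS per CPU%")
```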

FCoE Test Procedure

1. A QLogic dual-port PCIe 2.0 to 10Gb Ethernet Converged Network Adapter was installed on the test server using the latest released driver.
2. The adapter was connected to a Cisco Nexus 5020 switch and then to two RamSans configured with a total of eight LUNs.
3. IOmeter was used to test FCoE workload scalability. It was configured at a queue depth of 32 with a total of 16 workers.
4. To help simulate real-world environments, tests were run at standard storage application block sizes of 4KB and 8KB for sequential read (Seq RD), sequential write (Seq WR), and sequential read/write (Seq RW).
5. These steps were then repeated for the alternative Converged Network Adapter.

The test results demonstrate the superiority of the QLogic offload engine in relieving CPUs of protocol processing tasks so that servers can remain highly scalable while providing high-performance data and storage applications the I/O bandwidth they require across virtualized networks.

Figure 2. FCoE Test Configuration

TEST RESULTS

The test procedures were specifically designed to test data and storage I/O running over a converged 10Gb Ethernet network. In each test case, specific attention was given to CPU utilization in relation to I/O performance. These steps were taken to create the kind of environment in which an enterprise-class Converged Network Adapter must perform; for example, transferring large amounts of data for an application with a high bandwidth demand while the CPUs of the servers are being heavily taxed by enterprise applications and virtual machines (VMs).

Ethernet Test Results

The test results demonstrated that the QLogic 8200 Series Adapters scaled more efficiently in an enterprise data center with real-world application workloads compared to the alternative Converged Network Adapters. For fully utilized NIC performance testing, the dual-port bi-directional test configuration ran traffic at full line rate across both adapter ports simultaneously. The same was true for a teaming configuration simulating a high-availability environment with failover activated: a dual-port configuration with bi-directional traffic running at full line-rate speeds through increasing workloads.

Maximum Bandwidth with Low CPU

The QLogic 8200 Series Converged Network Adapter is capable of delivering near line-rate performance from a dual-port, 10Gb adapter over a PCI Express Gen2 bus. This equates to more than 36Gbps (given a marginal loss for bus latencies and encoding overhead). When compared to alternatives, QLogic's dual-port 8200 Series Adapter delivers more than 36Gbps of aggregate bidirectional throughput when sending 1500-byte frames and uses approximately 14 percent of the host server's CPU (Figure 3). The total bandwidth and CPU utilization were captured for both adapters. The results show that QLogic has the advantage over the alternative; QLogic improves server performance by providing line-rate network bandwidth while limiting CPU overhead.

Figure 3. Dual-port Bi-directional Throughput and CPU (approximately 14 percent CPU for QLogic versus 22 percent for the alternative)
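IOmeter is driven through its own configuration files and GUI, so the snippet below is only a hedged Python sketch of the workload matrix described in the FCoE Test Procedure above (queue depth 32, 16 workers, 4KB and 8KB blocks, sequential read, write, and read/write) and of how throughput in MBps relates to IOPS for a given block size. None of the numbers are measured results.

```python
# Sketch of the FCoE test matrix described in the procedure above.
# IOmeter was configured externally; this only enumerates the combinations
# and shows the IOPS <-> MBps relationship for a given block size.

from itertools import product

QUEUE_DEPTH = 32
WORKERS = 16
BLOCK_SIZES_KB = [4, 8]
ACCESS_PATTERNS = ["Seq RD", "Seq WR", "Seq RW"]

def mbps_from_iops(iops, block_kb):
    """Throughput in MBps for a given IOPS rate and block size in KB (1MB = 1024KB here)."""
    return iops * block_kb / 1024

for block_kb, pattern in product(BLOCK_SIZES_KB, ACCESS_PATTERNS):
    print(f"{block_kb}KB {pattern}: queue depth {QUEUE_DEPTH}, {WORKERS} workers")

# Unit conversion with a purely hypothetical IOPS figure:
print(f"{mbps_from_iops(100_000, 8):.0f} MBps at 100,000 IOPS with 8KB blocks")
```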

NIC AND TEAMING PERFORMANCE

NIC Dual-port Bi-directional Case

First, the overall Ethernet performance of the adapters was evaluated. To do this, a real-world test scenario was created using both ports of a dual-port adapter, with bi-directional data sent through each port simultaneously. The testing verified that the QLogic adapter again holds a significant advantage in CPU efficiency, averaging more than a 31 percent advantage over the alternative between 2 and 20 threads (Figure 4).

Figure 4. CPU Utilization

This becomes even more evident when looking at the throughput normalized by the percentage of CPU required to drive the transactions. To calculate this number, the total throughput transmitted and received by both ports was divided by the average CPU percentage required to drive the I/O. The results show the QLogic adapter's advantage over the alternative adapter in delivering high throughput rates while minimizing CPU utilization (Figure 5). The QLogic adapter holds an average efficiency advantage of 49 percent over the alternative adapter.

Figure 5. Dual-port Bi-directional CPU Efficiencies

CPU effectiveness with regard to I/O will emerge as a key factor in maximizing data center efficiencies as well as consolidation efforts. This holds true for a couple of reasons. First, consolidation creates denser server environments, which in turn drive higher throughput requirements for business-critical applications. Second, by lowering CPU requirements to process I/O, virtualization ratios can be maximized to fully optimize servers. The QLogic 8200 Series Converged Network Adapters provide the scalability advantages that will be required by future enterprise-class data centers.

NIC Teaming Failover Case

To test the ability of an adapter to meet the high-availability and high-throughput requirements of an enterprise data center, measurements were taken for the total throughput of all data sent and received across an increasing workload. During this time, a failover state was replicated to demonstrate a high-availability environment. This data was then averaged across the CPU usage consumed over the testing period to determine the adapter's efficiency. Average throughput per overall CPU usage provides an excellent estimate of the overhead imposed by the adapter while running the benchmark and handling network interrupts. In addition, it serves to demonstrate each adapter's networking performance capabilities and potential to efficiently scale. During the testing, the QLogic 8200 Series Adapter outperformed the alternative adapter by an average of up to 27 percent, or 3.9Gbps, over increasing workloads from 4 to 20 threads with bi-directional send and receive I/O (Figure 6).

Figure 6. Bi-directional I/O Throughput with Increasing Threads
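The advantage figures quoted in this section, such as "up to 27 percent or 3.9Gbps higher throughput" and "an average efficiency advantage of 49 percent," are relative comparisons between the two adapters at matching thread counts. The Python sketch below shows that arithmetic on placeholder values; the numbers are not the data behind Figures 4 through 6 and are chosen only so the last row reproduces the quoted 3.9Gbps (27 percent) delta.

```python
# Illustrative comparison of two adapters across increasing workloads.
# All values are hypothetical placeholders, not measured results.

# thread count -> aggregate bi-directional throughput in Mbps
adapter_a = {4: 15_000, 8: 16_500, 12: 17_200, 16: 17_800, 20: 18_200}
adapter_b = {4: 13_000, 8: 13_800, 12: 14_100, 16: 14_300, 20: 14_300}

for threads in sorted(adapter_a):
    a, b = adapter_a[threads], adapter_b[threads]
    delta_gbps = (a - b) / 1000        # absolute advantage in Gbps
    delta_pct = (a / b - 1) * 100      # relative advantage in percent
    print(f"{threads:>2} threads: +{delta_gbps:.1f}Gbps (+{delta_pct:.0f}%)")
```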

In the bi-directional NIC failover test, the QLogic adapter required substantially less CPU than the alternative adapter to drive more data. In fact, the QLogic adapter used, on average, a little more than five percent of the CPU, 18 percent less than the amount the alternative adapter required (Figure 7).

Figure 7. CPU Usage in a Teamed I/O Environment

Due to their low overhead, QLogic adapters leave the host server with plenty of CPU resources for the server to efficiently scale. This is imperative in next-generation data centers that run converged networks and/or VMs. This becomes even more evident when considering the throughput normalized by the percentage of CPU required to drive the I/O (Figure 8).

Figure 8. Bi-directional I/O CPU Efficiency with Increasing Threads

FCoE Test Results

To evaluate FCoE storage I/O performance, QLogic created a real-world test scenario to verify an adapter's ability to deliver high throughput rates while minimizing CPU utilization. Using this standard testing philosophy, QLogic tested its 8200 Series Converged Network Adapter against a leading alternative Converged Network Adapter within a Windows Server 2008 environment, with a focus on comparing IOPS relative to CPU utilization. Measurements were taken for throughput (MBps) across real-world block sizes of 4KB and 8KB, and the percentage of CPU utilization required to drive the I/O was compared. Test results are reported as MBps divided by the percentage of CPU utilization to obtain the CPU efficiency. A higher number indicates more efficient use of CPU resources.

As shown in Figure 9, a Converged Network Adapter from each company was used to obtain performance results for sequential read, sequential write, and sequential read/write I/O operations. Block sizes of 4KB and 8KB, which are typical of Exchange and database workloads, were measured, and the results show that the QLogic 8200 Series Adapter significantly outperforms the alternative adapter. The data confirms that with sequential 8KB read/write block sizes, which simulate application workloads for Oracle and Microsoft Exchange 2007, the QLogic 8200 Converged Network Adapter holds a 54 percent CPU efficiency advantage over the alternative CNA.

Figure 9. Dual-port FCoE Throughput Divided by CPU Percentage

Figure 10. FCoE CPU Utilization
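The 54 percent figure above is a ratio of the two adapters' CPU-efficiency values (MBps per percent of CPU) for the same workload. Below is a minimal sketch of that calculation; the MBps and CPU inputs are made up, chosen only so the output reproduces the quoted ratio, and are not the measured results behind Figures 9 and 10.

```python
# Hypothetical 8KB sequential read/write results for two adapters (not measured data).
qlogic = {"mbps": 1150.0, "cpu_percent": 9.5}
alternative = {"mbps": 1100.0, "cpu_percent": 14.0}

def efficiency(result):
    """CPU efficiency as defined in the text: MBps per percent of CPU utilization."""
    return result["mbps"] / result["cpu_percent"]

advantage_pct = (efficiency(qlogic) / efficiency(alternative) - 1) * 100
print(f"CPU efficiency advantage: {advantage_pct:.0f}%")
```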

SUMMARY AND CONCLUSION

Consolidation onto a single unified network, Fibre Channel over Ethernet, reduces physical complexity, lowers material costs, and simplifies operations. Ultimately, though, the most important benefit of a unified network is the gained efficiency in resource utilization. This is true as networks are consolidated onto a single protocol supporting all enterprise requirements, with backward compatibility across existing hardware and management applications. However, if the wrong Converged Network Adapter is chosen as the standard, efficiency for the high throughput demands of applications, the density of virtual servers, and overall consolidation efforts will fall short.

As leading CIOs and data center managers strive to achieve maximum efficiency through resource utilization, QLogic delivers high-performance I/O solutions that minimize CPU resources, allowing efficient scaling of the entire data center. This is especially important for virtualized environments that support increasingly powerful applications as well as growing amounts of data. QLogic furnishes key infrastructure components with the greatest offload capabilities to provide high-performance data and storage applications the I/O bandwidth they require and to ensure that server resources are available when needed. Data center fabrics based on QLogic technology will increase enterprise IT efficiencies derived from the ability to expand the number of simultaneous applications or virtualized operating systems that can be run on a given platform.

APPENDIX A

NIC Testing Configuration
Server Configuration: Nehalem-based server, DL380 G6, dual quad-core 2.93GHz, 12GB RAM
Operating System: Windows Server 2008 R2
Converged Network Adapters: QLogic Converged Network Adapter -SR; Driver: 4.2.15; Firmware: 4.7.31
Network Test Tool: IxChariot 7.0 build 114; default Chariot high-performance throughput script

FCoE Testing Configuration
Server Configuration: Nehalem-based server, DL380 G6, dual quad-core 2.8GHz, 24GB RAM
Operating System: Windows Server 2008 R2
Converged Network Adapters: QLogic Converged Network Adapter -SR; Driver: 9.1.9.15; Firmware: 4.7.31
Test Tool: IOmeter 2006.07.27
Storage Configuration: RamSan 325; Firmware: 3.2.6-p6
Switch: Cisco Nexus 5020; BIOS version 1.2.0

Corporate Headquarters: Cavium, Inc. | 2315 N. First Street | San Jose, CA 95131 | 408-943-7100
International Offices: UK | Ireland | Germany | France | India | Japan | China | Hong Kong | Singapore | Taiwan | Israel

© 2017 QLogic Corporation. QLogic Corporation is a wholly owned subsidiary of Cavium, Inc. All rights reserved worldwide. QLogic and the QLogic logo are registered trademarks of QLogic Corporation. Cisco and Cisco Nexus are trademarks, registered trademarks, and/or service marks of Cisco Systems, Inc. IxChariot is a registered trademark of Ixia. PCIe and PCI Express are registered trademarks of PCI-SIG Corporation. RamSan is a registered trademark of Texas Memory Systems. Microsoft Exchange and Windows Server are registered trademarks of Microsoft Corporation. Oracle is a registered trademark of Oracle Corporation. All other brand and product names are trademarks or registered trademarks of their respective owners. This document is provided for informational purposes only and may contain errors. QLogic reserves the right, without notice, to make changes to this document or in product design or specifications.
QLogic disclaims any warranty of any kind, expressed or implied, and does not guarantee that any results or performance described in the document will be achieved by you. All statements regarding QLogic's future direction and intent are subject to change or withdrawal without notice and represent goals and objectives only.