
Break Through The TCP/IP Bottleneck With iWARP

Using a high-bandwidth, low-latency network solution in the same network infrastructure provides good insurance for next-generation networks.

Sweta Bhatt, Prashant Patel | ED Online ID #21970 | October 22, 2009
Copyright 2006 Penton Media, Inc. All rights reserved. Printing of this document is for personal use only.

The online economy, particularly e-business, entertainment, and collaboration, continues to increase the amount of Internet traffic to and from enterprise servers dramatically and rapidly. Most of this data passes through the transmission control protocol/Internet protocol (TCP/IP) stack and Ethernet controllers. As a result, Ethernet controllers are handling heavy network traffic, which requires more system resources to process network packets. The CPU load increases linearly as a function of network packets processed, diminishing the CPU's availability for other applications. Because TCP/IP processing consumes a significant share of host CPU cycles, a heavy TCP/IP load may leave few system resources available for other applications. Techniques for reducing the demand on the CPU and relieving this system bottleneck, though, are available.

iWARP SOLUTIONS
Although researchers have proposed many mechanisms and theories for parallel systems, only a few have resulted in working commercial implementations. One of the latest to enter the scene is the Internet Wide Area RDMA Protocol, or iWARP. (The name echoes the earlier iWarp parallel-computing project from Carnegie Mellon University and Intel Corp., an experimental system that integrated a very long instruction word (VLIW) processor and a sophisticated fine-grained communication engine on a single chip; the protocol itself is unrelated.) iWARP is a suite of wire protocols comprising RDMAP (Remote Direct Memory Access Protocol) and DDP (Direct Data Placement). The iWARP protocol suite may be layered above marker PDU aligned framing (MPA) and TCP, or over the Stream Control Transmission Protocol (SCTP) or other transport protocols (Fig. 1).

The RDMA Consortium released the iWARP extensions to TCP/IP in October 2002, defining a standard for zero-copy data transfer over TCP/IP. Together, these extensions address the three major sources of networking overhead: transport (TCP/IP) processing, intermediate buffer copies, and application context switches, which collectively account for nearly 100% of the CPU overhead attributed to networking (see the table).
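Because iWARP runs over an ordinary TCP connection, connection setup from the application's point of view looks much like sockets. The following sketch uses the OFED librdmacm (rdma_cm) API, the usual way iWARP adapters are driven on Linux; the host name and port are placeholders and error handling is trimmed, so treat it as an illustration of the layering rather than the article's own example.

```c
/* Sketch: active-side iWARP connection setup with librdmacm.
 * Build: cc iwarp_connect.c -lrdmacm -libverbs
 * "server.example" and port "7471" are placeholders.
 */
#include <stdio.h>
#include <string.h>
#include <rdma/rdma_cma.h>
#include <rdma/rdma_verbs.h>

int main(void)
{
    struct rdma_addrinfo hints, *res = NULL;
    memset(&hints, 0, sizeof(hints));
    hints.ai_port_space = RDMA_PS_TCP;     /* TCP port space: RDMA layered on TCP */

    if (rdma_getaddrinfo("server.example", "7471", &hints, &res)) {
        perror("rdma_getaddrinfo");
        return 1;
    }

    struct ibv_qp_init_attr qp_attr;
    memset(&qp_attr, 0, sizeof(qp_attr));
    qp_attr.cap.max_send_wr = qp_attr.cap.max_recv_wr = 16;
    qp_attr.cap.max_send_sge = qp_attr.cap.max_recv_sge = 1;

    /* Creates the endpoint and its queue pair on the RDMA-capable adapter. */
    struct rdma_cm_id *id = NULL;
    if (rdma_create_ep(&id, res, NULL, &qp_attr)) {
        perror("rdma_create_ep");
        return 1;
    }

    /* The MPA/TCP handshake is carried out by the adapter and its stack. */
    if (rdma_connect(id, NULL)) {
        perror("rdma_connect");
        return 1;
    }
    printf("connected; RDMA work requests can now be posted from user space\n");

    rdma_disconnect(id);
    rdma_destroy_ep(id);
    rdma_freeaddrinfo(res);
    return 0;
}
```

The RDMA_PS_TCP port space is what binds the endpoint to TCP, mirroring the MPA-over-TCP layering of Fig. 1.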

A kernel implementation of the TCP stack has several bottlenecks, so a few vendors now implement TCP in hardware. Because packet losses are rare in tightly coupled network environments, software can handle TCP's infrequent error-recovery mechanisms, while hardware on the network interface card (NIC) handles the more frequently performed communication operations. This additional hardware is known as a TCP offload engine (TOE).

The iWARP extensions use advanced techniques to reduce CPU overhead, memory-bandwidth utilization, and latency. This is accomplished through a combination of offloading TCP/IP processing from the CPU, eliminating unnecessary buffering, and dramatically reducing the number of operating-system (OS) calls and context switches. Data management and network-protocol processing are thus offloaded to an Ethernet adapter instead of being handled by the kernel's TCP/IP stack.

iWARP COMPONENTS
Offloading TCP/IP (transport) processing: In conventional Ethernet, the TCP/IP stack is a software implementation, putting a tremendous load on the host server's CPU. Transport processing includes tasks such as updating TCP context, implementing the required TCP timers, segmenting and reassembling the payload, buffer management, resource-intensive buffer copies, and interrupt processing.

The CPU load increases linearly as a function of the network packets processed. With the tenfold increase in performance from 1-Gigabit to 10-Gigabit Ethernet, packet processing and the CPU overhead related to transport processing increase up to tenfold as well. At that rate, transport processing will cripple the CPU well before the link reaches Ethernet's maximum throughput.

The iWARP extensions enable the Ethernet adapter to offload transport processing from the CPU to specialized hardware, eliminating the 40% of networking-related CPU overhead attributed to transport processing (Fig. 2). The transport offload can be implemented by a standalone TOE or be embedded in an accelerated Ethernet adapter that supports the other iWARP accelerations.

Moving transport processing to an adapter also removes a second source of overhead: intermediate TCP/IP protocol-stack buffer copies. Moving these copies out of system memory and into adapter memory saves system memory bandwidth and lowers latency.

RDMA techniques eliminate buffer copies: Repurposed for Internet protocols by the RDMA Consortium, Remote DMA (RDMA) and Direct Data Placement (DDP) techniques were formalized as part of the iWARP extensions. RDMA embeds information into each packet that identifies the application memory buffer with which the packet is associated. This enables the payload to be placed directly in the destination application's buffer, even when packets arrive out of order.

Data can now move from one server to another without the intermediate buffer copies traditionally required to gather a complete message; this is sometimes called the zero-copy model. Together, RDMA and DDP enable an accelerated Ethernet adapter to read from and write to application memory directly, eliminating buffer copies to intermediate layers. RDMA and DDP eliminate the 20% of networking-related CPU overhead attributed to buffer copies and free the memory bandwidth consumed by intermediate application buffer copies.
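The direct data placement described above is exposed to applications through memory registration: the buffer is pinned and the adapter is handed explicit access rights to it, so arriving payloads can be written straight into application memory. A minimal sketch using the libibverbs API follows; the device choice, buffer size, and access flags are illustrative assumptions, not taken from the article.

```c
/* Minimal sketch: register an application buffer for direct placement.
 * Assumes an RDMA-capable (e.g., iWARP) adapter and libibverbs installed.
 * Build: cc reg.c -libverbs
 */
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void)
{
    struct ibv_device **devs = ibv_get_device_list(NULL);
    if (!devs || !devs[0]) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* The application owns this buffer; no intermediate copy is needed. */
    size_t len = 1 << 20;                 /* 1 MB, arbitrary for the example */
    void *buf = malloc(len);

    /* Registration pins the memory and grants the adapter permission to
     * read/write it directly (zero-copy placement, even out of order). */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_WRITE |
                                   IBV_ACCESS_REMOTE_READ);
    if (!mr) {
        perror("ibv_reg_mr");
        return 1;
    }

    /* The rkey is what a remote peer quotes in its RDMA writes/reads. */
    printf("registered %zu bytes, lkey=0x%x rkey=0x%x\n", len, mr->lkey, mr->rkey);

    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```

The rkey returned here is what a peer quotes when it targets this buffer with RDMA writes, which is how the adapter knows where to place each arriving payload without a copy.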
Avoiding application context switches (OS bypass): The third and somewhat less familiar source of overhead, context switching, contributes significantly to overhead and latency in applications. Traditionally, when an application issues commands to the I/O adapter, the commands are passed through most layers of the application/OS stack, and passing a command from the application to the OS requires a CPU-intensive context switch. When executing a context switch, the OS saves the application context in system memory, including all of the CPU's general-purpose registers, floating-point registers, stack pointer, and instruction pointer, plus all of the memory-management-unit state associated with the application's memory-access rights. The OS context is then established by loading a similar set of items for the OS from system memory.

The iWARP extensions implement OS bypass (user-level direct access), enabling an application executing in user space to post commands directly to the network adapter (Fig. 4); a minimal example of such a user-level post appears below. This eliminates expensive calls into the OS, dramatically reducing application context switches. An accelerated Ethernet adapter handles tasks typically performed by the OS. Such adapters are more complex than traditional non-accelerated NICs, but they eliminate the final 40% of CPU overhead related to networking.
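To make the OS-bypass point concrete, the sketch below posts an RDMA write and reaps its completion using libibverbs calls that execute entirely in user space; no system call or context switch sits on this data path. The queue pair, completion queue, registered buffer, and remote address/rkey are assumed to have been exchanged during connection setup and are placeholders here.

```c
/* Sketch of the user-level data path: post an RDMA write and reap its
 * completion entirely from user space (libibverbs verbs API).
 * 'qp', 'cq', 'mr', 'remote_addr', and 'rkey' are assumed to come from an
 * already established iWARP connection; they are placeholders here.
 */
#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

int rdma_write_and_wait(struct ibv_qp *qp, struct ibv_cq *cq,
                        struct ibv_mr *mr, void *buf, uint32_t len,
                        uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,   /* local application buffer (registered) */
        .length = len,
        .lkey   = mr->lkey,
    };

    struct ibv_send_wr wr, *bad_wr = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.wr_id               = 1;
    wr.opcode              = IBV_WR_RDMA_WRITE;   /* zero-copy remote placement */
    wr.send_flags          = IBV_SEND_SIGNALED;   /* ask for a completion entry */
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.wr.rdma.remote_addr = remote_addr;         /* peer's registered buffer */
    wr.wr.rdma.rkey        = rkey;

    /* Both calls execute in user space; the adapter does the rest. */
    if (ibv_post_send(qp, &wr, &bad_wr))
        return -1;

    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) == 0)
        ;                                         /* busy-poll the completion queue */

    return (wc.status == IBV_WC_SUCCESS) ? 0 : -1;
}
```

In a conventional socket-based stack, each of these steps would involve a system call and the context switches described above.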

POTENTIAL APPLICATIONS
Clustered servers and blades depend on high-bandwidth, low-latency interconnects to aggregate the enormous processing power of dozens of nodes and keep their I/O needs serviced. For these applications, iWARP is advantageous because it delivers:

- Predictable, sustained, scalable performance, even in multicore, multiprocessor clusters
- A single, self-provisioning adapter that supports the common clustering MPIs: HP MPI, Intel MPI, Scali MPI, and MVAPICH2
- Modern low-latency interconnect technology in an ultra-low-power, high-bandwidth Ethernet package, with flexibility, cost, and industry-standard management benefits, plus exceptional processing power per square foot for clustered applications

For data-networking applications, iWARP-based accelerated Ethernet adapters offer full-function data-center NIC capabilities that boost performance, improve power efficiency, and more fully utilize data-center assets, with industry-leading throughput, latency, and power consumption. In particular, when it comes to network-overhead processing, an accelerated adapter leaves roughly 95% of the CPU available to the application and operating system. Hardware-class performance for virtualized applications ensures that the CPU cycles reclaimed from network processing are not lost to software-virtualization overhead.

For storage vendors, iWARP will become a standard feature of network-attached storage (NAS) and Internet Small Computer System Interface (iSCSI) storage area networks (SANs). NAS and iSER-based (iSCSI Extensions for RDMA) storage networks using accelerated Ethernet adapters deliver the highest-performance, lowest-overhead storage solutions at any given wire rate.

At 10 Gbits/s, Ethernet-based storage networks offer a viable alternative to a Fibre Channel SAN. A single network interface supports both block and file protocols, with high-throughput block- and file-level storage access exceeding that of 8-Gbit Fibre Channel. In addition to iSCSI, iWARP supports a wide range of other storage protocols, such as the Network File System (NFS) and the Common Internet File System (CIFS). This gives data centers the opportunity to reap the increased productivity and lower total cost of ownership of a standards-based technology.

Like InfiniBand, iWARP does not have a standard programming interface, only a set of verbs. Unlike the InfiniBand Architecture (IBA), iWARP offers only reliable connected communication, because that is the only service TCP and SCTP provide. The iWARP specification also omits some special features of IBA, such as remote atomic operations. In all, iWARP offers the basics of InfiniBand applied to Ethernet, supporting both legacy software and next-generation applications.
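Because the same verbs surface is used for both fabrics on Linux (via the OFED stack), an application can discover at run time whether a device speaks iWARP or InfiniBand and then drive it with identical queue-pair code. The short enumeration sketch below is an illustration using the libibverbs device list; the transport-type constants come from that library, not from the article.

```c
/* Sketch: enumerate RDMA devices and report which transport each uses.
 * The same verbs interface (queue pairs, memory registration, post/poll)
 * is then used identically over iWARP or InfiniBand providers.
 */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int n = 0;
    struct ibv_device **devs = ibv_get_device_list(&n);
    if (!devs) {
        fprintf(stderr, "failed to query RDMA devices\n");
        return 1;
    }

    for (int i = 0; i < n; i++) {
        const char *transport =
            (devs[i]->transport_type == IBV_TRANSPORT_IWARP) ? "iWARP (RDMA over TCP)" :
            (devs[i]->transport_type == IBV_TRANSPORT_IB)    ? "InfiniBand"            :
                                                               "other/unknown";
        printf("%-16s %s\n", ibv_get_device_name(devs[i]), transport);
    }

    ibv_free_device_list(devs);
    return 0;
}
```

On a system with both adapter types installed, the registration and post/poll code from the earlier sketches runs unchanged over either device.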


