6WINDGate. White Paper. Packet Processing Software for Wireless Infrastructure

Packet Processing Software for Wireless Infrastructure
Last Update: v1.0 - January 2011

Performance Challenges for Wireless Networks

As advanced services proliferate and video consumes an ever-increasing share of wireless network capacity, the requirements for high-performance processing of network traffic will continue to grow dramatically. Each piece of equipment in the network must achieve higher levels of packet processing performance. At the same time, the equipment must be designed to meet challenging power, cost and schedule requirements.

This white paper discusses how specialized software, designed for high-performance processing of network packets and optimized for multicore processors, enables system designers to meet the conflicting goals of high traffic rates, low system power and minimum system cost. The explanation is illustrated using real-world examples of 4G equipment based on multicore Intel Architecture (IA) platforms and will leave readers with a good understanding of how to use advanced multicore packet processing techniques effectively in next-generation networking equipment.

Designers of 4G telecom infrastructure products, whether LTE or WiMAX, face challenging performance requirements that cannot be addressed with the same techniques that worked for 2G and 3G equipment.

Core network traffic doubles every year

Driven by high-bandwidth Internet applications, the total traffic in the core network is growing at over 100% per year, so service providers expect individual network elements such as packet gateways to provide at least a corresponding increase in bandwidth. At the same time, telecom equipment is increasingly deployed in commercial and outdoor environments without forced-air cooling, placing severe restrictions on the number of high-performance processor subsystems that can be used.

Finally, equipment suppliers operate under ever more challenging cost constraints. These apply both to CAPEX, since low product cost is essential to support worldwide deployments of 4G networks, and to OPEX, where electrical power, both to run the equipment and for cooling, is a major contributor to the calculation of overall Total Cost of Ownership (TCO).

To be successful, developers of 4G networking equipment must deliver solutions that achieve maximum throughput for tomorrow's network traffic patterns (dominated by video and data), while minimizing system-level power consumption and cost.

Packet Processing Software Solutions

Packet processing fundamentals

For 4G networks, 3GPP has specified a flat IP-based network architecture (SAE: System Architecture Evolution) with the goal of efficiently supporting massive usage of IP services. As a consequence, the network architecture is much simpler than existing architectures such as 3G. However, as all the services (data, voice, video and so on) use IP packets, processing these packets efficiently becomes critical to ensure LTE system performance.

On top of the IP protocol itself (actually the two IP protocols, as the SAE architecture supports both IPv4 and IPv6), a large number of individual protocols have to be implemented. These include low-level protocols such as IPsec (Internet Protocol Security), ROHC (Robust Header Compression) and VLAN (Virtual LAN).

Within an overall 4G network, a number of protocols support communication between individual subsystems. For example, GTP (GPRS Tunneling Protocol) carries user data via IP tunnels between a Serving Gateway (SGW) and a base station (eNodeB). Similarly, SCTP (Stream Control Transmission Protocol) carries signaling between the Mobility Management Entity (MME) and the eNodeB. Likewise, IPinIP, GRE (Generic Routing Encapsulation) or GTP-U provide tunnel connections from the SGW to the Packet Gateway (PGW). Many more protocols are used throughout the network.

Differentiating the services is also critical. IP QoS is required to prioritize real-time traffic over pure data traffic. Similarly, packet inspection identifies user traffic so that a better service can be provided to specific users and/or applications.

All these protocols are encapsulated in IP packets. Starting from the layer 2 protocols, packet processing software has to analyze successive encapsulated headers as fast as possible. The critical performance challenge for 4G networking equipment is to process these IP packets at the highest possible throughput.
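To make the idea of analyzing successive encapsulated headers concrete, here is a minimal Python sketch that walks one frame from Ethernet through an optional VLAN tag, IPv4 and UDP to a GTP-U header. It is an illustration only, not 6WINDGate code: the function names `parse_headers` and `build_sample_frame` are hypothetical, only the fixed 8-byte GTP-U header is handled, and a production fast path would do this in optimized native code with full bounds checking.

```python
import struct

ETH_P_8021Q = 0x8100   # EtherType of an 802.1Q VLAN tag
ETH_P_IPV4 = 0x0800    # EtherType of IPv4
IPPROTO_UDP = 17       # IPv4 protocol number for UDP
GTPU_PORT = 2152       # registered UDP port for GTP-U


def parse_headers(frame: bytes) -> dict:
    """Walk Ethernet -> optional VLAN -> IPv4 -> UDP -> GTP-U (illustrative only)."""
    info = {}
    offset = 12  # skip destination and source MAC addresses
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if ethertype == ETH_P_8021Q:
        tci, ethertype = struct.unpack_from("!HH", frame, offset)
        info["vlan_id"] = tci & 0x0FFF  # low 12 bits carry the VLAN ID
        offset += 4
    if ethertype != ETH_P_IPV4:
        return info
    header_len = (frame[offset] & 0x0F) * 4  # IHL is counted in 32-bit words
    protocol = frame[offset + 9]
    info["ip_src"] = ".".join(str(b) for b in frame[offset + 12:offset + 16])
    info["ip_dst"] = ".".join(str(b) for b in frame[offset + 16:offset + 20])
    offset += header_len
    if protocol != IPPROTO_UDP:
        return info
    _sport, dport = struct.unpack_from("!HH", frame, offset)
    offset += 8  # fixed UDP header size
    if dport != GTPU_PORT:
        return info
    _flags, msg_type, _length, teid = struct.unpack_from("!BBHI", frame, offset)
    info["gtpu_msg_type"] = msg_type
    info["gtpu_teid"] = teid  # tunnel endpoint identifier
    return info


def build_sample_frame(vlan_id: int = 100, teid: int = 0x1234) -> bytes:
    """Build a synthetic VLAN-tagged GTP-U frame; addresses and lengths are made up."""
    eth = bytes(12) + struct.pack("!HHH", ETH_P_8021Q, vlan_id, ETH_P_IPV4)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 36, 0, 0, 64, IPPROTO_UDP, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    udp = struct.pack("!HHHH", GTPU_PORT, GTPU_PORT, 16, 0)
    gtpu = struct.pack("!BBHI", 0x30, 0xFF, 0, teid)  # version 1, G-PDU message
    return eth + ip + udp + gtpu


if __name__ == "__main__":
    print(parse_headers(build_sample_frame()))
```

Each layer only tells the parser where the next one starts, which is why the per-packet cost grows with the depth of encapsulation.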
In general, the designer's objective is to perform this processing fast enough that the throughput of the equipment is limited, not by the packet processing performance, but by the speed of the physical network connection, typically 10Gb/s, 40Gb/s or, soon, 100Gb/s. If the processing throughput matches the speed of the network, the system is said to be performing at wire-speed, maximizing the efficiency of the equipment.

Over the past few years, developers of high-end processors migrated to multicore architectures in order to meet never-ending needs for increased performance in networking equipment and a constant evolution of customized protocols. The traditional processor design approach of continually increasing clock frequencies in order to boost performance led to prohibitive processor power consumption, since dynamic power rises with the clock frequency and with the square of the supply voltage, which generally has to increase along with the frequency. The industry adopted multicore architectures in which the cores run at a clock frequency that leads to manageable power consumption for the processor as a whole.

Today, all processors used in high-performance networking products are based on multicore architectures. These platforms provide the ideal environment for implementing the high-performance packet processing that is required for 4G equipment. For developers of networking equipment, selecting a multicore processor for their system is only one step in designing a high-performance system solution. Generally, the more complex question is how to architect the software which, as explained above, typically needs to process packets from multiple streams of network traffic at wire-speed.
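To give a feel for what wire-speed demands, the following Python sketch computes the packet rate of a saturated Ethernet link and the resulting per-packet cycle budget. It assumes standard Ethernet framing, where each frame carries 20 extra bytes on the wire (preamble, start-of-frame delimiter and inter-frame gap). The function names are illustrative, and the single-core budget is a simplification that ignores memory stalls and I/O costs.

```python
# Per-frame overhead on the wire: 7-byte preamble + 1-byte start-of-frame
# delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Packets per second needed to saturate a link with frames of one size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


def cycles_per_packet(core_hz: float, pps: float) -> float:
    """CPU cycles available per packet if one core had to keep up alone."""
    return core_hz / pps


if __name__ == "__main__":
    for gbps in (10, 40, 100):
        pps = line_rate_pps(gbps * 1e9, 64)  # 64 bytes is the minimum frame size
        budget = cycles_per_packet(2.4e9, pps)
        print(f"{gbps} Gb/s, 64-byte frames: {pps / 1e6:.2f} Mpps, "
              f"{budget:.0f} cycles per packet on one 2.4 GHz core")
```

At 10 Gb/s with minimum-size frames this gives roughly 14.88 Mpps, which leaves a single 2.4 GHz core only about 160 cycles per packet. That budget is why per-packet operating system overheads matter so much.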

A standard networking stack uses services provided by the operating system (OS) and is subject to significant overheads associated with functions such as preemptions, threads, timers and locking. These processing overheads are imposed on each packet passing through the system, resulting in a major performance penalty for overall throughput. Furthermore, although some improvements can be made to an OS stack to support multicore architectures, performance fails to scale linearly over multiple cores for complex packet processing such as that required by 4G. A processor with, for example, eight cores may not process packets significantly faster than one with two cores for GTP-U-to-GRE encapsulations. All in all, a standard OS stack does a poor job of exploiting the potential packet processing performance of a multicore processor.

A superior solution is provided by specialized packet processing software optimized for multicore architectures. In a well-designed implementation, the networking stack is split into two layers. The lower layer, typically called the fast path, processes the majority of incoming packets outside the OS environment and without incurring any of the OS overheads that degrade overall performance. Only those rare packets that require complex processing are forwarded to the OS networking stack, which performs the necessary management, signaling and control functions.

A multicore processor is well-suited to implementing this kind of software architecture. Most of the cores can be dedicated to running the fast path, in order to maximize the overall throughput of the system, while only one core is required to run the OS, the OS networking stack and the application's control plane.
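The split between a fast path and the OS stack can be sketched with a simple flow table. This is a generic illustration of the concept, not a description of how 6WINDGate is implemented. `FlowKey`, `Dispatcher` and the routing rule inside `slow_path` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlowKey:
    """5-tuple identifying a flow (hypothetical key layout)."""
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


@dataclass
class Dispatcher:
    flow_table: dict = field(default_factory=dict)  # FlowKey -> output port
    fast_hits: int = 0
    slow_hits: int = 0

    def slow_path(self, key: FlowKey) -> int:
        """Stand-in for the OS stack: decide once, then program the fast path."""
        self.slow_hits += 1
        out_port = 1 if key.dst.startswith("10.") else 0  # made-up routing rule
        self.flow_table[key] = out_port
        return out_port

    def handle(self, key: FlowKey) -> int:
        """Fast path: one table lookup; unknown flows are handed to the slow path."""
        out_port = self.flow_table.get(key)
        if out_port is None:
            return self.slow_path(key)
        self.fast_hits += 1
        return out_port


if __name__ == "__main__":
    dispatcher = Dispatcher()
    key = FlowKey("192.0.2.1", "10.0.0.2", 17, 2152, 2152)
    for _ in range(5):
        dispatcher.handle(key)
    print(dispatcher.slow_hits, dispatcher.fast_hits)
```

Only the first packet of the flow pays for the expensive decision; every later packet is resolved by a single lookup.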
In practice, the designer will analyze the specific performance requirements for the various software elements in the system (applications, control plane, networking stack and fast path), deciding on the most appropriate allocation of cores to balance the overall system workload. Until recently, the only restriction when configuring the platform was that, since the cores running the fast path were running outside the OS, they had to be dedicated exclusively to the fast path and not shared with other software. With the recent evolution towards a hybrid fast path model, the system can now be reconfigured dynamically as traffic patterns change in order to share the CPU resources allocated to the control plane and the fast path. Splitting the networking stack in this way has no impact on the functionality of application software, which interfaces to the same OS networking stack as previously. Existing applications do not need to be rewritten or recertified, but they run significantly faster because the underlying packet processing is accelerated through the fast path environment.

Introducing 6WINDGate Software for Multicore IA Platforms

The gold standard for packet processing software

In a typical 4G application such as a packet gateway (PGW) or serving gateway (SGW), when the standard OS networking stacks are replaced by optimized packet processing software based on the fast path concept, the networking performance of the processor subsystems will typically increase by seven to ten times. This massive increase in performance means the system will be able to manage 7x to 10x more users with the same hardware. This type of fast path-based implementation can allow the designer to meet system throughput goals that may have been unachievable on a single multicore processor when using a standard OS stack. These compelling breakthroughs in system performance also translate directly into improvements in energy efficiency and cost.

The 6WINDGate packet processing software product implements the type of fast path architecture described above and has been deployed by wireless infrastructure equipment providers worldwide. With a comprehensive set of protocols available for the control plane, the networking stack and the fast path, 6WINDGate provides developers with a single-vendor solution for all the protocols required for a high-performance wireless infrastructure platform based on multicore technology. By removing the need for developers to integrate networking software components from multiple suppliers, 6WINDGate has been proven to accelerate the time-to-market for networking equipment by up to twelve months.

Architecturally, 6WINDGate is a drop-in replacement for standard Linux networking stacks and is fully compatible with standard Linux application APIs. Any existing Linux applications, such as LTE or WiMAX applications, will run unchanged when migrated to a multicore system using 6WINDGate. This allows OEMs to preserve their investment in proprietary or third-party applications while fully benefitting from the performance, system cost and time-to-market advantages provided by 6WINDGate.

When installed on a multicore Intel Architecture platform, 6WINDGate can be configured at run-time to make the optimum use of the number of cores available.

In the example shown below, the six-core Intel Xeon processor E5645 is used as follows:

- One core is configured to run Linux and the LTE or WiMAX application stack, as well as the 6WINDGate control plane and the 6WINDGate networking stack;
- The remaining five cores are configured to run the 6WINDGate fast path, which makes full use of processor-specific services provided by the Intel Data Plane Development Kit (Intel DPDK) software.

Scalable, extensible multicore software architecture

It's important to note that the 6WINDGate software is fully extensible, with support for multi-processor architectures. Multiple processors can be configured to provide the required level of performance (both on fast path and Linux protocols).

The following section explains the system-level performance achieved by this software configuration.
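As a rough illustration of the six-core split described above, the helper below partitions core IDs between the control plane and the fast path. `partition_cores` is a hypothetical name and does not correspond to any 6WINDGate or Intel DPDK configuration interface. It only shows the bookkeeping.

```python
def partition_cores(total_cores: int, control_cores: int = 1) -> dict:
    """Split core IDs into a control-plane group and a fast-path group."""
    if not 0 < control_cores < total_cores:
        raise ValueError("need at least one control core and one fast path core")
    core_ids = list(range(total_cores))
    return {
        "control": core_ids[:control_cores],    # Linux, application, control plane
        "fast_path": core_ids[control_cores:],  # dedicated packet processing
    }


if __name__ == "__main__":
    # Six-core processor with one core reserved for Linux and the control plane.
    print(partition_cores(6))
```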

6WINDGate Performance in Wireless Infrastructure Equipment

Industry-leading packet processing performance on multicore IA processors

As a starting-point for understanding the system-level performance that the 6WINDGate software provides for wireless infrastructure equipment, it's instructive to examine IP forwarding performance. This can be used to evaluate the raw capabilities of the platform.

6WIND has recently demonstrated 10Gb/s Ethernet IP forwarding performance using the 6WINDGate packet processing software running on a 2.4 GHz Intel Xeon processor E5645. On this platform, 6WINDGate delivers approximately 13.9 Mpps of IP forwarding performance on a single core, while the architecture of the software ensures that the performance scales linearly according to the number of cores configured to run the fast path (subject, of course, to any finite throughput limits imposed by hardware constraints).

Clearly, though, 4G wireless infrastructure equipment requires the implementation of functions far beyond basic IP forwarding. The 6WINDGate packet processing solution is ideal for this because it includes a wide selection of protocols optimized for multicore platforms. (Of course, processing a large number of protocols requires more processor cycles, and the overall performance measured in packets per second decreases with the complexity of the protocols used.)

A typical system includes a fast path implementation of VLAN, IP forwarding, GTP-U tunneling, flow accounting and QoS conditioner functions, all using 6WINDGate. For this workload, an Intel Xeon processor E5645 core running at 2.4 GHz is able to process around 2.5 Mpps. Assuming the average size of an IP packet is 512 bytes, each core is therefore able to process traffic at a rate of 10.6 Gbps. Since the average bandwidth per LTE user is typically around 1.2 Mbps, this implies that a single Intel Xeon processor E5645 core running the 6WINDGate packet processing software can handle the traffic of approximately 8,860 active users.
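The per-core arithmetic can be checked with a few lines of Python. The function names are illustrative. Multiplying the rounded inputs directly (2.5 Mpps, 512 bytes) gives 10.24 Gbps and about 8,500 users, a few percent below the quoted 10.6 Gbps and 8,860 users. The quoted values presumably come from unrounded measurements or count some per-packet framing bytes, but that is an assumption; the paper does not say.

```python
def throughput_bps(pps: float, packet_bytes: int) -> float:
    """Bit rate carried by a stream of equal-size packets."""
    return pps * packet_bytes * 8


def users_supported(link_bps: float, per_user_bps: float) -> int:
    """Whole number of users that fit into a given bit rate."""
    return int(link_bps // per_user_bps)


if __name__ == "__main__":
    per_user_bps = 1.2e6  # average LTE user bandwidth quoted in the text

    from_rounded_inputs = throughput_bps(2.5e6, 512)
    print(f"{from_rounded_inputs / 1e9:.2f} Gbps,",
          users_supported(from_rounded_inputs, per_user_bps), "users")

    quoted_bps = 10.6e9  # per-core figure quoted in the text
    print(f"{quoted_bps / 1e9:.2f} Gbps,",
          users_supported(quoted_bps, per_user_bps), "users")
```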

620,000 active LTE users

Based on this analysis, a telecom blade based on a single six-core Intel Xeon processor E5645 with five cores configured to run the 6WINDGate fast path will be able to manage around 44,000 active LTE users. With the new generation of eight-core processors, the performance per blade will reach 62,000 active LTE users (assuming seven cores running the fast path).

Typical telecom infrastructure systems, such as packet gateways and serving gateways, comprise a chassis that includes multiple identical processor blades, supported by common resources for system management, power and I/O. The most relevant LTE performance number from the point of view of a service provider is the total number of users supported by a complete chassis.

Using the analysis of processor-level and blade-level performance outlined above, it is straightforward to extrapolate the performance (or user capacity) of a fully equipped chassis. Assuming blades based on Intel Xeon processors running 6WINDGate, with the maximum available number of cores configured to run the fast path, simple calculations show that a chassis equipped with 10 blades will handle 440,000 active LTE users (six-core processors) or 620,000 active LTE users (eight-core processors).
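The extrapolation from core to blade to chassis is plain multiplication, sketched below with hypothetical helper names. It takes the per-core figure of approximately 8,860 users from the text and assumes one processor per blade and perfectly linear scaling across cores and blades. The blade results (44,300 and 62,020) round to the 44,000 and 62,000 quoted above, and the chassis figures use those rounded values.

```python
USERS_PER_CORE = 8860  # per-core figure quoted in the text


def blade_users(fast_path_cores: int, users_per_core: int = USERS_PER_CORE) -> int:
    """Users per blade, assuming one processor per blade and linear core scaling."""
    return fast_path_cores * users_per_core


def chassis_users(blades: int, users_per_blade: int) -> int:
    """Users per chassis, assuming identical blades and no shared bottleneck."""
    return blades * users_per_blade


if __name__ == "__main__":
    print(blade_users(5))             # six-core processor, five fast path cores
    print(blade_users(7))             # eight-core processor, seven fast path cores
    print(chassis_users(10, 44_000))  # ten six-core blades, rounded blade figure
    print(chassis_users(10, 62_000))  # ten eight-core blades, rounded blade figure
```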

6WINDGate Benefits Summarized

Deployed by tier-1 OEMs worldwide in 4G networking equipment

This white paper has illustrated some of the key benefits that the 6WINDGate packet processing software provides for developers of high-performance 4G wireless equipment. These benefits include:

- Optimized support for industry-leading multicore processor platforms such as the Intel Xeon processor E5645 running the Intel Data Plane Development Kit (Intel DPDK) software;
- Portable software architecture, eliminating any dependency on a single processor or CPU vendor;
- Best-in-class packet processing performance, delivering seven to ten times the performance of standard OS networking stacks, enabling the development of 4G networking equipment that meets challenging performance requirements;
- Comprehensive set of 40+ optimized networking protocols, ideally suited to 4G equipment, eliminates the need to integrate networking software components from multiple suppliers;
- Full compatibility with standard OS APIs simplifies software development, integration and migration;
- Built-in support for High-Availability frameworks enables Carrier Grade system reliability;
- Full compatibility with all commercial Linux distributions for maximum flexibility in software platform design;
- Award-winning technology with best-in-class technical support, already deployed in 4G networking equipment by tier-1 OEMs worldwide.

Conclusions

4G equipment needs to achieve a breakthrough level of packet processing performance in order to provide advanced services for high numbers of users. While multicore processor platforms are capable of delivering impressive raw performance, standard OS networking stacks cannot reach the necessary throughput. The 6WINDGate software achieves a 7x to 10x improvement in packet processing performance and enables OEMs to meet 4G performance requirements. Because 6WINDGate is compatible with standard APIs and includes a comprehensive suite of optimized networking protocols, developers can accelerate their time-to-market by up to twelve months while reusing their existing software.

For more information, please visit www.6wind.com.