DevoFlow: Scaling Flow Management for High-Performance Networks

Transcription:

DevoFlow: Scaling Flow Management for High-Performance Networks. Andy Curtis, Jeff Mogul, Jean Tourrilhes, Praveen Yalagandula, Puneet Sharma, Sujata Banerjee

Software-defined networking

Software-defined networking Enables programmable networks

Software-defined networking Enables programmable networks Implemented by OpenFlow

Software-defined networking Enables programmable networks Implemented by OpenFlow OpenFlow is a great concept, but... - its original design imposes excessive overheads

Traditional switch Control-plane Data-plane

Traditional switch Control-plane Inbound packets Data-plane Routed packets

Traditional switch: reachability protocols run in the control-plane; the data-plane routes inbound packets

OpenFlow switch: the control-plane moves to a centralized controller; inbound packets are routed by the switch's data-plane

OpenFlow switch: the centralized controller receives flow setups, link state, and forwarding-rule stats from the switch, and sends down forwarding-table entries and statistics requests; the data-plane routes inbound packets
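For concreteness, a minimal sketch of the reactive control loop this diagram implies, assuming a toy in-memory queue; packet_in_queue, compute_route, and install_flow are illustrative names, not OpenFlow's actual API:

```python
# Hypothetical reactive controller loop: every flow-table miss at a
# switch becomes a flow-setup request handled centrally.
def controller_loop(packet_in_queue, compute_route, switches):
    while True:
        switch_id, pkt = packet_in_queue.get()   # flow-setup request
        path = compute_route(pkt)                # uses global link state
        for sw, out_port in path:
            # Push a forwarding-table entry to each switch on the path;
            # the controller may later pull per-rule statistics back.
            switches[sw].install_flow(match=pkt.header,
                                      actions=[("output", out_port)])
```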

OpenFlow enables innovative management solutions

OpenFlow enables innovative management solutions - Consistent routing and security policy enforcement [Ethane, SIGCOMM 2007] - Data center network architectures like VL2 and PortLand [Tavakoli et al. HotNets 2009] - Client load-balancing with commodity switches [Aster*x, ACLD demo 2010; Wang et al., HotICE 2011] - Flow scheduling [Hedera, NSDI 2010] - Energy-proportional networking [ElasticTree, NSDI 2010] - Automated data center QoS [Kim et al., INM/WREN 2010]

But OpenFlow is not perfect... Scaling these solutions to data center-sized networks is challenging

Contributions Characterize overheads of implementing OpenFlow in hardware Propose DevoFlow to enable cost-effective, scalable flow management Evaluate DevoFlow by applying it to data center flow scheduling

Contributions Characterize overheads of implementing OpenFlow in hardware Experience drawn from implementing OpenFlow on HP ProCurve switches

OpenFlow couples flow setup with visibility... [Diagram: central controller above core, aggregation, and edge switch layers]

Flow arrival [Diagram: a new flow arrives at an edge switch]

If no forwarding-table rule (exact-match or wildcard) matches at the switch, the flow is sent to the central controller

Two problems arise...

Problem 1: bottleneck at the controller

Problem 1: bottleneck at the controller - up to 10 million new flows per second in a data center with 100 edge switches [Benson et al. IMC 2010]

Problem 1: bottleneck at the controller - up to 10 million new flows per second in a data center with 100 edge switches [Benson et al. IMC 2010]; if a controller can handle 30K flow setups/sec., then we need at least 333 controllers!
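For concreteness, the arithmetic behind that estimate:

```latex
\frac{10^{7}\ \text{flow arrivals/s}}{3 \times 10^{4}\ \text{flow setups/s per controller}} \approx 333\ \text{controllers}
```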

Related work on scaling the controller: Onix [Koponen et al. OSDI 2010]; Maestro [Cai et al. Tech Report 2010]; HyperFlow [Tootoonchian and Ganjali, WREN 2010]; devolved controllers [Tam et al. WCC 2011]

Problem 2: stress on the switch control-plane

Control-plane Inbound packets Routed packets Data-plane

Switch control-plane [Diagram: the control-plane runs on the switch CPU; the ASIC data-plane routes inbound packets]

Scaling problem: switches Inherent overheads Implementation-imposed overheads

Scaling problem: switches Inherent overheads - Bandwidth: OpenFlow creates too much control traffic, ~1 control packet for every 2-3 data packets - Latency Implementation-imposed overheads

Scaling problem: switches Inherent overheads Implementation-imposed overheads - Flow setup - Statistics gathering - State size (see paper)

Flow setup OpenFlow controller Client A Client B ProCurve 5406 zl switch

Flow setup OpenFlow controller Client A Client B ProCurve 5406 zl switch We believe our measurement numbers are representative of the current generation of OpenFlow switches

Flow setup [Bar chart: the 5406 zl sustains 275 flow setups per second]

Flow setup [Bar chart: 275 measured flow setups/sec. on the 5406 zl vs. up to 10,000 expected flow arrivals/sec. [Benson et al. IMC 2010], a 40x difference]

Flow setup: too much latency as well; adds 2 ms to flow setup

Switch control-plane bandwidth [Diagram: the ASIC data-plane forwards at 300 Gbps, but the ASIC-to-CPU datapath provides only 80 Mbps, and the usable rate is just 17 Mbps]

Stats-gathering Flow setups and stat-pulling compete for this bandwidth

Stats-gathering Flow setups and stat-pulling compete for this bandwidth [Bar chart: flow setups per second fall as the stat-pull frequency increases from never to 1 s to 500 ms]

Stats-gathering Flow setups and stat-pulling compete for this bandwidth - 2.5 sec. to collect stats from the average data center edge switch

Can we solve the problem with more hardware? A faster CPU may help, but won't be enough - the control-plane datapath needs at least two orders of magnitude more bandwidth. Ethernet speeds are accelerating faster than CPU speeds. OpenFlow won't drive chip-area budgets for several generations.

Contributions Characterize overheads of implementing OpenFlow in hardware Propose DevoFlow to enable costeffective, scalable flow management Evaluate DevoFlow by applying it to data center flow scheduling

Devolved OpenFlow We devolve control over most flows back to the switches

DevoFlow design Keep flows in the data-plane Maintain just enough visibility for effective flow management Simplify the design and implementation of high-performance switches

DevoFlow mechanisms Control mechanisms Statistics-gathering mechanisms

DevoFlow mechanisms Control mechanisms - Rule cloning ASIC clones a wildcard rule as an exact match rule for new microflows

DevoFlow mechanisms Control mechanisms - Rule cloning

wildcard rules:
src          dst          src port  dst port
*            129.100.1.5  *         *

exact-match rules (cloned for a new microflow):
src          dst          src port  dst port
129.200.1.1  129.100.1.5  4832      80
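A minimal sketch of what rule cloning does to the table, assuming a simplified flow-table model; FlowTable, Rule, and the header-tuple layout are illustrative, not the paper's or any ASIC's actual interface:

```python
WILDCARD = "*"

class Rule:
    def __init__(self, match, actions, clone=False):
        self.match = match      # (src, dst, src_port, dst_port)
        self.actions = actions
        self.clone = clone      # clone flag: devolve this rule to the switch

class FlowTable:
    def __init__(self):
        self.exact = {}         # exact-match entries: header -> Rule
        self.wildcard = []      # ordered wildcard rules

    def lookup(self, hdr):
        if hdr in self.exact:
            return self.exact[hdr]
        for rule in self.wildcard:
            if all(f == WILDCARD or f == h for f, h in zip(rule.match, hdr)):
                if rule.clone:
                    # First packet of a new microflow: the ASIC clones the
                    # wildcard rule as an exact-match rule, so the rest of
                    # the flow never leaves the data-plane.
                    self.exact[hdr] = Rule(hdr, rule.actions)
                return rule
        return None             # table miss: punt to the controller

table = FlowTable()
table.wildcard.append(Rule((WILDCARD, "129.100.1.5", WILDCARD, WILDCARD),
                           ["forward"], clone=True))
table.lookup(("129.200.1.1", "129.100.1.5", 4832, 80))
assert ("129.200.1.1", "129.100.1.5", 4832, 80) in table.exact
```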

DevoFlow mechanisms Control mechanisms - Rule cloning ASIC clones a wildcard rule as an exact match rule for new microflows - Local actions Rapid re-routing Gives fallback paths for when a port fails Multipath support

Control mechanisms - Local actions: Rapid re-routing gives fallback paths for when a port fails; Multipath support [Diagram: a wildcard rule splits new flows across paths with weights 1/3, 1/6, 1/2]
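A minimal sketch of the multipath action under those weights; choosing the output port once, when the microflow's exact-match rule is cloned, keeps all of a flow's packets on one path and avoids reordering. The port names and weights below are just the slide's example:

```python
import random

def select_output_port(ports, weights):
    # Weighted random choice among the wildcard rule's output ports,
    # made once per microflow at rule-cloning time.
    return random.choices(ports, weights=weights, k=1)[0]

port = select_output_port(["port1", "port2", "port3"], [1/3, 1/6, 1/2])
```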

Statistics-gathering mechanisms

Statistics-gathering mechanisms Sampling - Packet header is sent to controller with 1/1000 probability

Statistics-gathering mechanisms Sampling - Packet header is sent to controller with 1/1000 probability Triggers and reports - Can set a threshold per rule; when the threshold is reached, the flow is set up at the central controller

Statistics-gathering mechanisms Sampling - Packet header is sent to controller with 1/1000 probability Triggers and reports - Can set a threshold per rule; when the threshold is reached, the flow is set up at the central controller Approximate counters - Track all flows matching a wildcard rule
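A minimal sketch of the sampling and trigger mechanisms on the packet path, assuming a simplified per-rule counter; report_to_controller and the 1 MB default threshold are illustrative, not the paper's API:

```python
import random

SAMPLE_PROB = 1.0 / 1000   # sampling: 1 in 1000 packet headers

def on_packet(rule, pkt, report_to_controller, threshold=1 << 20):
    rule.byte_count = getattr(rule, "byte_count", 0) + len(pkt)

    # Sampling: occasionally send a packet header to the controller.
    if random.random() < SAMPLE_PROB:
        report_to_controller("sample", pkt[:64])

    # Trigger and report: once the rule crosses its byte threshold,
    # report so the controller can set the flow up centrally.
    if rule.byte_count >= threshold and not getattr(rule, "reported", False):
        rule.reported = True
        report_to_controller("threshold", rule)
```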

Implementing DevoFlow Have not implemented in hardware Can reuse existing functional blocks for most mechanisms

Using DevoFlow Provides tools to scale your SDN application, but scaling is still a challenge Example: flow scheduling - Follows Hedera's approach [Al-Fares et al. NSDI 2010]

Using DevoFlow: flow scheduling

Using DevoFlow: flow scheduling Switches use multipath forwarding rules for new flows

Using DevoFlow: flow scheduling Switches use multipath forwarding rules for new flows Central controller uses sampling or triggers to detect elephant flows - Elephant flows are dynamically scheduled by the central controller - Uses a bin packing algorithm, see paper
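A minimal sketch of the scheduler's placement step once elephants are detected. The paper uses a bin-packing algorithm; this greedy least-loaded-path placement is a simplified stand-in, and all data structures are illustrative:

```python
def schedule_elephants(elephants, candidate_paths, load):
    """elephants: list of (flow_id, rate_gbps), e.g. from triggers;
    candidate_paths: flow_id -> list of path ids; load: path id -> Gbps."""
    placement = {}
    # Place the biggest flows first, each on its least-loaded path.
    for flow_id, rate in sorted(elephants, key=lambda e: -e[1]):
        best = min(candidate_paths[flow_id], key=lambda p: load[p])
        load[best] += rate
        placement[flow_id] = best  # the central controller installs this route
    return placement
```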

Evaluation How much can we reduce flow scheduling overheads while still achieving high performance?

Evaluation: methodology Custom-built simulator - Flow-level model of network traffic - Models OpenFlow based on our measurements of the 5406 zl (17 Mbps usable control-plane bandwidth)

Evaluation: methodology Clos topology - 1600 servers - 640 Gbps bisection bandwidth - 20 servers per rack HyperX topology - 1620 servers - 405 Gbps bisection bandwidth - 20 servers per rack

Evaluation: methodology Workloads - Shuffle, 128 MB to all servers, five at a time - Reverse-engineered MSR workload [Kandula et al. IMC 2009]

Evaluation: methodology Workloads - Shuffle, 128 MB to all servers, five at a time - Reverse-engineered MSR workload [Kandula et al. IMC 2009] Based on two distributions: inter-arrival times and bytes per flow

Evaluation: methodology Schedulers - ECMP - OpenFlow Coarse-grained using wildcard rules Fine-grained using stat-pulling (i.e., Hedera [Al-Fares et al. NSDI 2010]) - DevoFlow Statistics via sampling Triggers and reports at a specified threshold of bytes transferred

Evaluation: metrics Performance - Aggregate throughput Overheads - Packets/sec. to central controller - Forwarding table size

Performance: Clos topology, shuffle with 400 servers [Bar chart: aggregate throughput (Gbps) for ECMP, Wildcard, and Stat-pulling (OpenFlow-based) vs. Sampling and Thresholds (DevoFlow-based); annotation: 37% increase]

Performance: Clos topology, MSR workload [Bar chart: aggregate throughput (Gbps) for ECMP, Wildcard, and Stat-pulling (OpenFlow-based) vs. Sampling and Thresholds (DevoFlow-based)]

Performance: HyperX topology, shuffle with 400 servers [Bar chart: aggregate throughput (Gbps) for ECMP, VLB, and Stat-pulling (OpenFlow-based) vs. Sampling and Thresholds (DevoFlow-based); annotation: 55% increase]

Performance: HyperX topology, shuffle + MSR workload [Bar chart: aggregate throughput (Gbps) for ECMP, VLB, and Stat-pulling (OpenFlow-based) vs. Sampling and Thresholds (DevoFlow-based); annotation: 18% increase]

Overheads: control traffic [Bar chart: packets/sec. to the controller - Wildcard: 7758, Stats-pulling: 705 (OpenFlow-based); Sampling: 483, Threshold: 74 (DevoFlow-based); chart annotations: 10%, 1%] Multipath wildcard rules and targeted stats collection reduce control traffic

Overheads: flow table entries [Bar chart: forwarding table entries - Wildcard: 926, Stats-pulling: 71 (OpenFlow-based); Sampling: 6, Threshold: 13 (DevoFlow-based)] 150x reduction in table entries at the average edge switch

Evaluation: overheads - control-plane bandwidth needed [Line chart: control-plane throughput needed (Mbps; 95th and 99th percentiles, also as a percent of forwarding bandwidth) vs. stat-pulling rate from 0.1 s to never; the 5406 zl's available bandwidth is marked] This is to meet a 2 ms switch-internal queuing deadline

Conclusions OpenFlow imposes high overheads on switches Proposed DevoFlow to give tools to reduce reliance on the control-plane DevoFlow can reduce overheads by 10-50x for data center flow scheduling

Other uses of DevoFlow - Client load-balancing (similar to Wang et al., HotICE 2011) - Network virtualization [Sherwood et al. OSDI 2010] - Data center QoS - Multicast - Routing as a service [Chen et al. INFOCOM 2011] - Energy-proportional routing [Heller et al. NSDI 2010]

Implementing DevoFlow Rule cloning: - May be difficult to implement in the ASIC; can definitely be done with the switch CPU Multipath support: - Similar to LAG and ECMP Sampling: - Already implemented in most switches Triggers: - Similar to rate limiters

Flow table size Constrained resource - Commodity switches: 32K-64K exact-match entries, ~1500 TCAM entries Virtualization may strain table size - 10s of VMs per machine implies >100K table entries
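One illustrative combination that reaches that scale, with all numbers assumed for the example rather than taken from the paper:

```latex
20\ \text{servers/edge switch} \times 25\ \text{VMs/server} \times 200\ \text{concurrent flows/VM} = 100{,}000\ \text{entries}
```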

[Line chart: time to pull statistics (s; average and maximum reply time) vs. flow table size, 0 to 35,000 flows] 2.5 sec. to pull stats at the average edge switch

[Chart: TCP connection rate (sockets/s) and stats collected (entries/s) vs. stats request rate (req/s)]

Evaluation: methodology Workloads - Reverse-engineered MSR workload [Kandula et al. IMC 2009] [Plots: inter-arrival time and flow-size distributions]

Evaluation: performance - Clos topology [Bar chart: aggregate throughput (Gbps) for ECMP and Wildcard 1s, Pull-based 5s (OpenFlow-based) vs. Sampling 1/1000, Threshold 1MB (DevoFlow-based), across workloads: shuffle n=400; MSR, 25% inter-rack; MSR, 75% inter-rack; MSR + shuffle, n=400; MSR + shuffle, n=800]

Evaluation: overheads - control traffic [Bar chart: packets/sec. to the controller for Wildcard (0.1s, 1s, 10s) and Pull-based (0.1s, 1s, 10s) OpenFlow schedulers vs. Sampling (1/100, 1/1000, 1/10000) and Threshold (128KB, 1MB, 10MB) DevoFlow schedulers, on MSR workloads with 25% and 75% inter-rack traffic; pull-based peaks at 29,451 packets/sec. while thresholds drop as low as 74]

Evaluation: overheads - flow table entries [Bar chart: average and maximum flow-table entries (MSR workloads, 25% and 75% inter-rack) for Wildcard and Pull-based OpenFlow schedulers vs. Sampling and Threshold DevoFlow schedulers; annotations: 50x and 10x reductions] DevoFlow aggressively uses multipath wildcard rules