Micro load balancing in data centers with DRILL


Soudeh Ghorbani (UIUC), Brighten Godfrey (UIUC), Yashar Ganjali (University of Toronto), Amin Firoozshahian (Intel)

Where should the load balancing functionality live?

Why load balancing in data centers?

Data center apps have demanding network requirements.

Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network, Google, SIGCOMM 2015.

Data center topologies provide high capacity.
[Figure: multi-rooted tree topology with core, aggregation, and edge layers across Pods 0-3, labeled with example addresses]
VL2: A Scalable and Flexible Data Center Network, C. Kim et al., SIGCOMM 2009
A Scalable, Commodity Data Center Network Architecture, M. Al-Fares et al., SIGCOMM 2008
Jellyfish: Networking Data Centers Randomly, A. Singla et al., NSDI 2012

But we are still not using the capacity efficiently! Networks experienced high congestion drops as utilization approached 25% [1]. Further improving fabric congestion response remains an ongoing effort [1].
[1] Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network, Google, SIGCOMM 2015.

The gap: High bandwidth provided via massive multipathing. Balancing load among many paths in real time seems too hard for our fast and dumb data center fabric. Congestion happens even when there is spare capacity to mitigate it elsewhere [2]. [2] Network Traffic Characteristics of Data Centers in the Wild, T. Benson et al., IMC 2010.

ECMP is not the answer.
Select among equal-cost paths by a hash of the 5-tuple.
Problems: coarse grained; stateless and local.
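A minimal Python sketch of the ECMP selection described above: hash the flow's 5-tuple and index into the list of equal-cost ports. The hash function (CRC32), the field encoding, the example flows, and the port numbers are illustrative assumptions; real switches use vendor-specific hash functions.

```python
import zlib

def ecmp_port(five_tuple, ports):
    """Pick an output port by hashing the flow's 5-tuple (illustrative only)."""
    key = "|".join(str(field) for field in five_tuple).encode()
    return ports[zlib.crc32(key) % len(ports)]

# Hypothetical flows: (src IP, dst IP, protocol, src port, dst port).
flow_a = ("10.0.1.2", "10.2.0.3", 6, 40001, 80)
flow_b = ("10.0.1.3", "10.2.0.2", 6, 40002, 80)
ports = [4, 5, 6]  # equal-cost output ports, assumed for this example

# The choice depends only on the header, never on queue occupancy
# (stateless and local), and it is fixed for the lifetime of the flow
# (coarse grained), so two large flows may share a port while others idle.
choice_a = ecmp_port(flow_a, ports)
choice_b = ecmp_port(flow_b, ports)
```

Because the mapping ignores load, nothing in this scheme can react when `choice_a` and `choice_b` happen to coincide.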

Rethinking the problem: ... Hedera [NSDI'10] Mahout [INFOCOM'11] FastPass [SIGCOMM'14] Planck [SIGCOMM'14] Presto [SIGCOMM'15] MPTCP [NSDI'11] CONGA [SIGCOMM'14]

Rethinking the problem: Let's move the load balancing functionality out of the core!

Moving LB from the fabric to: the central controller, the hosts, or the edge.
... Hedera [NSDI'10] Mahout [INFOCOM'11] FastPass [SIGCOMM'14] Planck [SIGCOMM'14] Presto [SIGCOMM'15] MPTCP [NSDI'11] CONGA [SIGCOMM'14]

Scalable LB design space
[Figure: LB granularity (packet, flowcell/flowlet, flow, elephant flow) versus control loop latency (oblivious, O(µs), O(1 ms), O(10 ms), O(100 ms), O(1 s)); plotted: Hedera, Planck, ECMP, CONGA, Presto; micro bursts marked]

Micro load balancing
Microsecond timescales.
Microscopic, i.e., local to each switch port.
[Figure: LB granularity (packet, flowcell/flowlet, flow, elephant flow) versus control loop latency (O(µs), O(1 ms), O(10 ms), O(100 ms), O(1 s))]

Micro LB: a plausible architecture (symmetric topologies)
1. Discover topology.
2. Compute multiple shortest paths.
3. Install into FIB.
~ ECMP, so far...
FIB (dst: ports): A: 4,5,6; B: 4,5,6; C: 4,5,6
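The three steps above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the authors' implementation: the topology is given as an adjacency dictionary, neighbor names stand in for port numbers, and the tiny leaf/spine graph and helper names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, src):
    """Hop counts from src to every reachable node (unweighted graph)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def build_fib(adj, switch, destinations):
    """Map each destination to every neighbor lying on some shortest path."""
    fib = {}
    for dst in destinations:
        dist = bfs_distances(adj, dst)  # links are bidirectional
        if switch not in dist:
            fib[dst] = []               # destination unreachable
            continue
        fib[dst] = sorted(
            n for n in adj[switch] if dist.get(n) == dist[switch] - 1
        )
    return fib

# Hypothetical topology: two leaves (L1, L2) and three spines (S1-S3).
adj = {
    "L1": ["S1", "S2", "S3"],
    "L2": ["S1", "S2", "S3"],
    "S1": ["L1", "L2"],
    "S2": ["L1", "L2"],
    "S3": ["L1", "L2"],
}
fib = build_fib(adj, "L1", ["L2"])  # {"L2": ["S1", "S2", "S3"]}
```

Up to this point the result is the same set of next hops ECMP would install; the difference lies in how a packet picks among them.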

Inside a switch
FIB (dst: ports): A: 4,5,6,7; B: 4,5,6,7; C: 4,5,6

The power of 2 choices
Random: n bins and n balls, each ball choosing a bin independently and uniformly at random.
Max load: $\Theta\left(\frac{\log n}{\log \log n}\right)$
[Figure: bin loads under random placement compared with the optimal]
Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal. Balanced allocations. SIAM Journal on Computing, 1994.

The power of 2 choices
n bins and n balls; balls placed sequentially, each in the least loaded of d ≥ 2 random bins.
Max load: $\Theta\left(\frac{\log \log n}{\log d}\right)$
[Figure: bin loads for d=1, d=2, and the optimal]
Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal. Balanced allocations. SIAM Journal on Computing, 1994.
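A small Python simulation can make the balls-and-bins result concrete. It throws n balls into n bins, placing each ball in the least loaded of d uniformly sampled bins, and reports the maximum load. The function name, the value of n, and the seed are arbitrary choices for illustration.

```python
import random

def max_load(n, d, rng):
    """Place n balls into n bins, each in the least loaded of d random bins."""
    bins = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(d)]
        best = min(candidates, key=lambda i: bins[i])
        bins[best] += 1
    return max(bins)

rng = random.Random(0)
n = 10_000
one_choice = max_load(n, 1, rng)   # purely random placement (d=1)
two_choices = max_load(n, 2, rng)  # least loaded of two samples (d=2)
print("d=1 max load:", one_choice)
print("d=2 max load:", two_choices)
```

With d=1 the maximum load grows like log n / log log n, while sampling just two bins reduces it to the order of log log n, which is why a single extra sample helps so much.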

What we want: Queues instead of bins. Each ball chooses a bin independently, no coordination.


Simulation methodology
OMNeT++, INET framework
Linux 2.6 TCP
Leaf/spine topologies
Datacenter traces from DevoFlow [SIGCOMM'11]

The pitfalls of choice
[Figure: standard deviation of queue lengths for d=1, d=2, and d=40, compared with the balls-and-bins model]
Setting parameters: stability, d.
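One way to explore how the number of samples d affects queue balance is a toy Python model like the one below. It is a sketch under loudly stated assumptions, not the authors' OMNeT++ setup: several forwarding engines decide in the same time step from the same stale snapshot of queue lengths, arrivals are Bernoulli with probability `load`, and each queue drains one packet per step. All names and parameters are hypothetical.

```python
import random
import statistics

def queue_stdev(num_queues, num_engines, d, load, steps, seed=0):
    """Average per-step standard deviation of queue lengths (toy model)."""
    rng = random.Random(seed)
    queues = [0] * num_queues
    samples = []
    for _ in range(steps):
        snapshot = list(queues)  # engines do not see each other's choices
        for _ in range(num_engines):
            if rng.random() < load:  # this engine has a packet to forward
                candidates = rng.sample(range(num_queues), d)
                target = min(candidates, key=lambda q: snapshot[q])
                queues[target] += 1
        samples.append(statistics.pstdev(queues))
        queues = [max(0, q - 1) for q in queues]  # each queue serves one packet
    return statistics.fmean(samples)

# Compare a few values of d on an assumed 48-queue, 48-engine switch.
results = {d: queue_stdev(48, 48, d, load=0.8, steps=2000) for d in (1, 2, 40)}
for d, stdev in results.items():
    print(f"d={d}: mean queue-length stdev = {stdev:.2f}")
```

In this model, engines that all sample most of the queues tend to pick the same apparently short queue in the same step, which is one plausible reading of why very large d can hurt rather than help.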

DRILL: Decentralized Randomized In-network Local Load balancing
[Figure: LB granularity (packet, flowcell/flowlet, flow, elephant flow) versus control loop latency (O(µs), O(1 ms), O(10 ms), O(100 ms), O(1 s))]

Substantial improvement over prior work.

Substantial improvement over prior work. An incast application

Thou shalt not split flows!
Splitting flows -> reordering -> throughput degradation

Splitting flows -> reordering -> throughput degradation
DRILL splits flows along many paths yet causes little reordering!
Used paths: DRILL 97%, Presto 73%, Random 98%
[Figure: out-of-order packets for DRILL, Presto, and Random; bar labels 0.24%, 0.02%, 0.04%]
Buffering delay [microseconds]: DRILL 5, Presto 22, Random 17

Splitting flows -> latency variance -> reordering -> throughput degradation
DRILL splits flows along many paths but has low queueing delay variance and therefore causes little reordering!
Used paths: DRILL 97%, Presto 73%, Random 98%
Queue length [STDV]: DRILL 1.22, Presto 1671.21, Random 295.73
Buffering delay [microseconds]: DRILL 5, Presto 22, Random 17
Insight: Queueing delay variance is so small that it doesn't matter what path the packet takes.

Ongoing and future work
Efficient handling of asymmetry: failures; irregular topologies with non-equal cost paths.
Converged Ethernet.

Micro Load Balancing with DRILL: Conclusion
Microscopic, microsecond decisions yield lowest latency load balancing.
Splitting flows is splitting hairs.
Strong candidate for augmenting data center switching hardware.