Micro load balancing in data centers with DRILL

Size: px
Start display at page:

Download "Micro load balancing in data centers with DRILL"

Transcription

1 Micro load balancing in data centers with DRILL Soudeh Ghorbani (UIUC) Brighten Godfrey (UIUC) Yashar Ganjali (University of Toronto) Amin Firoozshahian (Intel)

2 Where should the load balancing functionality live?

3 Why load balancing in data centers?

4 Data center apps have demanding network requirements.

5 Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google s Datacenter Network, Google, SIGCOMM Data center topologies provide high capacity. Core Aggregation Edge Pod Pod 1 Pod 2 Pod 3 VL2: a scalable and flexible data center network, C. Kim et al., SIGCOMM 2009 A scalable, commodity data center network architecture, M. Al-Fares et al., SIGCOMM 2008 Jellyfish: Networking Data Centers Randomly., A. Singla et al., NSDI 2012

6 But we are still not using the capacity efficiently! Networks experience high congestion drops as utilization approached 25% [1]. Further improving fabric congestion response remains an ongoing effort [1]. [1] Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google s Datacenter Network, Google, SIGCOMM 2015.

7 The gap: High bandwidth provided via massive multipathing. Balancing load among many paths in real time seems too hard for our fast and dumb data center fabric. Congestion happens even when there is spare capacity to mitigate it elsewhere [2]. [2] Network Traffic Characteristics of Data Centers in the Wild, T. Benson et al., IMC 2010.

8 ECMP is not the answer. Select among equal-cost paths by a hash of 5-tuple Problems: Coarse grained Stateless and local A B D C

9 Rethinking the problem:... Hedera [NSDI'10] Mahout [INFOCOM'11] FastPass [SIGCOMM'14] Plank [SIGCOMM'14] Presto [SIGCOMM'15] MPTCP [NSDI 11] CONGA [SIGCOMM'14]

10 Rethinking the problem Let s move the load balancing functionality out of the core!

11 Central Controller Moving LB from fabric to:... Hedera [NSDI'10] Mahout [INFOCOM'11] FastPass [SIGCOMM'14] Planck [SIGCOMM'14] Presto [SIGCOMM'15] MPTCP [NSDI 11] CONGA [SIGCOMM'14] Controller

12 Hosts Moving LB from fabric to:... Hedera [NSDI'10] Mahout [INFOCOM'11] FastPass [SIGCOMM'14] Planck [SIGCOMM'14] Presto [SIGCOMM'15] MPTCP [NSDI 11] CONGA [SIGCOMM'14] Controller

13 Edge Moving LB from fabric to:... Hedera [NSDI'10] Mahout [INFOCOM'11] FastPass [SIGCOMM'14] Planck [SIGCOMM'14] Presto [SIGCOMM'15] MPTCP [NSDI 11] CONGA [SIGCOMM'14] Controller

14 Packet LB granularity Flowcell/ flowlet Flow Micro bursts Elephant flow Scalable LB design space Hedera Planck ECMP CONGA Presto O(µs) O(1ms) O(10ms) O(100ms) O(1s) Oblivious Control loop latency

15 Packet LB granularity Flowcell/ flowlet Flow Elephant flow Micro load balancing Microsecond timescales Microscopic, i.e., local to each switch port O(µs) O(1ms) O(10ms) O(100ms) O(1s) Control loop latency

16 Micro LB A plausible architecture Symmetric topologies dst port A 4,5,6 B 4,5,6 C 4,5,6 Fabric 1. Discover topology. 2. Compute multiple shortest paths. 3. Install into FIB ~ ECMP, so far...

17 Inside a switch dst port A 4,5,6,7 B 4,5,6,7 C 4,5,6

18 The power of 2 choices Random n bins and n balls Each ball choosing a bin independently and uniformly at random Max load: Optimal log n θ( log log n ) Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal. Balanced allocations. SIAM Journal on Computing, 1994.

19 The power of 2 choices n bins and n balls Balls placed sequentially, in the least loaded of d 2 random bins Max load: log log n θ( log d ) d=1 d=2 Optimal Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal. Balanced allocations. SIAM Journal on Computing, 1994.

20 What we want: Queues instead of bins. Each ball chooses a bin independently, no coordination.

21 What we want: Queues instead of bins. Each ball chooses a bin independently, no coordination. dropped

22 Simulation methodology OMNET++, INET framework Linux 2.6 TCP Leaf/spine topologies Datacenter traces from DevoFlow [SIGCOMM 11]

23 Queue lengths [STDV] The pitfalls of choice d=1 Balls and bins d=2 d=40 Setting parameters Stability d

24 Packet LB granularity Flowcell/ flowlet Flow Elephant flow DRILL Decentralized Randomized In-network Local Load balancing O(µs) O(1ms) O(10ms) O(100ms) O(1s) Control loop latency

25 Substantial improvement over prior work.

26 Substantial improvement over prior work. An incast application

27 Thou shalt not split flows! Splitting flows Reordering Throughput degradation

28 Splitting flows Reordering Throughput degradation DRILL splits flows along many paths USED PATHS 120% 100% 97% 98% 80% 73% 60% 40% 20% 0% DRILL Presto Random

29 Splitting flows Reordering Throughput degradation DRILL splits flows along many paths yet causes little reordering! USED PATHS OUT OF ORDER PACKETS 120% 100% 97% 98% 0.30% 0.25% 0.24% 80% 73% 0.20% 60% 0.15% 40% 0.10% 20% 0.05% 0.02% 0.04% 0% DRILL Presto Random 0.00% DRILL Presto Random

30 Splitting flows Reordering Throughput degradation DRILL splits flows along many paths yet causes little reordering! USED PATHS BUFFERING DELAY [MICRO SEC] 120% % 97% 98% % 73% 15 60% 10 40% % 0% DRILL Presto Random 0 DRILL Presto Random

31 Splitting flows Latency variance Reordering Throughput degradation

32 Splitting flows Latency variance Reordering Throughput degradation 00% 0% DRILL splits flows along many paths U S E D PAT H S 97% 73% 98% DRILL Presto Random but has low queueing delay variance QUEUE LENGTH [ STDV] and therefore causes little reordering! DRILL Presto Random Insight: Queueing delay variance is so small that it doesn t matter what path the packet takes B U F F E R I NG D E L AY [ M I C R O S E C ] DRILL Presto Random

33 Ongoing and future work Efficient handling of asymmetry Failures Irregular topologies with non-equal cost paths Converged Ethernet

34 Micro Load Balancing with DRILL: Conclusion Microscopic, microsecond decisions yield lowest latency load balancing Splitting flows is splitting hairs Strong candidate for augmenting data center switching hardware

CONGA: Distributed Congestion-Aware Load Balancing for Datacenters

CONGA: Distributed Congestion-Aware Load Balancing for Datacenters CONGA: Distributed Congestion-Aware Load Balancing for Datacenters By Alizadeh,M et al. Motivation Distributed datacenter applications require large bisection bandwidth Spine Presented by Andrew and Jack

More information

DevoFlow: Scaling Flow Management for High-Performance Networks

DevoFlow: Scaling Flow Management for High-Performance Networks DevoFlow: Scaling Flow Management for High-Performance Networks Andy Curtis Jeff Mogul Jean Tourrilhes Praveen Yalagandula Puneet Sharma Sujata Banerjee Software-defined networking Software-defined networking

More information

Per-Packet Load Balancing in Data Center Networks

Per-Packet Load Balancing in Data Center Networks Per-Packet Load Balancing in Data Center Networks Yagiz Kaymak and Roberto Rojas-Cessa Abstract In this paper, we evaluate the performance of perpacket load in data center networks (DCNs). Throughput and

More information

Let It Flow Resilient Asymmetric Load Balancing with Flowlet Switching

Let It Flow Resilient Asymmetric Load Balancing with Flowlet Switching Let It Flow Resilient Asymmetric Load Balancing with Flowlet Switching Erico Vanini*, Rong Pan*, Mohammad Alizadeh, Parvin Taheri*, Tom Edsall* * Load Balancing in Data Centers Multi-rooted tree 1000s

More information

6.888: Lecture 4 Data Center Load Balancing

6.888: Lecture 4 Data Center Load Balancing 6.888: Lecture 4 Data Center Load Balancing Mohammad Alizadeh Spring 2016 1 MoDvaDon DC networks need large bisection bandwidth for distributed apps (big data, HPC, web services, etc) Multi-rooted Single-rooted

More information

Building Efficient and Reliable Software-Defined Networks. Naga Katta

Building Efficient and Reliable Software-Defined Networks. Naga Katta FPO Talk Building Efficient and Reliable Software-Defined Networks Naga Katta Jennifer Rexford (Advisor) Readers: Mike Freedman, David Walker Examiners: Nick Feamster, Aarti Gupta 1 Traditional Networking

More information

DiffFlow: Differentiating Short and Long Flows for Load Balancing in Data Center Networks

DiffFlow: Differentiating Short and Long Flows for Load Balancing in Data Center Networks : Differentiating Short and Long Flows for Load Balancing in Data Center Networks Francisco Carpio, Anna Engelmann and Admela Jukan Technische Universität Braunschweig, Germany Email:{f.carpio, a.engelmann,

More information

Expeditus: Congestion-Aware Load Balancing in Clos Data Center Networks

Expeditus: Congestion-Aware Load Balancing in Clos Data Center Networks Expeditus: Congestion-Aware Load Balancing in Clos Data Center Networks Peng Wang, Hong Xu, Zhixiong Niu, Dongsu Han, Yongqiang Xiong ACM SoCC 2016, Oct 5-7, Santa Clara Motivation Datacenter networks

More information

Mahout: Low-Overhead Datacenter Traffic Management using End-Host-Based Elephant Detection. Vasileios Dimitrakis

Mahout: Low-Overhead Datacenter Traffic Management using End-Host-Based Elephant Detection. Vasileios Dimitrakis Mahout: Low-Overhead Datacenter Traffic Management using End-Host-Based Elephant Detection Vasileios Dimitrakis Vasileios Dimitrakis 2016-03-18 1 Introduction - Motivation (1) Vasileios Dimitrakis 2016-03-18

More information

ALB: Adaptive Load Balancing Based on Accurate Congestion Feedback for Asymmetric Topologies

ALB: Adaptive Load Balancing Based on Accurate Congestion Feedback for Asymmetric Topologies : Adaptive Load Balancing Based on Accurate Congestion Feedback for Asymmetric Topologies Qingyu Shi 1, Fang Wang 1,2, Dan Feng 1, and Weibin Xie 1 1 Wuhan National Laboratory for Optoelectronics, Key

More information

Advanced Computer Networks. Datacenter TCP

Advanced Computer Networks. Datacenter TCP Advanced Computer Networks 263 3501 00 Datacenter TCP Spring Semester 2017 1 Oriana Riva, Department of Computer Science ETH Zürich Today Problems with TCP in the Data Center TCP Incast TPC timeouts Improvements

More information

Utilizing Datacenter Networks: Centralized or Distributed Solutions?

Utilizing Datacenter Networks: Centralized or Distributed Solutions? Utilizing Datacenter Networks: Centralized or Distributed Solutions? Costin Raiciu Department of Computer Science University Politehnica of Bucharest We ve gotten used to great applications Enabling Such

More information

DIBS: Just-in-time congestion mitigation for Data Centers

DIBS: Just-in-time congestion mitigation for Data Centers DIBS: Just-in-time congestion mitigation for Data Centers Kyriakos Zarifis, Rui Miao, Matt Calder, Ethan Katz-Bassett, Minlan Yu, Jitendra Padhye University of Southern California Microsoft Research Summary

More information

A NEW APPROACH TO STOCHASTIC SCHEDULING IN DATA CENTER NETWORKS

A NEW APPROACH TO STOCHASTIC SCHEDULING IN DATA CENTER NETWORKS A NEW APPROACH TO STOCHASTIC SCHEDULING IN DATA CENTER NETWORKS ABSTRACT Tingqiu Tim Yuan, Tao Huang, Cong Xu and Jian Li Huawei Technologies, China The Quality of Service (QoS) of scheduling between latency-sensitive

More information

Concise Paper: Freeway: Adaptively Isolating the Elephant and Mice Flows on Different Transmission Paths

Concise Paper: Freeway: Adaptively Isolating the Elephant and Mice Flows on Different Transmission Paths 2014 IEEE 22nd International Conference on Network Protocols Concise Paper: Freeway: Adaptively Isolating the Elephant and Mice Flows on Different Transmission Paths Wei Wang,Yi Sun, Kai Zheng, Mohamed

More information

DATA center networks use multi-rooted Clos topologies

DATA center networks use multi-rooted Clos topologies : Sampling based Balancing in Data Center Networks Peng Wang, Member, IEEE, George Trimponias, Hong Xu, Member, IEEE, Yanhui Geng Abstract Data center networks demand high-performance, robust, and practical

More information

CLOVE: How I Learned to Stop Worrying About the Core and Love the Edge

CLOVE: How I Learned to Stop Worrying About the Core and Love the Edge CLOVE: How I Learned to Stop Worrying About the Core and Love the Edge Naga Katta *, Mukesh Hira, Aditi Ghag, Changhoon Kim, Isaac Keslassy, Jennifer Rexford * * Princeton University, VMware, Barefoot

More information

Data Center Networks. Brighten Godfrey CS 538 April Thanks to Ankit Singla for some slides in this lecture

Data Center Networks. Brighten Godfrey CS 538 April Thanks to Ankit Singla for some slides in this lecture Data Center Networks Brighten Godfrey CS 538 April 5 2017 Thanks to Ankit Singla for some slides in this lecture Introduction: The Driving Trends Cloud Computing Computing as a utility Purchase however

More information

CS 6453 Network Fabric Presented by Ayush Dubey

CS 6453 Network Fabric Presented by Ayush Dubey CS 6453 Network Fabric Presented by Ayush Dubey Based on: 1. Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google s Datacenter Network. Singh et al. SIGCOMM15. 2. Network Traffic

More information

Packet-Based Load Balancing in Data Center Networks

Packet-Based Load Balancing in Data Center Networks Packet-Based Load Balancing in Data Center Networks Yagiz Kaymak and Roberto Rojas-Cessa Networking Research Laboratory, Department of Electrical and Computer Engineering, New Jersey Institute of Technology,

More information

DFFR: A Distributed Load Balancer for Data Center Networks

DFFR: A Distributed Load Balancer for Data Center Networks DFFR: A Distributed Load Balancer for Data Center Networks Chung-Ming Cheung* Department of Computer Science University of Southern California Los Angeles, CA 90089 E-mail: chungmin@usc.edu Ka-Cheong Leung

More information

Jellyfish. networking data centers randomly. Brighten Godfrey UIUC Cisco Systems, September 12, [Photo: Kevin Raskoff]

Jellyfish. networking data centers randomly. Brighten Godfrey UIUC Cisco Systems, September 12, [Photo: Kevin Raskoff] Jellyfish networking data centers randomly Brighten Godfrey UIUC Cisco Systems, September 12, 2013 [Photo: Kevin Raskoff] Ask me about... Low latency networked systems Data plane verification (Veriflow)

More information

Routing Domains in Data Centre Networks. Morteza Kheirkhah. Informatics Department University of Sussex. Multi-Service Networks July 2011

Routing Domains in Data Centre Networks. Morteza Kheirkhah. Informatics Department University of Sussex. Multi-Service Networks July 2011 Routing Domains in Data Centre Networks Morteza Kheirkhah Informatics Department University of Sussex Multi-Service Networks July 2011 What is a Data Centre? Large-scale Data Centres (DC) consist of tens

More information

Advanced Computer Networks. Datacenter TCP

Advanced Computer Networks. Datacenter TCP Advanced Computer Networks 263 3501 00 Datacenter TCP Patrick Stuedi, Qin Yin, Timothy Roscoe Spring Semester 2015 1 Oriana Riva, Department of Computer Science ETH Zürich Last week Datacenter Fabric Portland

More information

Alizadeh, M. et al., " CONGA: distributed congestion-aware load balancing for datacenters," Proc. of ACM SIGCOMM '14, 44(4): , Oct

Alizadeh, M. et al.,  CONGA: distributed congestion-aware load balancing for datacenters, Proc. of ACM SIGCOMM '14, 44(4): , Oct CONGA Paper Review By Buting Ma and Taeju Park Paper Reference Alizadeh, M. et al., " CONGA: distributed congestion-aware load balancing for datacenters," Proc. of ACM SIGCOMM '14, 44(4):503-514, Oct.

More information

From Routing to Traffic Engineering

From Routing to Traffic Engineering 1 From Routing to Traffic Engineering Robert Soulé Advanced Networking Fall 2016 2 In the beginning B Goal: pair-wise connectivity (get packets from A to B) Approach: configure static rules in routers

More information

Waze: Congestion-Aware Load Balancing at the Virtual Edge for Asymmetric Topologies

Waze: Congestion-Aware Load Balancing at the Virtual Edge for Asymmetric Topologies Waze: Congestion-Aware Load Balancing at the Virtual Edge for Asymmetric Topologies Naga Katta Salesforce.com Aran Bergman Technion Aditi Ghag, Mukesh Hira VMware Changhoon Kim Barefoot Networks Isaac

More information

L19 Data Center Network Architectures

L19 Data Center Network Architectures L19 Data Center Network Architectures by T.S.R.K. Prasad EA C451 Internetworking Technologies 27/09/2012 References / Acknowledgements [Feamster-DC] Prof. Nick Feamster, Data Center Networking, CS6250:

More information

Responsive Multipath TCP in SDN-based Datacenters

Responsive Multipath TCP in SDN-based Datacenters Responsive Multipath TCP in SDN-based Datacenters Jingpu Duan, Zhi Wang, Chuan Wu Department of Computer Science, The University of Hong Kong, Email: {jpduan,cwu}@cs.hku.hk Graduate School at Shenzhen,

More information

Optical Packet Switching

Optical Packet Switching Optical Packet Switching DEISNet Gruppo Reti di Telecomunicazioni http://deisnet.deis.unibo.it WDM Optical Network Legacy Networks Edge Systems WDM Links λ 1 λ 2 λ 3 λ 4 Core Nodes 2 1 Wavelength Routing

More information

Cloud networking (VITMMA02) DC network topology, Ethernet extensions

Cloud networking (VITMMA02) DC network topology, Ethernet extensions Cloud networking (VITMMA02) DC network topology, Ethernet extensions Markosz Maliosz PhD Department of Telecommunications and Media Informatics Faculty of Electrical Engineering and Informatics Budapest

More information

Probe or Wait : Handling tail losses using Multipath TCP

Probe or Wait : Handling tail losses using Multipath TCP Probe or Wait : Handling tail losses using Multipath TCP Kiran Yedugundla, Per Hurtig, Anna Brunstrom 12/06/2017 Probe or Wait : Handling tail losses using Multipath TCP Outline Introduction Handling tail

More information

DeTail Reducing the Tail of Flow Completion Times in Datacenter Networks. David Zats, Tathagata Das, Prashanth Mohan, Dhruba Borthakur, Randy Katz

DeTail Reducing the Tail of Flow Completion Times in Datacenter Networks. David Zats, Tathagata Das, Prashanth Mohan, Dhruba Borthakur, Randy Katz DeTail Reducing the Tail of Flow Completion Times in Datacenter Networks David Zats, Tathagata Das, Prashanth Mohan, Dhruba Borthakur, Randy Katz 1 A Typical Facebook Page Modern pages have many components

More information

Fastpass A Centralized Zero-Queue Datacenter Network

Fastpass A Centralized Zero-Queue Datacenter Network Fastpass A Centralized Zero-Queue Datacenter Network Jonathan Perry Amy Ousterhout Hari Balakrishnan Devavrat Shah Hans Fugal Ideal datacenter network properties No current design satisfies all these properties

More information

Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter. Glenn Judd Morgan Stanley

Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter. Glenn Judd Morgan Stanley Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter Glenn Judd Morgan Stanley 1 Introduction Datacenter computing pervasive Beyond the Internet services domain BigData, Grid Computing,

More information

Delay Controlled Elephant Flow Rerouting in Software Defined Network

Delay Controlled Elephant Flow Rerouting in Software Defined Network 1st International Conference on Advanced Information Technologies (ICAIT), Nov. 1-2, 2017, Yangon, Myanmar Delay Controlled Elephant Flow Rerouting in Software Defined Network Hnin Thiri Zaw, Aung Htein

More information

Towards Fully Synchronized (and Programmable) Datacenter Networks

Towards Fully Synchronized (and Programmable) Datacenter Networks Towards Fully Synchronized (and Programmable) Datacenter Networks Vishal Shrivastav, Cornell University 28 Jul 2016 University of Cambridge Outline DTP: Datacenter Time Protocol [SIGCOMM 16] SHOAL: Synchronized

More information

c-through: Part-time Optics in Data Centers

c-through: Part-time Optics in Data Centers Data Center Network Architecture c-through: Part-time Optics in Data Centers Guohui Wang 1, T. S. Eugene Ng 1, David G. Andersen 2, Michael Kaminsky 3, Konstantina Papagiannaki 3, Michael Kozuch 3, Michael

More information

Jellyfish: Networking Data Centers Randomly

Jellyfish: Networking Data Centers Randomly Jellyfish: Networking Data Centers Randomly Ankit Singla Chi-Yao Hong Lucian Popa Brighten Godfrey DIMACS Workshop on Systems and Networking Advances in Cloud Computing December 8 2011 The real stars...

More information

Cloud 3.0 and Software Defined Networking October 28, Amin Vahdat on behalf of Google Technical Infratructure Google Fellow

Cloud 3.0 and Software Defined Networking October 28, Amin Vahdat on behalf of Google Technical Infratructure Google Fellow Cloud 3.0 and Software Defined Networking October 28, 2016 Amin Vahdat on behalf of Google Technical Infratructure Google Fellow Overview This talk: example of the Google research model Driven by novel

More information

TCP improvements for Data Center Networks

TCP improvements for Data Center Networks TCP improvements for Data Center Networks Tanmoy Das and rishna M. Sivalingam Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 636, India Emails: {tanmoy.justu@gmail.com,

More information

SOFTWARE DEFINED NETWORKS. Jonathan Chu Muhammad Salman Malik

SOFTWARE DEFINED NETWORKS. Jonathan Chu Muhammad Salman Malik SOFTWARE DEFINED NETWORKS Jonathan Chu Muhammad Salman Malik Credits Material Derived from: Rob Sherwood, Saurav Das, Yiannis Yiakoumis AT&T Tech Talks October 2010 (available at:www.openflow.org/wk/images/1/17/openflow_in_spnetworks.ppt)

More information

Optimal Network Flow Allocation. EE 384Y Almir Mutapcic and Primoz Skraba 27/05/2004

Optimal Network Flow Allocation. EE 384Y Almir Mutapcic and Primoz Skraba 27/05/2004 Optimal Network Flow Allocation EE 384Y Almir Mutapcic and Primoz Skraba 27/05/2004 Problem Statement Optimal network flow allocation Find flow allocation which minimizes certain performance criterion

More information

T9: SDN and Flow Management: DevoFlow

T9: SDN and Flow Management: DevoFlow T9: SDN and Flow Management: DevoFlow Critique Lee, Tae Ho 1. Problems due to switch HW may be temporary. HW will evolve over time. Section 3.3 tries to defend against this point, but none of the argument

More information

COCONUT: Seamless Scale-out of Network Elements

COCONUT: Seamless Scale-out of Network Elements COCONUT: Seamless Scale-out of Network Elements Soudeh Ghorbani P. Brighten Godfrey University of Illinois at Urbana-Champaign Simple abstractions Firewall Loadbalancer Router Network operating system

More information

DARD: A Practical Distributed Adaptive Routing Architecture for Datacenter Networks

DARD: A Practical Distributed Adaptive Routing Architecture for Datacenter Networks DARD: A Practical Distributed Adaptive Routing Architecture for Datacenter Networks Xin Wu and Xiaowei Yang Duke-CS-TR-20-0 {xinwu, xwy}@cs.duke.edu Abstract Datacenter networks typically have multiple

More information

Forwarding Architecture

Forwarding Architecture Forwarding Architecture Brighten Godfrey CS 538 February 14 2018 slides 2010-2018 by Brighten Godfrey unless otherwise noted Building a fast router Partridge: 50 Gb/sec router A fast IP router well, fast

More information

CONGA: Distributed Congestion-Aware Load Balancing for Datacenters

CONGA: Distributed Congestion-Aware Load Balancing for Datacenters : Distributed Congestion-Aware Load Balancing for Datacenters Mohammad Alizadeh, Tom Edsall, Sarang Dharmapurikar, Ramanan Vaidyanathan, Kevin Chu, Andy Fingerhut, Vinh The Lam, Francis Matus, Rong Pan,

More information

Machine-Learning-Based Flow scheduling in OTSSenabled

Machine-Learning-Based Flow scheduling in OTSSenabled Machine-Learning-Based Flow scheduling in OTSSenabled Datacenters Speaker: Lin Wang Research Advisor: Biswanath Mukherjee Motivation Traffic demand increasing in datacenter networks Cloud-service, parallel-computing,

More information

DETOUR: A Large-Scale Non-Blocking Optical Data Center Fabric

DETOUR: A Large-Scale Non-Blocking Optical Data Center Fabric DETOUR: A Large-Scale Non-Blocking Optical Data Center Fabric Yi Dai 2 Jinzhen Bao 1,2, Dezun Dong 2, Baokang Zhao 2 1. PLA Academy of Military Science 2. National University of Defense Technology SCA2018,

More information

Episode 5. Scheduling and Traffic Management

Episode 5. Scheduling and Traffic Management Episode 5. Scheduling and Traffic Management Part 3 Baochun Li Department of Electrical and Computer Engineering University of Toronto Outline What is scheduling? Why do we need it? Requirements of a scheduling

More information

Data Center TCP (DCTCP)

Data Center TCP (DCTCP) Data Center Packet Transport Data Center TCP (DCTCP) Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, Murari Sridharan Cloud computing

More information

GRIN: Utilizing the Empty Half of Full Bisection Networks

GRIN: Utilizing the Empty Half of Full Bisection Networks GRIN: Utilizing the Empty Half of Full Bisection Networks Alexandru Agache University Politehnica of Bucharest Costin Raiciu University Politehnica of Bucharest Abstract Various full bisection designs

More information

arxiv: v1 [cs.ni] 11 Oct 2016

arxiv: v1 [cs.ni] 11 Oct 2016 A Flat and Scalable Data Center Network Topology Based on De Bruijn Graphs Frank Dürr University of Stuttgart Institute of Parallel and Distributed Systems (IPVS) Universitätsstraße 38 7569 Stuttgart frank.duerr@ipvs.uni-stuttgart.de

More information

Distributed Multipath Routing for Data Center Networks based on Stochastic Traffic Modeling

Distributed Multipath Routing for Data Center Networks based on Stochastic Traffic Modeling Distributed Multipath Routing for Data Center Networks based on Stochastic Traffic Modeling Omair Fatmi and Deng Pan School of Computing and Information Sciences Florida International University Miami,

More information

B4 and After: Managing Hierarchy, Partitioning, and Asymmetry for Availability and Scale in Google's Software-Defined WAN

B4 and After: Managing Hierarchy, Partitioning, and Asymmetry for Availability and Scale in Google's Software-Defined WAN B4 and After: Managing Hierarchy, Partitioning, and Asymmetry for Availability and Scale in Google's Software-Defined WAN ( Chi ) Chi-yao Hong, Subhasree Mandal, Mohammad Al-Fares, Min Zhu, Richard Alimi,

More information

15-744: Computer Networking. Overview. Queuing Disciplines. TCP & Routers. L-6 TCP & Routers

15-744: Computer Networking. Overview. Queuing Disciplines. TCP & Routers. L-6 TCP & Routers TCP & Routers 15-744: Computer Networking RED XCP Assigned reading [FJ93] Random Early Detection Gateways for Congestion Avoidance [KHR02] Congestion Control for High Bandwidth-Delay Product Networks L-6

More information

Dynamic Distributed Flow Scheduling with Load Balancing for Data Center Networks

Dynamic Distributed Flow Scheduling with Load Balancing for Data Center Networks Available online at www.sciencedirect.com Procedia Computer Science 19 (2013 ) 124 130 The 4th International Conference on Ambient Systems, Networks and Technologies. (ANT 2013) Dynamic Distributed Flow

More information

On the Efficacy of Fine-Grained Traffic Splitting Protocols in Data Center Networks

On the Efficacy of Fine-Grained Traffic Splitting Protocols in Data Center Networks Purdue University Purdue e-pubs Department of Computer Science Technical Reports Department of Computer Science 2011 On the Efficacy of Fine-Grained Traffic Splitting Protocols in Data Center Networks

More information

Transport Layer (Congestion Control)

Transport Layer (Congestion Control) Transport Layer (Congestion Control) Where we are in the Course Moving on up to the Transport Layer! Application Transport Network Link Physical CSE 461 University of Washington 2 TCP to date: We can set

More information

Improving Datacenter Network Performance via Intelligent Network Edge. Keqiang He

Improving Datacenter Network Performance via Intelligent Network Edge. Keqiang He Improving Datacenter Network Performance via Intelligent Network Edge By Keqiang He A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer

More information

A Scalable, Commodity Data Center Network Architecture

A Scalable, Commodity Data Center Network Architecture A Scalable, Commodity Data Center Network Architecture B Y M O H A M M A D A L - F A R E S A L E X A N D E R L O U K I S S A S A M I N V A H D A T P R E S E N T E D B Y N A N X I C H E N M A Y. 5, 2 0

More information

STMS: Improving MPTCP Throughput Under Heterogenous Networks

STMS: Improving MPTCP Throughput Under Heterogenous Networks STMS: Improving MPTCP Throughput Under Heterogenous Networks Hang Shi 1, Yong Cui 1, Xin Wang 2, Yuming Hu 1, Minglong Dai 1, anzhao Wang 3, Kai Zheng 3 1 Tsinghua University, 2 Stony Brook University,

More information

THE increasing reliance on streaming mobile video from

THE increasing reliance on streaming mobile video from 1 OFLoad: An OpenFlow-based Dynamic Load Balancing Strategy for Datacenter Networks Ramona Trestian, Member, IEEE, Kostas Katrinis, and Gabriel-Miro Muntean, Senior Member, IEEE Abstract The latest tremendous

More information

CONGA: Distributed Congestion-Aware Load Balancing for Datacenters

CONGA: Distributed Congestion-Aware Load Balancing for Datacenters : Distributed Congestion-Aware Load Balancing for Datacenters Mohammad Alizadeh, Tom Edsall, Sarang Dharmapurikar, Ramanan Vaidyanathan, Kevin Chu, Andy Fingerhut, Vinh The Lam (Google), Francis Matus,

More information

Congestion Control In the Network

Congestion Control In the Network Congestion Control In the Network Brighten Godfrey cs598pbg September 9 2010 Slides courtesy Ion Stoica with adaptation by Brighten Today Fair queueing XCP Announcements Problem: no isolation between flows

More information

A Measurement Study on Multi-path TCP with Multiple Cellular Carriers on High Speed Rails

A Measurement Study on Multi-path TCP with Multiple Cellular Carriers on High Speed Rails ACM SIGCOMM 18, Budapest, Hungary A Measurement Study on Multi-path TCP with Multiple Cellular Carriers on High Speed Rails Li Li *, Ke Xu *, Tong Li, Kai Zheng, Chunyi Peng Γ, Dan Wang, Xiangxiang Wang

More information

POLYMORPHIC ON-CHIP NETWORKS

POLYMORPHIC ON-CHIP NETWORKS POLYMORPHIC ON-CHIP NETWORKS Martha Mercaldi Kim, John D. Davis*, Mark Oskin, Todd Austin** University of Washington *Microsoft Research, Silicon Valley ** University of Michigan On-Chip Network Selection

More information

CS 268: Computer Networking

CS 268: Computer Networking CS 268: Computer Networking L-6 Router Congestion Control TCP & Routers RED XCP Assigned reading [FJ93] Random Early Detection Gateways for Congestion Avoidance [KHR02] Congestion Control for High Bandwidth-Delay

More information

Multipath Transport, Resource Pooling, and implications for Routing

Multipath Transport, Resource Pooling, and implications for Routing Multipath Transport, Resource Pooling, and implications for Routing Mark Handley, UCL and XORP, Inc Also: Damon Wischik, UCL Marcelo Bagnulo Braun, UC3M The members of Trilogy project: www.trilogy-project.org

More information

Jupiter Rising: A Decade of Clos Topologies and Centralized Control in

Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network Authors: Arjun Singh, Joon Ong, Amit Agarwal, Glen Anderson, Ashby Armistead, Roy Bannon, Seb Boving,

More information

THE increasing reliance on streaming mobile video from

THE increasing reliance on streaming mobile video from 1 OFLoad: An OpenFlow-based Dynamic Load Balancing Strategy for Datacenter Networks Ramona Trestian, Member, IEEE, Kostas Katrinis, and Gabriel-Miro Muntean, Senior Member, IEEE Abstract The latest tremendous

More information

Pre-eminent Multi-path routing with maximum resource utilization in traffic engineering approach

Pre-eminent Multi-path routing with maximum resource utilization in traffic engineering approach Pre-eminent Multi-path routing with maximum resource utilization in traffic engineering approach S.Venkata krishna Kumar #1, T.Sujithra. *2 # Associate Professor, *Research Scholar Department of Computer

More information

Titan: Fair Packet Scheduling for Commodity Multiqueue NICs. Brent Stephens, Arjun Singhvi, Aditya Akella, and Mike Swift July 13 th, 2017

Titan: Fair Packet Scheduling for Commodity Multiqueue NICs. Brent Stephens, Arjun Singhvi, Aditya Akella, and Mike Swift July 13 th, 2017 Titan: Fair Packet Scheduling for Commodity Multiqueue NICs Brent Stephens, Arjun Singhvi, Aditya Akella, and Mike Swift July 13 th, 2017 Ethernet line-rates are increasing! 2 Servers need: To drive increasing

More information

Beyond fat-trees without antennae, mirrors, and disco-balls. Simon Kassing, Asaf Valadarsky, Gal Shahaf, Michael Schapira, Ankit Singla

Beyond fat-trees without antennae, mirrors, and disco-balls. Simon Kassing, Asaf Valadarsky, Gal Shahaf, Michael Schapira, Ankit Singla Beyond fat-trees without antennae, mirrors, and disco-balls Simon Kassing, Asaf Valadarsky, Gal Shahaf, Michael Schapira, Ankit Singla Skewed traffic within data centers 2 [Google] Skewed traffic within

More information

Leveraging Multi-path Diversity for Transport Loss Recovery in Data Centers

Leveraging Multi-path Diversity for Transport Loss Recovery in Data Centers Fast and Cautious: Leveraging Multi-path Diversity for Transport Loss Recovery in Data Centers Guo Chen Yuanwei Lu, Yuan Meng, Bojie Li, Kun Tan, Dan Pei, Peng Cheng, Layong (Larry) Luo, Yongqiang Xiong,

More information

Scalability and Resilience in SDN: an overview. Nicola Rustignoli

Scalability and Resilience in SDN: an overview. Nicola Rustignoli Scalability and Resilience in SDN: an overview Nicola Rustignoli rnicola@student.ethz.ch Nicola Rustignoli 11.03.2016 1 Software-Defined networking CONTROL LAYER SWITCH FORWARDING LAYER Graphs source:[1]

More information

Congestion Control in Datacenters. Ahmed Saeed

Congestion Control in Datacenters. Ahmed Saeed Congestion Control in Datacenters Ahmed Saeed What is a Datacenter? Tens of thousands of machines in the same building (or adjacent buildings) Hundreds of switches connecting all machines What is a Datacenter?

More information

XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet

XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet Vijay Shankar Rajanna, Smit Shah, Anand Jahagirdar and Kartik Gopalan Computer Science, State University of New York at Binghamton

More information

Advanced Computer Networks Exercise Session 7. Qin Yin Spring Semester 2013

Advanced Computer Networks Exercise Session 7. Qin Yin Spring Semester 2013 Advanced Computer Networks 263-3501-00 Exercise Session 7 Qin Yin Spring Semester 2013 1 LAYER 7 SWITCHING 2 Challenge: accessing services Datacenters are designed to be scalable Datacenters are replicated

More information

Baidu s Best Practice with Low Latency Networks

Baidu s Best Practice with Low Latency Networks Baidu s Best Practice with Low Latency Networks Feng Gao IEEE 802 IC NEND Orlando, FL November 2017 Presented by Huawei Low Latency Network Solutions 01 1. Background Introduction 2. Network Latency Analysis

More information

RDMA over Commodity Ethernet at Scale

RDMA over Commodity Ethernet at Scale RDMA over Commodity Ethernet at Scale Chuanxiong Guo, Haitao Wu, Zhong Deng, Gaurav Soni, Jianxi Ye, Jitendra Padhye, Marina Lipshteyn ACM SIGCOMM 2016 August 24 2016 Outline RDMA/RoCEv2 background DSCP-based

More information

Requirement Discussion of Flow-Based Flow Control(FFC)

Requirement Discussion of Flow-Based Flow Control(FFC) Requirement Discussion of Flow-Based Flow Control(FFC) Nongda Hu Yolanda Yu hunongda@huawei.com yolanda.yu@huawei.com IEEE 802.1 DCB, Stuttgart, May 2017 www.huawei.com new-dcb-yolanda-ffc-proposal-0517-v01

More information

DATA CENTER FABRIC COOKBOOK

DATA CENTER FABRIC COOKBOOK Do It Yourself! DATA CENTER FABRIC COOKBOOK How to prepare something new from well known ingredients Emil Gągała WHAT DOES AN IDEAL FABRIC LOOK LIKE? 2 Copyright 2011 Juniper Networks, Inc. www.juniper.net

More information

Scaling Hardware Accelerated Network Monitoring to Concurrent and Dynamic Queries with *Flow

Scaling Hardware Accelerated Network Monitoring to Concurrent and Dynamic Queries with *Flow Scaling Hardware Accelerated Network Monitoring to Concurrent and Dynamic Queries with *Flow John Sonchack, Oliver Michel, Adam J. Aviv, Eric Keller, Jonathan M. Smith Measuring High Speed Networks 00

More information

Lecture 7: Data Center Networks

Lecture 7: Data Center Networks Lecture 7: Data Center Networks CSE 222A: Computer Communication Networks Alex C. Snoeren Thanks: Nick Feamster Lecture 7 Overview Project discussion Data Centers overview Fat Tree paper discussion CSE

More information

Low Latency via Redundancy

Low Latency via Redundancy Low Latency via Redundancy Ashish Vulimiri, Philip Brighten Godfrey, Radhika Mittal, Justine Sherry, Sylvia Ratnasamy, Scott Shenker Presenter: Meng Wang 2 Low Latency Is Important Injecting just 400 milliseconds

More information

Outlines. Introduction (Cont d) Introduction. Introduction Network Evolution External Connectivity Software Control Experience Conclusion & Discussion

Outlines. Introduction (Cont d) Introduction. Introduction Network Evolution External Connectivity Software Control Experience Conclusion & Discussion Outlines Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network Singh, A. et al. Proc. of ACM SIGCOMM '15, 45(4):183-197, Oct. 2015 Introduction Network Evolution

More information

Duet: Cloud Scale Load Balancing with Hardware and Software

Duet: Cloud Scale Load Balancing with Hardware and Software Duet: Cloud Scale Load Balancing with Hardware and Software Rohan Gandhi Hongqiang Harry Liu Y. Charlie Hu Guohan Lu Jitendra Padhye Lihua Yuan Ming Zhang Microsoft, Purdue University, Yale University

More information

F10: A Fault- Tolerant Engineered Network

F10: A Fault- Tolerant Engineered Network F10: A Fault- Tolerant Engineered Network Vincent Liu, Daniel Halperin, Arvind Krishnamurthy, Thomas Anderson University of Washington Today s Data Centers *From Al- Fares et al. SIGCOMM 08 Today s data

More information

Hardware Evolution in Data Centers

Hardware Evolution in Data Centers Hardware Evolution in Data Centers 2004 2008 2011 2000 2013 2014 Trend towards customization Increase work done per dollar (CapEx + OpEx) Paolo Costa Rethinking the Network Stack for Rack-scale Computers

More information

Optimizing Performance: Intel Network Adapters User Guide

Optimizing Performance: Intel Network Adapters User Guide Optimizing Performance: Intel Network Adapters User Guide Network Optimization Types When optimizing network adapter parameters (NIC), the user typically considers one of the following three conditions

More information

Packet Scheduling in Data Centers. Lecture 17, Computer Networks (198:552)

Packet Scheduling in Data Centers. Lecture 17, Computer Networks (198:552) Packet Scheduling in Data Centers Lecture 17, Computer Networks (198:552) Datacenter transport Goal: Complete flows quickly / meet deadlines Short flows (e.g., query, coordination) Large flows (e.g., data

More information

Multipath Routing Protocol for Congestion Control in Mobile Ad-hoc Network

Multipath Routing Protocol for Congestion Control in Mobile Ad-hoc Network 1 Multipath Routing Protocol for Congestion Control in Mobile Ad-hoc Network Nilima Walde, Assistant Professor, Department of Information Technology, Army Institute of Technology, Pune, India Dhananjay

More information

15-744: Computer Networking. Data Center Networking II

15-744: Computer Networking. Data Center Networking II 15-744: Computer Networking Data Center Networking II Overview Data Center Topology Scheduling Data Center Packet Scheduling 2 Current solutions for increasing data center network bandwidth FatTree BCube

More information

Spotlight: Scalable Transport Layer Load Balancing for Data Center Networks

Spotlight: Scalable Transport Layer Load Balancing for Data Center Networks 1 Spotlight: Scalable Transport Layer Load Balancing for Data Center Networks Ashkan Aghdai, Cing-Yu Chu, Yang Xu, David H. Dai, Jun Xu, H. Jonathan Chao New York University arxiv:1806.08455v1 [cs.ni]

More information

Cutting the Cord: A Robust Wireless Facilities Network for Data Centers

Cutting the Cord: A Robust Wireless Facilities Network for Data Centers Cutting the Cord: A Robust Wireless Facilities Network for Data Centers Yibo Zhu, Xia Zhou, Zengbin Zhang, Lin Zhou, Amin Vahdat, Ben Y. Zhao and Haitao Zheng U.C. Santa Barbara, Dartmouth College, U.C.

More information

Introduction. Network Architecture Requirements of Data Centers in the Cloud Computing Era

Introduction. Network Architecture Requirements of Data Centers in the Cloud Computing Era Massimiliano Sbaraglia Network Engineer Introduction In the cloud computing era, distributed architecture is used to handle operations of mass data, such as the storage, mining, querying, and searching

More information

WCMP: Weighted Cost Multipathing for Improved Fairness in Data Centers

WCMP: Weighted Cost Multipathing for Improved Fairness in Data Centers WCMP: Weighted Cost Multipathing for Improved Fairness in Data Centers Junlan Zhou, Malveeka Tewari, Min Zhu, Abdul Kabbani, Leon Poutievski, Arjun Singh, Amin Vahdat University of California, San Diego

More information

On low-latency-capable topologies, and their impact on the design of intra-domain routing

On low-latency-capable topologies, and their impact on the design of intra-domain routing On low-latency-capable topologies, and their impact on the design of intra-domain routing Nikola Gvozdiev, Stefano Vissicchio, Brad Karp, Mark Handley University College London (UCL) We want low latency!

More information