BSDCan 2015 June 13 th Extensions to FreeBSD Datacenter TCP for Incremental Deployment Support. Midori Kato

Similar documents
Extensions to FreeBSD Datacenter TCP for Incremental Deployment Support

Master of Thesis Academic Year Improving transmission performance with one-sided datacenter TCP

Data Center TCP(DCTCP)

Cloud e Datacenter Networking

TCP modifications for Congestion Exposure

Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter. Glenn Judd Morgan Stanley

Data Center TCP (DCTCP)

Data Center TCP (DCTCP)

Computer Networking

Multimedia-unfriendly TCP Congestion Control and Home Gateway Queue Management

Transport Protocols for Data Center Communication. Evisa Tsolakou Supervisor: Prof. Jörg Ott Advisor: Lect. Pasi Sarolahti

Internet Engineering Task Force. Intended status: Experimental University Politehnica of Bucharest Expires: January 21, 2016 July 20, 2015

Congestion Control in Datacenters. Ahmed Saeed

Internet Engineering Task Force (IETF) Request for Comments: Netflix G. Fairhurst University of Aberdeen December 2018

TCP SIAD: Congestion Control supporting Low Latency and High Speed

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2015

Lecture 15: Datacenter TCP"

Preprint. Until published, please cite as:

Sebastian Zander, Grenville Armitage. Centre for Advanced Internet Architectures (CAIA) Swinburne University of Technology

DIBS: Just-in-time congestion mitigation for Data Centers

Performance Evaluation of NEMO Basic Support Implementations

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2014

Making TCP more Robust against Packet Reordering

A Method Based on Data Fragmentation to Increase the Performance of ICTCP During Incast Congestion in Networks

Activity-Based Congestion Management for Fair Bandwidth Sharing in Trusted Packet Networks

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2014

TCP Incast problem Existing proposals

State of ECN and improving congestion feedback with AccECN in Linux

Investigating the Use of Synchronized Clocks in TCP Congestion Control

Multi-class Applications for Parallel Usage of a Guaranteed Rate and a Scavenger Service

Speeding up Linux TCP/IP with a Fast Packet I/O Framework

TCP so far Computer Networking Outline. How Was TCP Able to Evolve

CS519: Computer Networks. Lecture 5, Part 4: Mar 29, 2004 Transport: TCP congestion control

A Proposal to add Explicit Congestion Notification (ECN) to IPv6 and to TCP

Request for Comments: S. Floyd ICSI K. Ramakrishnan AT&T Labs Research June 2009

Router s Queue Management

Congestion Control In The Internet Part 2: How it is implemented in TCP. JY Le Boudec 2015

Can Congestion-controlled Interactive Multimedia Traffic Co-exist with TCP? Colin Perkins

The Network Layer and Routers

Internet Engineering Task Force (IETF) Category: Informational. March The Benefits of Using Explicit Congestion Notification (ECN)

Lab Test Report DR100401D. Cisco Nexus 5010 and Arista 7124S

Designing a Resource Pooling Transport Protocol

Advanced Computer Networks. Datacenter TCP

A NEW CONGESTION MANAGEMENT MECHANISM FOR NEXT GENERATION ROUTERS

ADVANCED TOPICS FOR CONGESTION CONTROL

Computer Networks Spring 2017 Homework 2 Due by 3/2/2017, 10:30am

ECEN Final Exam Fall Instructor: Srinivas Shakkottai

Differential Congestion Notification: Taming the Elephants

Master Course Computer Networks IN2097

RPT: Re-architecting Loss Protection for Content-Aware Networks

Titan: Fair Packet Scheduling for Commodity Multiqueue NICs. Brent Stephens, Arjun Singhvi, Aditya Akella, and Mike Swift July 13 th, 2017

CS268: Beyond TCP Congestion Control

Data Center TCP (DCTCP)

Congestion Control. Daniel Zappala. CS 460 Computer Networking Brigham Young University

Transmission Control Protocol. ITS 413 Internet Technologies and Applications

image 3.8 KB Figure 1.6: Example Web Page

Experimental Study of TCP Congestion Control Algorithms

More Accurate ECN Feedback in TCP draft-ietf-tcpm-accurate-ecn-04

TCP improvements for Data Center Networks

TCP Congestion Control

Rapid PHY Selection (RPS): A Performance Evaluation of Control Policies

Got Loss? Get zovn! Daniel Crisan, Robert Birke, Gilles Cressier, Cyriel Minkenberg, and Mitch Gusat. ACM SIGCOMM 2013, August, Hong Kong, China

15-744: Computer Networking. Overview. Queuing Disciplines. TCP & Routers. L-6 TCP & Routers

DeTail Reducing the Tail of Flow Completion Times in Datacenter Networks. David Zats, Tathagata Das, Prashanth Mohan, Dhruba Borthakur, Randy Katz

Outline Computer Networking. TCP slow start. TCP modeling. TCP details AIMD. Congestion Avoidance. Lecture 18 TCP Performance Peter Steenkiste

TCP Maintenance Working Group. Intended status: Experimental Expires: January 7, 2017 July 6, 2016

CS 268: Computer Networking

Impact of bandwidth-delay product and non-responsive flows on the performance of queue management schemes

Managing Caching Performance and Differentiated Services

Lecture 21: Congestion Control" CSE 123: Computer Networks Alex C. Snoeren

P802.1Qcz Congestion Isolation

TVA: A DoS-limiting Network Architecture L

Fast Retransmit. Problem: coarsegrain. timeouts lead to idle periods Fast retransmit: use duplicate ACKs to trigger retransmission

Congestion Avoidance

Paperspace. Architecture Overview. 20 Jay St. Suite 312 Brooklyn, NY Technical Whitepaper

XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet

CS 356: Computer Network Architectures Lecture 19: Congestion Avoidance Chap. 6.4 and related papers. Xiaowei Yang

The Transport Layer Congestion control in TCP

Sizing Router Buffers

TCP modifications for Congestion Exposure

Announcements Computer Networking. Outline. Transport Protocols. Transport introduction. Error recovery & flow control. Mid-semester grades

Congestion Control for High Bandwidth-delay Product Networks. Dina Katabi, Mark Handley, Charlie Rohrs

RoGUE: RDMA over Generic Unconverged Ethernet

Information-Agnostic Flow Scheduling for Commodity Data Centers

EXPERIENCES EVALUATING DCTCP. Lawrence Brakmo, Boris Burkov, Greg Leclercq and Murat Mugan Facebook

TCP Modifications for Congestion Exposure

MAD 12 Monitoring the Dynamics of Network Traffic by Recursive Multi-dimensional Aggregation. Midori Kato, Kenjiro Cho, Michio Honda, Hideyuki Tokuda

Overview. TCP congestion control Computer Networking. TCP modern loss recovery. TCP modeling. TCP Congestion Control AIMD

Tuning RED for Web Traffic

Design and Evaluation of Schemes for More Accurate ECN Feedback

Cross-Layer Flow and Congestion Control for Datacenter Networks

Experimental Experiences with Emulating Multihop Path using Dummynet

Network Management & Monitoring

Congestion Control in TCP

Handles all kinds of traffic on a single network with one class

Studying Congestion Control with Explicit Router Feedback Using Hardware-based Network Emulator

TCP Congestion Control : Computer Networking. Introduction to TCP. Key Things You Should Know Already. Congestion Control RED

Recap. TCP connection setup/teardown Sliding window, flow control Retransmission timeouts Fairness, max-min fairness AIMD achieves max-min fairness

Packet Scheduling in Data Centers. Lecture 17, Computer Networks (198:552)

Performance Comparison of TFRC and TCP

Transcription:

BSDCan 2015 June 13 th Extensions to FreeBSD Datacenter TCP for Incremental Deployment Support Midori Kato <katoon@sfc.wide.ad.jp>

DCTCP has been available since FreeBSD 11.0!! 2

FreeBSD DCTCP highlight What is the average transmission time of 10 800KB transfers while 2 long flows are running in the same link? TCP (NewReno): 252.7 msec DCTCP: 82.5 msec One-sided DCTCP: 89.4 msec Using BSD DCTCP achieves faster data transfer than using TCP not only in the fully deployed network but also in the partially deployed network. 3

Outline 1. DCTCP introduction What is DCTCP? What benefit can you receive by using DCTCP? Necessary network equipment for DCTCP 2. FreeBSD DCTCP feature 3. How to configure DCTCP on FreeBSD 4

DCTCP (Datacenter TCP) [*] A proposed TCP variant that solves performance impairments in the data center network Traditional TCP problem Concurrent interactive short flows show long data transmission time in a traffic mix of short flows and long flows Short flows: database query and response Long flows: bulk transfer for server migration DCTCP contribution 1. Maintain low and predictable latency for short flows 2. Absorb traffic bursts 3. Maintain high throughput for long flows [*] M. Alizadeh et al. Data Center TCP (DCTCP). In Proc. ACM SIGCOMM. 2010, pp. 63 74. 5

Data Center Network Data Center Network The Internet a domain the Internet Servers.... a domain a domain Where is the destination host? Data center network Single domain The Internet Multiple domains Sensitive to what? Data transmission time Throughput

Data Center Network Data Center Network The Internet a domain Requirements of running services.... Servers Single domain the Internet Feature of data center network Easy to optimize the network equipment and operation Short delay and no data re-transmission Single domain Where is the destination host? Data center network Single domain The Internet Multiple domains Sensitive to what? Data transmission time Throughput

DCTCP approach Leverage ECN (Explicit Congestion Notification) to the network 8

ECN (Explicit Congestion Notification) ECN is a traditional active queue management scheme Provide auxiliary information for TCP congestion control ECN motivation make hosts transfer data without packet losses Network equipments supporting ECN are needed L3 switches/routers and servers Check the configuration of network equipments Many of servers unset ECN as default 9

ECN mechanism Traditional queuing management Window size calculation with a packet loss W <- W / 2 Queuing management with ECN sender Lost packet ECN Echo receiver Window size calculation with ECN reception W <- W / 2 sender threshold receiver L3 switch Set ECN when the queue length exceeds the threshold Sending hosts ECN mark Halve the window size by receiving a ECN packet Window size: the total amount of data in byte sent by the host 10

DCTCP mechanism Use ECN for the precise estimation of the potential congestion L3 switch Set ECN when the queue length exceeds the threshold Sending hosts Calculate the window size based on the fraction of ECN in a previous window size Window size calculation with ECN reception F <- frac. Of ECN pkts in W W <- W * F / 2 threshold 11

Testbed Hosts: four x86 machines FreeBSD-CURRENT 10.0 Two dual-core Intel Xeon E5240 CPUs @ 3 GHz 16GB RAM Four ports Intel PRO/1000 1G Ethernet card L3 Switch: Cisco Nexus 3548 Threshold of queue length: 10packets Traffic generator: flowgrind [*] [*] Alexander Zimmermann, Arnd Hannemann, and Tim Kosse. Flowgrind - a new performance measurement tool. In GLOBECOM, pages 1 6. IEEE, December 2010 12

Experiment Incast Objective: evaluate the TCP/DCTCP performance of burst transfers Scenarios: run 10 flows simultaneously Transfer size: 10 800KB Metric: avg. of data transmission time for short flows Bulk transfer Objective: Evaluate the TCP/DCTCP performance by running a mix of short and long flows Scenarios: starts 10 short flows 500 milliseconds after 2 long flows run Transfer size of short flows: 10 800KB Transfer size of long flows: 40MB Metric: avg. of data transmission time for short and long flows 13

Result of Incast experiments 60.4 55.3 DCTCP flows reduce 5 milliseconds of data transmission time at most 14

Result of bulk transfer experiments 252.7 33.7 82.5 1.6 With background flows running, DCTCP short flows reduces 10KB transfer time by 32.1 milliseconds 800KB transfer time by 170.2 milliseconds 15

Summary of DCTCP Q. What is DCTCP? A. DCTCP is a proposed TCP variant using ECN for data center network Q. What benefit can you receive by using DCTCP? A. You can reduce 170 milliseconds of data transmission time for 800KB transfer in a mix traffic of short and long flows Q. What are necessary network equipments for DCTCP? A. L3 switch, routers and servers supporting ECN 16

BSD DCTCP = DCTCP + α α means 1. Incremental deployment support 2. Initial window size calculation for performance tuning 17

Why is incremental deployment support important? [1/2] ECN Echo DCTCP server DCTCP server threshold ECN mark DCTCP server ECN TCP server threshold Q. Do you recognize the difference between two example? A. YES for us but NO for servers 18

Why is incremental deployment support important? [2/2] DCTCP server (e.g., application server) threshold ECN TCP server (e.g., appliance servers) Using DCTCP on a one-sided server is a possible situation in the network where appliance servers are running 19

BSD DCTCP for incremental deployment One-sided DCTCP Problem (worst case) Trans. Time of one-sided DCTCP >> trans. Time of two-sided DCTCP Solution Support compatibility with ECN See our paper for detail Result One-sided DCTCP remarks 90% of similar performance to two-sided DCTCP Incast: +0.09 milliseconds Bulk transfer: +6.9msec milliseconds 20

BSD DCTCP for initial window size calculation Selectable parameter considering the trade-off between latency and throughput slowstart = 0 slowstart = 1 (recommended[*]) Benefit Higher throughput Short latency during data transmission Friendly to competitive running flows Flaws Unfriendly to competitive running flows Longer data transmission time [*] Stephen Bensley, Lars Eggert, Dave Thaler Microsoft s Datacenter TCP (DCTCP): TCP Congestion Control for Datacenters IETF, draft-bensley-tcpm-dctcp-03.txt 21

How to configure DCTCP on FreeBSD 1. Load DCTCP module # kldload cc_dctcp 2. Check available congestion control algorithms # sysctl net.inet.tcp.cc.available newreno, dctcp 3. Enable ECN # sysctl -w net.inet.tcp.ecn.enable=1 4. Enable DCTCP * Support for incremental DCTCP deployment is included # sysctl -w net.inet.tcp.cc.algorithm=dctcp 5. Tune DCTCP parameter (optional) For faster data transfer (default) # sysctl -w net.inet.tcp.cc.dctcp.slowstart=0 For short latency # sysctl -w net.inet.tcp.cc.dctcp.slowstart=1 22

Summary of BSD DCTCP BSD DCTCP = DCTCP + α BSD DCTCP minimizes the performance penalty even in the partial use BSD DCTCP has an optional configuration considering trade-off between latency and throughput 23

Acknowledgement Many thanks to BSD developers Hiren Panchasara Lawrence Stewart Grenville Armitage KEIO University Hideyuki Tokuda Kenjiro Cho Rod Van Meter NetApp Germany Lars Eggert Alexander Zimmermann Richard Scheffenegger Cisco Lucien Avramov IETF Mirja Kuhlewind Michio Honda 24