TCP Tuning for the Web

TCP Tuning for the Web Jason Cook - @macros - jason@fastly.com

Me Co-founder and Operations at Fastly Former Operations Engineer at Wikia Lots of Sysadmin and Linux consulting

The Goal Make the best use of our limited resources to deliver the best user experience

Focus

Linux I like it, I use it. It won't hurt my feelings if you don't. Examples will be aimed primarily at Linux

Small Requests Optimizing towards small objects like HTML and JS/CSS

Just Enough TCP Not a deep dive into TCP

The accept() loop

Entry point from the kernel to your application: the client sends a SYN, the kernel replies with a SYN/ACK, the client sends the final ACK, and the completed connection is handed to the server when it calls accept()

Backlog The number of connections allowed to sit in a half-open (SYN) state. The kernel will drop new SYNs when this limit is hit. Clients wait 3s before trying again, then 9s after a second failure

Backlog Tuning (kernel side) Controlled by net.ipv4.tcp_max_syn_backlog and net.core.somaxconn. The default of 1024 is for systems with more than 128MB of memory, at 64 bytes per entry. Undocumented max of 65535
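A minimal sketch of raising both limits at runtime with sysctl; the 4096 value is illustrative, not a recommendation from the talk:

    # limit on half-open (SYN_RECV) connections
    sysctl -w net.ipv4.tcp_max_syn_backlog=4096
    # cap on the accept-queue backlog an application can request via listen()
    sysctl -w net.core.somaxconn=4096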

Backlog Tuning (app side) Set when you call listen(). nginx, redis, and apache default to 511; mysql defaults to 50
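One way to check what a listener actually got, a sketch: on Linux, ss reports the accept-queue limit for listening sockets in the Send-Q column.

    # Recv-Q = current accept queue depth,
    # Send-Q = backlog the application asked for (capped by somaxconn)
    ss -lnt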

DDoS Handling

The SYN Flood Resource exhaustion attack, cheaper for the attacker than the target. The client sends SYNs with bogus return addresses, so the handshake never completes and each SYN occupies a slot in the backlog queue until it times out

The SYN Cookie When enabled, kicks in when the backlog queue is full. Sends a slightly more expensive but carefully crafted SYN/ACK, then drops the SYN from the queue. If the client responds to the SYN/ACK, the kernel can rebuild the original SYN from the cookie and proceed. Disables TCP options such as window scaling while active, but that is better than dropping entirely

Dealing with a SYN Flood Default to syncookies enabled and alert when they trigger. Inspect the flood with tcpdump/wireshark; frequently attacks have a detectable signature, such as every SYN carrying the same initial window size. iptables is very flexible for matching these signatures, but can be expensive. If hardware filters are available, use iptables to identify and the hardware to block
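A hedged sketch of those pieces: enabling syncookies, watching the counters the kernel bumps when they fire, and an iptables u32 match on a fixed initial window size. The 0xFAF0 window value is purely illustrative, and the byte offsets assume a plain 20-byte IP header.

    # keep syncookies available as the fallback when the backlog fills
    sysctl -w net.ipv4.tcp_syncookies=1
    # counters that climb while cookies are being sent and validated
    nstat -az TcpExtSyncookiesSent TcpExtSyncookiesRecv TcpExtSyncookiesFailed
    # drop SYNs whose TCP window field matches the attack signature
    # (with a 20-byte IP header, bytes 32-35 hold the TCP data offset,
    # flags and window, so masking with 0xFFFF isolates the window field)
    iptables -A INPUT -p tcp --syn -m u32 --u32 "32&0xFFFF=0xFAF0" -j DROP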

The Joys of Hardware

Queues Modern network hardware is multi-queue. By default the driver assigns a queue per CPU core, which should give even balancing of incoming traffic, but irqbalance can mess that up. Intel ships a script with their drivers to aid static IRQ assignment so you can avoid irqbalance
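A rough sketch of inspecting and pinning queue interrupts by hand; eth0 and IRQ 42 are placeholders, and the script Intel ships does the same job more carefully.

    # see how many RX/TX queues the NIC exposes
    ls /sys/class/net/eth0/queues/
    # find the IRQ number for each queue
    grep eth0 /proc/interrupts
    # pin one queue's IRQ to CPU 0 (CPU bitmask), using IRQ 42 for illustration
    echo 1 > /proc/irq/42/smp_affinity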

Packet Filters Intel and others have packet filters in their NICs. Small: only 128 entries in the Intel 82599. Much more limited matching than iptables: src, dst, type, port, vlan

Hardware Flow Director Same mechanism as the filters, but adds a destination queue mapping. Set affinity to put both the queue and the app on the same core. Good for things like SSH and BGP daemons, so you maintain access in the face of an attack on other services
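A sketch of steering one service to a known queue with ethtool n-tuple filters; the interface name, queue number and use of port 22 are assumptions for illustration, and support varies by NIC and driver.

    # enable n-tuple filtering on the NIC
    ethtool -K eth0 ntuple on
    # send inbound SSH to RX queue 2, whose IRQ you pin to the same core as sshd
    ethtool -N eth0 flow-type tcp4 dst-port 22 action 2
    # show the flow classification rules currently programmed
    ethtool -n eth0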

TCP Offload Engines

Full Offload Not so good on the public net Limited buffer resources on card Black box from a security perspective Can break features like filtering and QoS

Partial Offload Better, but each feature comes with its own caveats

Large Receive Offload Collapses packets into a larger buffer before handing them to the OS. Great for large-volume ingress, which is not HTTP. Doesn't work with IPv6

Generic Receive Offload Similar to LRO, but more careful: it will only merge safe packets into a buffer, and will flush at each timestamp tick. Usually a win, but you should test

TCP Segmentation Offload The OS fills a 64KB buffer and produces a single header template; the NIC splits the buffer into segments and checksums them before sending. Can save lots of overhead, but not much of a win in small request/response cycles

TX/RX Checksumming For small packets there is almost no win here
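ethtool is the usual way to see and toggle the offloads covered in the last few slides; a sketch, with eth0 as a placeholder and the toggles shown only as illustrations of the advice above.

    # list current offload settings: lro, gro, tso, tx/rx checksumming, etc.
    ethtool -k eth0
    # example toggles: LRO off for an HTTP/IPv6 workload, GRO on after testing
    ethtool -K eth0 lro off gro on
    # TSO and checksum offloads are cheap to leave on even if the win is small
    ethtool -K eth0 tso on tx on rx on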

Bonding The Linux bonding driver uses a single queue, a large bottleneck at high packet rates. The teaming driver should be better, but its userspace tools only worked on modern Fedora, so we gave up. Myricom hardware can do bonding natively

TCP Slow Start

Why Slow Start Early TCP implementations allowed the sender to immediately send as much data as the client window allowed. In 1986 the internet suffered its first congestion collapse, a 1000x reduction in effective throughput. In 1988 Van Jacobson proposed Slow Start

Slow Start Goal is to avoid putting more data in flight than there is bandwidth available. The client advertises a receive window: how much data it can buffer. The server sends an initial burst based on the server's initial congestion window. The congestion window grows for each received ACK, roughly doubling every round trip, and increases until a packet drop or the slow start threshold is reached
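As a rough worked example, assuming the initial congestion window of 10 from the next slide, a 1460-byte MSS, and no losses or ssthresh limit:

    round trip 1: 10 segments in flight, about 14.6 KB
    round trip 2: 20 segments, about 29 KB
    round trip 3: 40 segments, about 58 KB

So a typical small HTML or JS response is delivered within the first few round trips after the handshake.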

Tuning Slow Start Increase your initial congestion window. Kernels 2.6.39+ default to 10; since 2.6.19 it can be set as a route attribute:

    ip route change default via 172.16.0.1 dev eth0 proto static initcwnd 10
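A quick check that the attribute stuck; the gateway and device are just the placeholders from the slide.

    # the default route should now list "initcwnd 10"
    ip route show default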

Proportional Rate Reduction Added in Linux 3.2. Prior to PRR, a loss would halve the congestion window, potentially below the slow start threshold; PRR paces retransmits to smooth out recovery. Makes disabling net.ipv4.tcp_no_metrics_save safer
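A minimal sketch of the related knobs on a 3.2+ kernel; the values mirror what the talk suggests rather than universal advice.

    # confirm the kernel is new enough to have PRR
    uname -r
    # let the kernel keep caching per-destination metrics again
    sysctl -w net.ipv4.tcp_no_metrics_save=0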

TCP Buffering

Throughput = Buffer Size / Latency

Buffer Tuning A 16MB buffer at 50ms RTT = 320MB/s max rate. Set as 3 values: min, default, and max.

    net.ipv4.tcp_rmem = 4096 65536 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
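A sketch of applying these at runtime with sysctl; raising the net.core maxima alongside them is a common companion step and an assumption here, not something the talk calls out.

    # per-socket TCP autotuning limits: min, default, max (bytes)
    sysctl -w net.ipv4.tcp_rmem="4096 65536 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
    # global socket buffer caps, commonly raised to match (assumption)
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216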

TIME_WAIT State entered after the server has closed the connection (the actively closing side). Kept around in case the final ACK to the peer's FIN is lost and has to be resent, and so delayed duplicate packets can expire

Busy servers collect a lot. The default timeout is 2 * the FIN timeout, so 120s in Linux. Worth dropping the FIN timeout:

    net.ipv4.tcp_fin_timeout = 10

Tuning TIME_WAIT net.ipv4.tcp_tw_reuse=1 reuses sockets in TIME_WAIT if it is safe to do so. net.ipv4.tcp_max_tw_buckets has a default of 131072, way too small for busy sites; size it as connections/s * the timeout value
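A hedged sketch pulling the TIME_WAIT settings together, plus a quick way to count how many TIME_WAIT sockets you are actually holding; the 200000 bucket count is illustrative, not a figure from the talk.

    # shorten the FIN timeout as suggested above
    sysctl -w net.ipv4.tcp_fin_timeout=10
    # reuse TIME_WAIT sockets for new outgoing connections when safe
    sysctl -w net.ipv4.tcp_tw_reuse=1
    # size the bucket limit toward connections/s * timeout (illustrative value)
    sysctl -w net.ipv4.tcp_max_tw_buckets=200000
    # count current TIME_WAIT sockets
    ss -tan state time-wait | wc -l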

net.ipv4.tcp_tw_recycle EVIL!

net.ipv4.tcp_tw_recycle Allows early recycling of a TIME_WAIT socket as long as the client's timestamps keep increasing. Silently drops SYNs from clients whose timestamps don't, as can happen behind NAT. Still useful for high-churn servers when you can make assumptions about the local network

SSL and Keepalives Enable them if at all possible. The initial handshake is the most computationally intensive part, 1-10ms on modern hardware. 6 * RTT before the user gets data really hurts over long distances: SV to LON at 160ms RTT means about 1s before the user starts getting content

SSL and Geo If you can move SSL termination closer to your users, do so. Most CDNs offer this, even for non-cacheable content. EC2 plus Route53 is another option

In closing Upgrade your kernel Increase your initial congestion window Check your backlog and time wait limits Size your buffers to something reasonable Get closer to your users if you can

Thanks! Jason Cook @macros jason@fastly.com