netfilters connection tracking subsystem

Similar documents
netfilter/iptables/conntrack debugging

IPtables and Netfilter


User Datagram Protocol

CSE/EE 461 Lecture 13 Connections and Fragmentation. TCP Connection Management

Toward an ebpf-based clone of iptables

it isn't impossible to filter most bad traffic at line rate using iptables.

rtnl mutex, the network stack big kernel lock

sottotitolo A.A. 2016/17 Federico Reghenzani, Alessandro Barenghi

Netfilter updates since last NetDev. NetDev 2.2, Seoul, Korea (Nov 2017) Pablo Neira Ayuso

A 10 years journey in Linux firewalling Pass the Salt, summer 2018 Lille, France Pablo Neira Ayuso

A Study on Intrusion Detection Techniques in a TCP/IP Environment

IPv6 NAT. Open Source Days 9th-10th March 2013 Copenhagen, Denmark. Patrick McHardy

Netfilter updates since last NetDev. NetDev 2.2, Seoul, Korea (Nov 2017) Pablo Neira Ayuso

Computer Network Programming. The Transport Layer. Dr. Sam Hsu Computer Science & Engineering Florida Atlantic University

Mobile Transport Layer Lesson 02 TCP Data Stream and Data Delivery

A hacker in a hoodie with leather gloves tapping a glowing blue lock icon on a transparent touchscreen with ones and zeroes raining down in green

Suricata IDPS and Nftables: The Mixed Mode

Linux Firewalls. Frank Kuse, AfNOG / 30

Firewalls and NAT. Firewalls. firewall isolates organization s internal net from larger Internet, allowing some packets to pass, blocking others.

Some of the slides borrowed from the book Computer Security: A Hands on Approach by Wenliang Du. Firewalls. Chester Rebeiro IIT Madras

Linux IP Networking. Antonio Salueña

Linux System Administration, level 2

CCNA 1 Chapter 7 v5.0 Exam Answers 2013

CSCI-1680 Transport Layer I Rodrigo Fonseca

Introduction to TCP/IP networking

History Page. Barracuda NextGen Firewall F

Design and Performance of the OpenBSD Stateful Packet Filter (pf)

Building socket-aware BPF programs

Last Class. CSE 123b Communications Software. Today. Naming Processes/Services. Transmission Control Protocol (TCP) Picking Port Numbers.

NAT Router Performance Evaluation

Some slides courtesy David Wetherall. Communications Software. Lecture 4: Connections and Flow Control. CSE 123b. Spring 2003.

Certification. Securing Networks

Netfilter & Packet Dropping

CSE 461 The Transport Layer

Netfilter Connection Tracking and NAT Implementation

CSCI-GA Operating Systems. Networking. Hubertus Franke

TCP /IP Fundamentals Mr. Cantu

Transport Layer Marcos Vieira

Computer Security Spring Firewalls. Aggelos Kiayias University of Connecticut

Connections. Topics. Focus. Presentation Session. Application. Data Link. Transport. Physical. Network

Netfilter. Fedora Core 5 setting up firewall for NIS and NFS labs. June 2006

Configuring attack detection and prevention 1

CSCI-1680 Transport Layer I Rodrigo Fonseca

Firewalls. Firewall. means of protecting a local system or network of systems from network-based security threats creates a perimeter of defense

Accelerating Load Balancing programs using HW- Based Hints in XDP

Packet Filtering and NAT

CSE/EE 461 Lecture 12 TCP. A brief Internet history...

Computer and Network Security

Hashing on broken assumptions

Evaluating the performance of Netfilter architecture in Private Realm Gateway

Communications Software. CSE 123b. CSE 123b. Spring Lecture 10: Mobile Networking. Stefan Savage

Quick announcement. CSE 123b Communications Software. Last class. Today s issues. The Mobility Problem. Problems. Spring 2003

TCP: Transmission Control Protocol RFC 793,1122,1223. Prof. Lin Weiguo Copyleft 2009~2017, School of Computing, CUC

OSI Transport Layer. objectives

Configuring Flood Protection

CSE 4/589 Midterm Review. Hengtong Zhang SUNY Buffalo 10/30/2018

What is an L3 Master Device?

Network and Security: Introduction

SWITCHING, FORWARDING, AND ROUTING

Internet Protocol and Transmission Control Protocol

Video Streaming with the Stream Control Transmission Protocol (SCTP)

S 3 : the Small Scheme Stack A Scheme TCP/IP Stack Targeting Small Embedded Applications

Operating Systems. 17. Sockets. Paul Krzyzanowski. Rutgers University. Spring /6/ Paul Krzyzanowski

Introduction to Internet. Ass. Prof. J.Y. Tigli University of Nice Sophia Antipolis

Linux Networking: tcp. TCP context and interfaces

network security s642 computer security adam everspaugh

NAT Command Reference

CS 356: Computer Network Architectures. Lecture 14: Switching hardware, IP auxiliary functions, and midterm review. [PD] chapter 3.4.1, 3.2.

Configuring attack detection and prevention 1

Junos Security. Chapter 4: Security Policies Juniper Networks, Inc. All rights reserved. Worldwide Education Services

Computer Security and Privacy

Linux Systems Security. Firewalls and Filters NETS1028 Fall 2016

iptables and ip6tables An introduction to LINUX firewall

Transport Layer. Gursharan Singh Tatla. Upendra Sharma. 1

FireHOL Manual. Firewalling with FireHOL. FireHOL Team. Release pre3 Built 28 Oct 2013

What is Netfilter. Netfilter. Topics

Outline. What is TCP protocol? How the TCP Protocol Works SYN Flooding Attack TCP Reset Attack TCP Session Hijacking Attack

IP Packet. Deny-everything-by-default-policy

To get a feel for how to use the FIREWALL > Live page in NextGen Admin, watch the following video:

CS419: Computer Networks. Lecture 10, Part 2: Apr 11, 2005 Transport: TCP mechanics (RFCs: 793, 1122, 1323, 2018, 2581)

Examination 2D1392 Protocols and Principles of the Internet 2G1305 Internetworking 2G1507 Kommunikationssystem, fk SOLUTIONS

CSC 474/574 Information Systems Security

Layer 4: UDP, TCP, and others. based on Chapter 9 of CompTIA Network+ Exam Guide, 4th ed., Mike Meyers

CS457 Transport Protocols. CS 457 Fall 2014

TCP Review. Carey Williamson Department of Computer Science University of Calgary Winter 2018

COMP211 Chapter 4 Network Layer: The Data Plane

TCP : Fundamentals of Computer Networks Bill Nace

Persistence Schemes. Chakchai So-In Department of Computer science Washington University

Worksheet 8. Linux as a router, packet filtering, traffic shaping

Internet Layers. Physical Layer. Application. Application. Transport. Transport. Network. Network. Network. Network. Link. Link. Link.

while the LAN interface is in the DMZ. You can control access to the WAN port using either ACLs on the upstream router, or the built-in netfilter

Network Security Fundamentals

Life of a Packet. KubeCon Europe Michael Rubin TL/TLM in GKE/Kubernetes github.com/matchstick. logo. Google Cloud Platform

Unit 4: Firewalls (I)

Introduction to Internetworking

TCP/IP Networking. Part 4: Network and Transport Layer Protocols

CCNA R&S: Introduction to Networks. Chapter 7: The Transport Layer

QUIZ: Longest Matching Prefix

Our Narrow Focus Computer Networking Security Vulnerabilities. Outline Part II

Transcription:

netfilters connection tracking subsystem Florian Westphal 4096R/AD5FF600 fw@strlen.de 80A9 20C5 B203 E069 F586 AE9F 7091 A8D9 AD5F F600 Red Hat netdev 2.1, Montreal, April 2017

connection tracking flow tracking by addresses of endpoints (L3/L4, e.g. ip + port, ip + GRE call id,... ) split in layer 3 tracking (ip, ipv6) and layer 4 tracking (tcp, udp, sctp, ICMP,... ) L4 tracker is agnostic of lower protocol L4 trackers attempt to keep state, e.g. tcp: tracks state, checks sequence numbers. Example: new tcp packet? SYN bit set? tcp sequence number in expected window? unacknowledged data? adjust timeout rst? fin? delete connection and/or adjust timeout NAT is built on top conntrack itself never alters packets uses netfilter hooks to look at packets as they come in/leave

conntrack events userspace can subscribe to ct events: $ conntrack -E [UPDATE] tcp 6 432000 src=192.168.0.7 dst=10.. sport=3... [UPDATE] tcp 6 120 FIN_WAIT src=192.168.0.7 dst=10.16... [UPDATE] tcp 6 60 CLOSE_WAIT src=192.168.0.7 dst=10.16... [NEW] udp 17 30 src=10.26.2.2 dst=192.168.0.7 sport=5... [NEW] tcp 6 120 SYN_SENT src=192.168.0.7 dst=.. sport=6 [UPDATE] tcp 6 60 SYN_RECV src=192.168.0.7 dst=192.. [UPDATE] tcp 6 432000 ESTABLISHED src=192.168.0.7... [DESTROY] tcp 6 src=202:8071:.. dst=202:26f0.. sport=34284 NEW event sent once entry is placed in conntrack table it is possible to restrict what events are generated (CT target)

common misconceptions iptables -A INPUT -m conntrack --ctstate... doesn t do connection tracking... rather, it tests conntrack state (skb->nfct->status ==...) same for nft ct state...: no lookup of any kind conntrack doesn t look at socket states, only packets

conntrack states ESTABLISHED packet matches existing entry and l4 tracker checks pass NEW first packet of a connection (no previous record) a new connection entry is created after failed lookup... but NOT placed in main conntrack table... only done after packet traversed all hooks (iptables) in INPUT or POSTROUTING... in conntrack speak, the entry is now confirmed (in main table) RELATED same as NEW, except it somehow relates to another existing connection ICMP error, and the header inside matches an existing connection conntrack helper created an entry in the expectation table UNTRACKED packet was intentionally not tracked (ipv6 neigh discovery for instance) INVALID packet not seen or rejected by l4 trackers (skb->_nfct is 0)

connection tracking helpers some protocols are harder to track/nat, e.g. SIP or FTP kernel module monitors control channel, e.g. tcp port 21 can add expectations, i.e. if new connection is coming from S to D on port P, then mark as RELATED also can apply NAT if needed allows doing FTP, SIP etc. without opening up many ports or adding lots of 1:1 nat translations best-effort only, e.g. no tcp stream reassembly in kernel in-kernel XML/ASN.1 parsing required for sip, h323, etc. might be preferable to use real proxies its possible to add expectations from userspace e.g. could implement transparent SIP proxy that only processes call setup messages, and allows actual calls to just pass through

main conntrack table hash table, using rcu (lookups are lockless) and hashed locks (i.e. add/delete is parallel if they occur in other part of the table) table has a fixed size (net.netfilter.nf_conntrack_buckets) and fixed upper limit (net.netfilter.nf_conntrack_max) no automated growth, initial sizing based on available memory each entry is hashed twice (original+reply) to deal with nat

conntrack extensions idea: keep data of rarely used features outside of main nf_conn struct pro: don t have to allocate mem for rarely-used features con: overhead: 40 bytes per conntrack just for metadata need one extra deref to access data Examples: helper, counter, tstamp,...

NAT built on top of connection tracking NAT mappings are set up at conntrack creation time... which is why iptables nat table only sees first packet of flow one extra hash table: nat bysource table used to ensure addr:port is unique when adding new mapping all connections have nat mapping once a nat hook is active

overflow handling nf_conntrack: table full, dropping packet main assumption: most entries are non assured assured flag set by l4 protocol tracker at certain point (tcp: 3whs completed) over limit? 1. search up to 8 buckets for non-assured entry 2. destroy it and allocate new conntrack entry in its place otherwise, drop the new packet

problems Only non-assured entries can be early-dropped no way to know if new packet is more important than any other state table entry can t toss random entries: would kill valid connections doesn t play nice with nat/pat what about overflow w. legitimate traffic patterns?

suggestions (1) remove very strange conntrack error handling packet invalid? NF_ACCEPT (let user decide what to do in iptables ruleset) can t alloc conntrack/over limit? NF_DROP (user can t change this behavior) can this be fixed in a backwards-compatible fashion? doesn t solve table exhaustion problem for all cases, e.g. can t NAT non-tracked packets

suggestions (2) add early_drop function to the l4 trackers e.g. could prefer evicting tcp flow in WAIT state in favor of new connection TCP established default timeout is huge (5days) add soft timeout (min lifetime) sysctl, e.g. 5 minutes and allow fast-recycle after this do periodic ack probing/keepalives (i.e., elicit RST if connection was closed) adaptive timeouts like *BSD? Combine CT --timeout with match on (used) table size? early evict if no nat? problem: under flood, even 1 minute is too long helps with peers that don t close properly

conntrack summary mature code base lots of features but still room for improvements: overflow handling free extensions via kfree, not via rcu remove variable sized extensions?