Linux IP Networking. Antonio Salueña

Antonio Salueña <saluena@lut.fi>

Preface

We will study Linux networking for the following case:
- Intel x86 architecture
- IP packets
- Recent stable Linux kernel series 2.4.x

Overview

[Figure: packet path overview. Labels: application process (routed/gated, UDP 520), BSD socket, INET socket, TCP, UDP, IP (input, forward, output), FIB, netlink, rtnetlink, TC, traffic control elements (enqueue, dequeue), backlog buffer, device driver, NIC.]

Linux Network Layers

- BSD sockets layer: provides the standard UNIX sockets API.
- INET layer: manages communication end points for the IP-based protocols TCP and UDP.
- Device: physical transmission.

[Figure: layer stack. Application process (routed), BSD sockets interface, INET sockets interface, TCP/UDP, IP, device.]

Glossary

- Hardware Interrupt (Hardware IRQ).
- softirq: one of the 32 software interrupts, which can run on multiple CPUs at once.
- tasklet: a dynamically registrable software interrupt, which is guaranteed to run on only one CPU at a time.
- Bottom Half: like a softirq, but only one bh can be running at any time across all CPUs; deprecated.
- Software Interrupt (Software IRQ): general term for softirqs, tasklets and bottom halves.
- User Context: the kernel executing on behalf of a particular process or kernel thread. Can be interrupted by a software or hardware interrupt.
- Userspace: a process executing its own code outside the kernel.

Socket Buffer: sk_buff

- The key structure of the Linux networking code: a common packet data structure for all protocol layers.
- Contains pointers to all protocol headers and length fields that allow each protocol layer to manipulate data via standard functions (methods).
- Data is copied only twice:
  - from user space to kernel space
  - from kernel space to the output medium (in the case of an outbound packet)

[Figure: sk_buff with pointers *next, *previous, *socket, *th, *uh, *icmph, *iph, *arph, *ethernet, *raw, and the packet data.]
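The "copied only twice" property of sk_buff comes from each layer adding or stripping its header in place rather than copying the payload. The following is a minimal Python sketch of that idea, not the kernel structure: the class name, the byte-string "headers" and the sizes are illustrative assumptions, and the method names only mirror the kernel helpers skb_put(), skb_push() and skb_pull().

```python
class SkBuff:
    """Toy socket buffer: one backing store, headers added in place."""

    def __init__(self, size, headroom):
        self.buf = bytearray(size)
        # 'data' and 'tail' delimit the bytes currently in use.
        self.data = headroom   # headroom is reserved for headers added later
        self.tail = headroom

    def put(self, payload):
        """Append bytes at the tail (in the spirit of skb_put)."""
        end = self.tail + len(payload)
        if end > len(self.buf):
            raise ValueError("not enough tailroom")
        self.buf[self.tail:end] = payload
        self.tail = end

    def push(self, header):
        """Prepend a header into the headroom (in the spirit of skb_push)."""
        start = self.data - len(header)
        if start < 0:
            raise ValueError("not enough headroom")
        self.buf[start:self.data] = header
        self.data = start

    def pull(self, length):
        """Strip a header on receive by advancing 'data' (like skb_pull)."""
        if length > self.tail - self.data:
            raise ValueError("pull past end of data")
        header = bytes(self.buf[self.data:self.data + length])
        self.data += length
        return header

    def contents(self):
        return bytes(self.buf[self.data:self.tail])


skb = SkBuff(size=64, headroom=16)
skb.put(b"hello")    # payload written once
skb.push(b"UDP:")    # transport header placed in front, payload not moved
skb.push(b"IP:")     # network header placed in front of that
frame = skb.contents()
```

On the receive side the same buffer is walked in the opposite direction: each layer calls pull() to consume its own header and hands the remainder upward.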

Packet Reception: ISR

[Figure: the NIC raises a hardware IRQ; the interrupt handler (ISR) does the sk_buff data handling and calls netif_rx(); if queue < 300 the skb goes onto the CPU backlog queue and the softirq is raised.]

Packet Reception: ISR

- A packet arrives on the medium.
- The NIC checks and stores the packet, then issues a hardware interrupt.
- The network driver for this card handles the IRQ: DEV_rx() (the driver-specific receive routine).
- Status checks; allocate an sk_buff: dev_alloc_skb().
- Packet data is transferred over the system bus into the sk_buff.
- The protocol type is determined: eth_type_trans().
- The skb is posted to the network code's incoming queue: netif_rx().
- The card status is updated and the ISR finishes.
- In netif_rx(), if the queue is congested (backlog limit netdev_max_backlog = 300) the packet is dropped; otherwise the skb is enqueued and the network receive softirq is raised.
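The congestion rule in netif_rx() can be sketched as a bounded queue. This is a simplified Python model under stated assumptions: the Cpu class, the softirq_pending flag and the string return values are hypothetical, the limit of 300 is the figure quoted on the slide, and the real function does additional bookkeeping that is left out here.

```python
from collections import deque

MAX_BACKLOG = 300  # backlog limit quoted on the slide


class Cpu:
    """Per-CPU receive state (hypothetical layout for illustration)."""

    def __init__(self, max_backlog=MAX_BACKLOG):
        self.backlog = deque()
        self.max_backlog = max_backlog
        self.softirq_pending = False
        self.dropped = 0


def netif_rx(cpu, skb):
    """Enqueue the skb unless the backlog is congested, then flag the softirq."""
    if len(cpu.backlog) >= cpu.max_backlog:
        cpu.dropped += 1            # congested: the packet is discarded
        return "drop"
    cpu.backlog.append(skb)
    cpu.softirq_pending = True      # stands in for raising the RX softirq
    return "queued"


# A tiny limit makes the drop behaviour easy to see.
cpu = Cpu(max_backlog=3)
results = [netif_rx(cpu, "pkt%d" % i) for i in range(5)]
```

The point of the design is that the ISR does a bounded amount of work per packet; everything expensive is deferred to the softirq that drains the backlog.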

Packet Reception: RX softirq

The network RX softirq handler is called by the scheduler.

- Lock the CPU device queue.
- Loop through the queue:
  - dequeue an skb
  - find the appropriate protocol handler
  - call the protocol handler; ip_rcv() in the case of IPv4
- If the queue is empty, exit the loop; if the time slice has expired, raise the network softirq again and exit the loop.

[Figure: scheduler, softirq handler, ip_rcv() (initial packet processing).]

Packet Reception: IP Layer

[Figure: ip_rcv(), FIB lookup routines, ip_local_deliver(), ip_forward().]
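The RX softirq loop can be sketched as a budgeted drain of the backlog with a per-protocol dispatch. This Python model is only an illustration: the function name rx_softirq, the handler table and the packet-count budget (standing in for the time slice) are assumptions, while the EtherType values 0x0800 and 0x0806 are the standard ones for IPv4 and ARP.

```python
from collections import deque

ETH_P_IP = 0x0800    # EtherType for IPv4
ETH_P_ARP = 0x0806   # EtherType for ARP


def rx_softirq(backlog, handlers, budget):
    """Drain up to `budget` packets.

    Returns True if work remains and the softirq has to be raised again,
    False if the queue was emptied.
    """
    processed = 0
    while backlog:
        if processed >= budget:
            return True               # slice used up, packets still queued
        proto, payload = backlog.popleft()
        handler = handlers.get(proto)
        if handler is not None:
            handler(payload)          # ip_rcv() in the case of IPv4
        processed += 1                # unknown protocols are simply discarded
    return False


seen = []
handlers = {ETH_P_IP: lambda payload: seen.append(("ip_rcv", payload))}
backlog = deque([(ETH_P_IP, "a"), (ETH_P_ARP, "b"), (ETH_P_IP, "c")])

first_pass = rx_softirq(backlog, handlers, budget=2)    # stops with "c" queued
second_pass = rx_softirq(backlog, handlers, budget=2)   # finishes the queue
```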

Packet Reception: IP Layer

- ip_rcv() checks the packet header for correctness; packets that fail are dropped.
- First netfilter hook (NF_IP_PRE_ROUTING).
- After successful netfilter traversal: ip_rcv_finish().
- The packet's destination is determined: ip_route_input() accesses the FIB for route information.
- If the packet is for the local host, it is passed to the upper layer: ip_local_deliver().
- Otherwise ip_forward() is called to rebuild and forward the packet to its next destination.
- If a routing error occurred: ip_error().
- If it is a multicast packet and we have to do some multicast routing: ip_mr_input().

Packet Reception: TCP Layer

- Either tcp_rcv() or udp_rcv() is called to handle local packets.
- For the TCP protocol, get_tcp_sock() is called to extract the port number and INET socket from the packet.
- tcp_data() is called to check that the packet is new data and to discard duplicates.
- Finally, a hash lookup in the socket hash table is performed in order to forward the received packet to the correct INET socket.
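The header check and the local-versus-forward decision of the IP layer can be sketched as follows. This is a simplified Python model, not the kernel code: the function returns a string instead of calling ip_local_deliver() or ip_forward(), the set of local addresses stands in for the FIB lookup, and build_header() is a hypothetical helper that only exists to produce test packets.

```python
import socket
import struct

IP_HDR = "!BBHHHBBH4s4s"   # 20-byte IPv4 header without options


def checksum(header):
    """Internet checksum: one's-complement sum of the 16-bit words."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src, dst, ttl=64):
    """Build an IPv4 header with a valid checksum (illustrative helper)."""
    fields = [0x45, 0, 20, 0, 0, ttl, socket.IPPROTO_UDP, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = checksum(struct.pack(IP_HDR, *fields))
    return struct.pack(IP_HDR, *fields)


def ip_rcv(packet, local_addrs):
    """Return 'drop', 'local_deliver' or 'forward' for a raw IPv4 packet."""
    if len(packet) < 20:
        return "drop"
    version, ihl = packet[0] >> 4, packet[0] & 0x0F
    if version != 4 or ihl < 5 or len(packet) < ihl * 4:
        return "drop"
    if checksum(packet[:ihl * 4]) != 0:    # a valid header sums to zero
        return "drop"
    dst = socket.inet_ntoa(packet[16:20])
    return "local_deliver" if dst in local_addrs else "forward"


local = {"10.0.0.1"}
to_us = ip_rcv(build_header("10.0.0.2", "10.0.0.1"), local)
to_other = ip_rcv(build_header("10.0.0.2", "192.168.1.9"), local)
```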

Packet Reception: Sockets

- After the protocol layer has finished with the received packet, the INET socket interface passes it to the BSD socket interface.
- A function data_ready() wakes up any process that is waiting at the BSD socket for the arriving packet.

Packet Forwarding

- ip_forward() is called.
- Check the TTL field (drop the packet if TTL <= 1).
- Check the skb for space for the destination device's link header and expand it if necessary.
- Decrement the TTL by one.
- Drop the packet if the don't-fragment bit is set and the packet needs fragmentation.
- Send an ICMP message back to the sender if there are any problems.
- Netfilter hook (NF_IP_FORWARD).
- If netfilter accepts the packet: ip_forward_finish().
- If we need to set additional IP options: ip_forward_options().
- ip_send(); ip_fragment() if the packet is bigger than the destination device MTU.
- ip_finish_output(): netfilter (NF_IP_POST_ROUTING).
- If accepted, ip_finish_output2() prepends the link header to the skb and calls ip_output().
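The TTL and don't-fragment checks in the forwarding path can be sketched as a small decision function. This Python model is an illustration only: the dict-based packet layout and the returned action strings are assumptions, and the netfilter hooks and option handling are omitted. The ICMP values are the standard ones (type 11 for time exceeded, type 3 code 4 for fragmentation needed).

```python
ICMP_TIME_EXCEEDED = 11   # ICMP type: TTL expired in transit
ICMP_DEST_UNREACH = 3     # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4      # ICMP code under type 3: fragmentation needed, DF set


def ip_forward(pkt, out_mtu):
    """Decide what happens to a packet that has to be forwarded.

    pkt is a dict with 'ttl', 'length' and 'df' keys (illustrative layout).
    Returns (action, detail): for 'drop' the detail is the ICMP (type, code)
    sent back to the sender, otherwise it is the updated packet.
    """
    if pkt["ttl"] <= 1:
        return ("drop", (ICMP_TIME_EXCEEDED, 0))
    if pkt["length"] > out_mtu and pkt["df"]:
        return ("drop", (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED))
    forwarded = dict(pkt, ttl=pkt["ttl"] - 1)   # original packet left untouched
    action = "fragment" if forwarded["length"] > out_mtu else "send"
    return (action, forwarded)


expired = ip_forward({"ttl": 1, "length": 100, "df": False}, out_mtu=1500)
too_big = ip_forward({"ttl": 64, "length": 2000, "df": True}, out_mtu=1500)
split = ip_forward({"ttl": 64, "length": 2000, "df": False}, out_mtu=1500)
normal = ip_forward({"ttl": 2, "length": 100, "df": False}, out_mtu=1500)
```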

Packet Sending: Sockets

- An application sends data to the other side by using the write() C function.
- sock_write() decides the next function to be called according to the address family the socket is associated with.
- For example, in the case of the INET socket address family, sock_write() calls inet_sendmsg().

[Figure: process send, BSD socket sock_write, INET socket inet_sendmsg.]

Packet Sending: TCP Layer

- tcp_sendmsg() checks for a number of error conditions.
- do_tcp_sendmsg() allocates memory to hold the skb.
- build_header() initializes the TCP header.
- tcp_send_skb() adds a variety of protocol-specific information to the header.
- Calls queue_xmit() in the skb structure, which points to ip_queue_xmit().
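The address-family dispatch performed by sock_write() amounts to a lookup in a table of per-family operations. The Python sketch below only illustrates that pattern: FAMILY_OPS, the dict-based socket and the return values are hypothetical stand-ins for the kernel's function pointers.

```python
import socket


def inet_sendmsg(sock, data):
    """Stand-in for the INET layer entry point; reports what it was given."""
    return ("inet_sendmsg", len(data))


# Hypothetical per-family operations table, playing the role of the
# function pointers attached to a kernel socket.
FAMILY_OPS = {socket.AF_INET: {"sendmsg": inet_sendmsg}}


def sock_write(sock, data):
    """Pick the send routine from the socket's address family and call it."""
    ops = FAMILY_OPS.get(sock["family"])
    if ops is None:
        raise OSError("address family not supported")
    return ops["sendmsg"](sock, data)


sent = sock_write({"family": socket.AF_INET}, b"abc")
```

Because the BSD layer only ever calls through the table, a new protocol family can be added by registering another entry without touching sock_write() itself.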

Packet Transmission

- If the destination interface is up and running, send the packet with dev_queue_xmit().
- Enqueue the skb at the tail of the device output queue (priorities).
- Wake up the device.
- The scheduler runs the device driver: hard_start_xmit().
- Test the medium.
- Send the link header.
- Tell the bus to transmit the packet over the medium.

Netfilter Framework

Netfilter is a framework for packet mangling on Linux, outside the normal BSD sockets interface. Netfilter has three parts:

- Each protocol defines hooks: well-defined points in a packet's traversal of that protocol stack (IPv4 defines 5; IPv6 and DECnet hooks are similar).
- Parts of the kernel can register to listen to the different hooks of each protocol (it is possible to examine, alter, discard, allow to pass, or queue a packet for userspace).
- Packets that have been queued are collected for sending to userspace by the ip_queue driver.
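The "device output queue (priorities)" step can be pictured as a set of FIFO bands that the driver drains from the highest priority down. The Python sketch below is an assumption-laden illustration: the DeviceQueue class and the choice of three bands are not taken from the slides, and real queueing disciplines are configurable.

```python
from collections import deque


class DeviceQueue:
    """Toy device output queue with strict-priority FIFO bands."""

    def __init__(self, bands=3):            # band count is an assumption
        self.bands = [deque() for _ in range(bands)]

    def enqueue(self, skb, priority):
        """Add the skb to the tail of its band (0 is the highest priority)."""
        self.bands[priority].append(skb)

    def dequeue(self):
        """Return the next skb for the driver, or None if nothing is queued."""
        for band in self.bands:
            if band:
                return band.popleft()
        return None


q = DeviceQueue()
q.enqueue("bulk", priority=2)
q.enqueue("interactive-1", priority=0)
q.enqueue("interactive-2", priority=0)
order = [q.dequeue() for _ in range(4)]
```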

Netfilter Architecture: IP

[Figure: IPv4 hook points PRE_ROUTING, LOCAL_IN, FORWARD, LOCAL_OUT and POST_ROUTING, placed around the two ROUTE decisions.]

Netfilter Verdicts

- NF_ACCEPT
- NF_DROP
- NF_STOLEN
- NF_QUEUE
- NF_REPEAT

Packet Selection

- A packet selection system called IP Tables has been built over the netfilter framework.
- Kernel modules can register a new table and ask for a packet to traverse a given table.
- filter table: hooks into the local_in, forward and local_out points. For any given packet there is one (and only one) place to filter it.
- nat table: network address translation table in pre_routing, post_routing and local_out.
- Netfilter implements its connection tracking mechanism in a separate module using local_out and pre_routing.
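Hook registration and verdict handling can be sketched as a list of callbacks per hook point. This Python model is illustrative only: register_hook() and run_hooks() are hypothetical names rather than the kernel API, packets are plain dicts, and NF_QUEUE, NF_STOLEN and NF_REPEAT are simply returned to the caller instead of being acted on. The numeric verdict values follow the kernel's definitions (NF_DROP is 0, NF_ACCEPT is 1, and so on).

```python
NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT = range(5)

hooks = {}   # hook point name -> list of (priority, function)


def register_hook(point, fn, priority=0):
    """Attach fn to a hook point; lower priority values run first."""
    hooks.setdefault(point, []).append((priority, fn))
    hooks[point].sort(key=lambda entry: entry[0])


def run_hooks(point, pkt):
    """Call each hook in order and stop at the first verdict other than ACCEPT."""
    for _, fn in hooks.get(point, []):
        verdict = fn(pkt)
        if verdict != NF_ACCEPT:
            return verdict
    return NF_ACCEPT


def drop_telnet(pkt):
    """Example filter rule: refuse anything addressed to port 23."""
    return NF_DROP if pkt.get("dport") == 23 else NF_ACCEPT


register_hook("LOCAL_IN", drop_telnet)

telnet = run_hooks("LOCAL_IN", {"dport": 23})
web = run_hooks("LOCAL_IN", {"dport": 80})
forwarded = run_hooks("FORWARD", {"dport": 23})   # no hook registered there
```

This mirrors the filter-table idea above: a rule attached at one hook point only sees packets that actually traverse that point.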
