Signaled Receiver Processing


Signaled Receiver Processing
José Brustoloni, Eran Gabber and Avi Silberschatz, GBN '99

Signaled Receiver Processing (SRP) is a new scheme for input protocol implementation that makes the system less susceptible to denial-of-service attacks and can improve application performance under high receive loads.

Receiver processing in BSD Unix

- HW interrupt: NW I/F driver; ether_input: place packet in ipintrq queue, cause SW interrupt
- SW interrupt: ip_input: checksum header; preliminary packet processing (firewall, NAT, IP options); check destination IP address:
  - own: reassemble; pass to higher-level protocol
  - multicast: multicast forward; pass to higher-level protocol
  - other: forward
- {tcp, udp}_input: checksum packet; find PCB; append packet to socket receive queue; wake up processes waiting for socket
- Receiving process: wait for data in socket receive queue

Most current systems process input protocols at interrupt level. On BSD Unix, for example, a hardware interrupt occurs when a packet arrives. The network interface driver passes the packet to ether_input, which places the packet in IP's input queue and causes a software interrupt. The system handles the software interrupt by dequeueing each packet and calling ip_input. ip_input checksums the packet header and may submit the packet to preliminary processing, including firewall, network address translation (NAT), and IP options; this preliminary processing may drop, modify, or forward the packet. ip_input then checks the packet's destination IP address. If that address is the host's own, ip_input reassembles the packet and passes it to a higher-level protocol, as indicated in the packet's header (TCP, UDP, ICMP, IGMP, RSVP, IPIP, RAW). If the destination address is a multicast address, ip_input may submit the packet for multicast forwarding and may also pass the packet to a higher-level protocol for local delivery. In the remaining cases, ip_input may submit the packet for IP forwarding. TCP's and UDP's input routines checksum the packet, find the protocol control block (PCB) according to the packet's header (in particular, the destination port), append the packet to the corresponding socket receive queue, and wake up any processes waiting for the socket (or drop the packet if the queue is full).
On receive calls, the receiving application waits for data in the socket's receive queue (possibly blocking), then dequeues the data and copies it out.

Problems in BSD receiver processing

- Unscheduled (HW or SW interrupt: priority higher than any application)
- Occurs before determining whether the socket receive queue is full
- Charged (wrongly) to the interrupted application
- Receive livelock possible
- No QoS guarantees possible

The previous slide shows that BSD Unix processes all input protocols as a hardware or software interrupt, at a priority higher than that of any application. If the receive load is sufficiently high, the system may spend all of its time processing input protocols, and applications cannot make forward progress, a situation known as receive livelock. Additionally, input protocol processing occurs even if the respective socket receive queue is already full. The CPU time spent in such cases is wasted, since the packet will need to be dropped. Finally, the time spent on input protocol processing is charged to the interrupted process, even if that process is unrelated to the incoming packet. Such incorrect accounting can make the CPU scheduler incapable of providing quality-of-service guarantees.

Lazy Receiver Processing (LRP)

- NW I/F HW or SW: EDMX: determine channel from packet header; enqueue packet in channel queue (one per socket); wake up processes waiting for channel
- UDP receiving process: while not enough data in socket receive queue, wait for packet in channel; dequeue packet and call ip_input
- TCP kernel thread (at receiving process's priority): wait for packet in channel; dequeue packet and call ip_input
- TCP receiving process: wait for data in socket receive queue

Lazy receiver processing (LRP) has been proposed for solving the problems mentioned on the previous slide. In LRP, the network interface hardware or its driver performs early demultiplexing, that is, examines the packet header to determine the corresponding channel (each socket has an associated channel). The system then enqueues the packet in that channel's queue and wakes up the processes waiting for it (or drops the packet if the queue is full). On UDP receive calls, the receiving process performs the following loop while there is not enough data in the socket receive queue: wait for a packet in the channel (possibly blocking), then dequeue the packet and call ip_input, which eventually causes the packet to be enqueued in the socket receive queue. The receiving process then dequeues the data and copies it out. The same scheme would not work well for TCP because, for best throughput, the receiver has to send acknowledgements promptly, asynchronously from the application's receive calls. LRP therefore uses a kernel thread that is scheduled at the receiving process's priority and continuously waits for a packet in the channel, dequeues the packet, and calls ip_input. On TCP receive calls, the receiving process can then simply wait for data in the socket receive queue (possibly blocking), dequeue the data, and copy it out.

Difficulties in using LRP

- What if kernel threads are not available (e.g., FreeBSD)?
- What if an application is busy and would like to defer TCP/IP input processing?
- What about preliminary packet processing (e.g., firewall, NAT)?
- Should IP forwarding be penalized for its CPU consumption?

In LRP, most input protocol processing is scheduled at the receiving application's priority, preventing receive livelock and other scheduling anomalies. However, LRP can also present some difficulties. First, several contemporary systems (e.g., FreeBSD) do not support kernel threads, which LRP requires. Second, even if an application is busy, it cannot defer its TCP input protocol processing, which happens in a separate kernel thread. Third, LRP's early demultiplexing ignores preliminary processing, such as firewall and NAT; such processing may modify the packet and require new demultiplexing. Fourth, LRP does not properly acknowledge that some system protocol processing, such as IP forwarding, is ill-suited for scheduling as a time-sharing process that gets penalized according to its CPU usage.

Signaled Receiver Processing (SRP)

- NW I/F HW or SW, or after preliminary packet processing (e.g., firewall, NAT):
  REDMX: determine PCB from (packet header, skip); enqueue packet in the PCB's socket UIQ (unprocessed input queue); wake up processes waiting for socket, or signal SIGUIQ to processes that have the socket open
- Default SIGUIQ action: dequeue packet and call ip_input (but an application can catch SIGUIQ and defer processing)
- Receiving process: dequeue packets and call ip_input; wait for data in socket receive queue
- System protocol processing (e.g., firewall or IP forwarding): on Eclipse/BSD, processes with a minimum CPU guarantee; pseudo-sockets

Signaled receiver processing (SRP) uses a reinvocable early demultiplexing function (REDMX), invoked by the network interface hardware or its driver and again after each preliminary processing step (e.g., firewall or NAT). REDMX determines the PCB that corresponds to a packet based on the packet header and a skip argument, which expresses the preliminary processing steps already performed (if any). Each PCB points to a socket, and each socket has an unprocessed input queue (UIQ). The system enqueues each packet in the corresponding UIQ and wakes up processes waiting for the respective socket. If no processes are waiting, the system signals SIGUIQ to the processes that have the socket open. The default SIGUIQ action is to dequeue the packet and call ip_input. This causes input protocol processing to occur in the context of the receiving processes, but possibly asynchronously to the application's receive calls. The packet is then enqueued in the socket receive queue. However, any process can catch its SIGUIQ signals and, for example, defer input protocol processing until a subsequent receive call. On receive calls, the receiving process dequeues any packets from the UIQ and calls ip_input. The process then waits for data in the socket receive queue (possibly blocking), dequeues the data, and copies it out.
On Eclipse/BSD, system protocol processing, such as IP forwarding, occurs in the context of processes that have minimum CPU shares (e.g., 40%), guaranteed by the operating system.

Advantages of SRP

- No kernel threads needed
- Application can catch SIGUIQ and defer protocol processing
- REDMX supports preliminary protocol processing (e.g., firewall, NAT)
- Eclipse/BSD's CPU scheduler allows system protocol processing not to be penalized for its CPU consumption

Therefore, SRP provides the following advantages in addition to LRP's: no kernel threads are needed; busy applications can defer protocol processing; and the full system functionality present in contemporary systems can be maintained, including preliminary protocol processing such as firewall and NAT. SRP was developed for Eclipse/BSD, a system that provides quality-of-service guarantees. On Eclipse/BSD, SRP realizes an additional advantage: system protocol processing, such as IP forwarding, is appropriately scheduled. SRP can be useful in any system (e.g., servers) subject to high receive loads.