Receive Livelock
Robert Grimm, New York University


The Three Questions
- What is the problem?
- What is new or different?
- What are the contributions and limitations?

Motivation
- Interrupts work well when I/O events are rare
  - Think disk I/O
- By comparison, polling is expensive
  - After all, the CPU doesn't really do anything useful when polling
  - To achieve the same latency as interrupts, it would need to poll thousands of times per second
- But the world has changed: it's all about networking
  - Multimedia, host-based routing, network monitoring, NFS, multicast, and broadcast all lead to higher interrupt rates
  - Once the interrupt rate is too high, the system becomes overloaded and eventually makes no progress

Avoiding Receive Livelock
- Hybrid design
  - Poll when triggered by an interrupt
  - Interrupt only when polling is suspended
- Result
  - Low latency under low loads
  - High throughput under high loads
- Additional techniques
  - Drop packets early (i.e., those with the least investment)
  - Connect with the scheduler (i.e., give resources to user tasks)
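To make the hybrid concrete, here is a minimal, self-contained C sketch of the interrupt/polling state machine: a receive interrupt masks further receive interrupts and switches to polling, and interrupts are unmasked only once the polling loop has drained the device. All names (rx_interrupt, poll_device, POLL_BUDGET) and the fake device state are illustrative, not from the paper.

```c
/* Minimal sketch of the hybrid interrupt/polling state machine.
 * Names and constants are illustrative, not from the paper. */
#include <stdbool.h>
#include <stdio.h>

#define POLL_BUDGET 5           /* packets processed per poll pass */

static int  pending = 0;        /* packets waiting in the (fake) device */
static bool interrupts_enabled = true;

static void process_packet(void) { pending--; }

/* Polling loop: runs while receive interrupts stay masked. */
static void poll_device(void) {
    while (pending > 0) {
        int budget = POLL_BUDGET;
        while (pending > 0 && budget-- > 0)
            process_packet();
        /* between passes, other work could be scheduled here */
    }
    interrupts_enabled = true;  /* queue drained: fall back to interrupts */
}

/* Interrupt handler: fires only when polling is not active. */
static void rx_interrupt(void) {
    interrupts_enabled = false; /* mask further receive interrupts */
    poll_device();              /* switch to polled operation */
}

int main(void) {
    pending = 12;               /* pretend 12 packets arrived in a burst */
    if (interrupts_enabled)
        rx_interrupt();
    printf("pending=%d interrupts_enabled=%d\n", pending, interrupts_enabled);
    return 0;
}
```

Under low load the device sits in interrupt mode and latency stays low; under a burst, one interrupt pays the transition cost and the rest of the burst is consumed by cheap polling.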

Requirements
- Acceptable throughput
  - Keep up with the Maximum Loss Free Receive Rate (MLFRR)
  - Keep transmitting as you are receiving
- Reasonable latency, low jitter
  - Avoid long queues
- Fair allocation of resources
  - Continue to service management and control tasks
- Overall system stability
  - Do not impact other systems on the network
  - Livelock may look like link failure, leading to more control traffic

Interrupts: Packet Arrival
- Packet arrival signaled through an interrupt
  - Associated with a fixed Interrupt Priority Level (IPL)
  - Handled by the device driver
  - Packet placed into a queue, dropped if the queue is full
- Protocol processing initiated by a software interrupt
  - Associated with a lower IPL
  - May be batched: process several packets before returning
- Gives absolute priority to incoming packets
  - But modern systems have large network card buffers and DMA
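A minimal sketch of this two-level structure: the hardware-level handler does nothing but enqueue onto ipintrq (dropping when it is full), while a lower-IPL software interrupt dequeues and batches the protocol processing. The queue size and packet representation are illustrative; only ipintrq is a name from the actual BSD-derived path.

```c
/* Sketch of the classic two-level receive path: the hardware interrupt
 * handler only enqueues onto ipintrq, and a lower-IPL software
 * interrupt batches protocol processing. Sizes are illustrative. */
#include <stdio.h>

#define IPINTRQ_MAX 8

static int ipintrq[IPINTRQ_MAX];
static int head = 0, count = 0;
static int drops = 0;

/* Runs at device IPL: minimal work per packet. In the real kernel the
 * link-level copy has already happened by this point, which is why a
 * drop here still wastes effort already invested in the packet. */
static void rx_hw_interrupt(int pkt) {
    if (count == IPINTRQ_MAX) { drops++; return; }
    ipintrq[(head + count++) % IPINTRQ_MAX] = pkt;
    /* a real driver would also request the software interrupt here */
}

/* Runs at a lower IPL: may batch several packets per activation. */
static void ip_softintr(void) {
    while (count > 0) {
        int pkt = ipintrq[head];
        head = (head + 1) % IPINTRQ_MAX;
        count--;
        printf("forwarding packet %d\n", pkt);
    }
}

int main(void) {
    for (int i = 0; i < 10; i++)  /* a burst larger than the queue */
        rx_hw_interrupt(i);
    ip_softintr();
    printf("dropped %d packets\n", drops);
    return 0;
}
```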

Interrupts: Receive Livelock
- If packets arrive too fast, the system spends most of its time servicing packet-received interrupts
  - After all, they have absolute priority
  - No resources left to deliver packets to applications
- After reaching the MLFRR, throughput begins to fall again
  - Eventually reaches 0 (!)
- But doesn't batching help?
  - It can increase the MLFRR
  - But it cannot, by itself, avoid livelock

Interrupts: Overload Impact
- Packet delivery latency increases
  - Packets arriving in bursts are processed in bursts
    - Link-level processing: copy into kernel buffer and queue
    - Dispatch: queue for delivery to user process
    - Delivery: schedule user process
- Transmits may starve
  - Transmission usually performed at a lower IPL than reception
    - Why do we need interrupts for transmission? Don't we just write the data to the interface and say "transmit"?
  - But the system is busy servicing packet arrivals

Better Scheduling
- Limit the interrupt arrival rate to prevent saturation
  - If the internal queue is full, disable receive interrupts
    - For the entire system?
  - Re-enable interrupts once buffer space becomes available or after a timeout
- Track time spent in the interrupt handler
  - If larger than a specified fraction of total time, disable interrupts
  - Alternatively, sample the CPU state on clock interrupts
    - When to use this alternative? Why does it work?
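A minimal sketch of the first feedback rule, assuming a simple counter-based queue: receive interrupts are masked when the internal queue fills and unmasked once consumers drain it below a low-water mark. The queue bound, low-water mark, and all names are illustrative, and the timeout path is omitted.

```c
/* Sketch of queue-driven interrupt feedback: mask receive interrupts
 * when the internal queue fills, unmask once it drains below a
 * low-water mark. Thresholds are illustrative, not from the paper. */
#include <stdbool.h>
#include <stdio.h>

#define QUEUE_MAX 8
#define LOW_WATER 2     /* re-enable once the queue drains this far */

static int  queued = 0;
static bool rx_enabled = true;

static void rx_interrupt(void) {
    if (!rx_enabled) return;          /* masked: device buffers packets */
    if (++queued == QUEUE_MAX) {
        rx_enabled = false;           /* saturated: stop taking interrupts */
        printf("rx interrupts disabled\n");
    }
}

static void consume_packet(void) {
    if (queued > 0) queued--;
    if (!rx_enabled && queued <= LOW_WATER) {
        rx_enabled = true;            /* space again: unmask */
        printf("rx interrupts re-enabled\n");
    }
}

int main(void) {
    for (int i = 0; i < 12; i++) rx_interrupt();   /* overload burst */
    for (int i = 0; i < 8; i++)  consume_packet(); /* consumers catch up */
    return 0;
}
```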

Better Scheduling (cont.)
- Use polling to provide fairness
  - Query all sources of packet events round-robin
- Integrate with interrupts
  - Reflects the duality of the approaches: polling for predictable events, interrupts for unpredictable events
- Avoid preemption to ensure progress
  - Either do most work at high IPL
  - Or do hardly any work at high IPL
    - Integrates better with the rest of the kernel
    - Sets a "service needed" flag and schedules the polling thread
    - Gets rid of what?
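A minimal sketch of the second option described on this slide: interrupt handlers only set a per-source "service needed" flag, and a polling thread visits all sources round-robin with a per-source packet quota, so no one source can starve the others (the quota mechanism reappears in the measurements below). The source count, quota value, and backlog model are illustrative.

```c
/* Sketch of the round-robin polling thread with per-source quotas.
 * Interrupt handlers do almost nothing but flag the source. */
#include <stdbool.h>
#include <stdio.h>

#define NSOURCES 3
#define QUOTA    5

static bool service_needed[NSOURCES];
static int  backlog[NSOURCES] = { 12, 3, 7 };  /* pretend pending work */

/* What a receive interrupt handler now does: set a flag, nothing more. */
static void rx_interrupt(int src) { service_needed[src] = true; }

static void polling_thread(void) {
    bool any = true;
    while (any) {
        any = false;
        for (int src = 0; src < NSOURCES; src++) {   /* round-robin */
            if (!service_needed[src]) continue;
            int n = 0;
            while (backlog[src] > 0 && n < QUOTA) {  /* honor the quota */
                backlog[src]--;
                n++;
            }
            printf("source %d: processed %d\n", src, n);
            if (backlog[src] == 0) service_needed[src] = false;
            else any = true;                         /* come back later */
        }
    }
}

int main(void) {
    for (int s = 0; s < NSOURCES; s++) rx_interrupt(s);
    polling_thread();
    return 0;
}
```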

Experimental Setup
- IP packet router built on Digital Unix (DEC OSF/1)
  - Bridges between two (otherwise unloaded) Ethernets
  - Runs on a DECstation 3000/300 running Digital Unix 3.2
    - The slowest Alpha-based host available (around 1996)
- Load generator sends 10,000 UDP packets
  - 4 bytes of data per packet

Unmodified Kernel
[Figure: output packet rate vs. input packet rate for the unmodified kernel, with and without screend]
- With screend, peak at 2,000 pkts/sec, livelock at 6,000 pkts/sec
- Without it, peak at 4,700 pkts/sec, livelock at 14,880 pkts/sec

Unmodified Kernel in Detail
[Diagram: at increasing interrupt priority level, the receive interrupt handler feeds ipintrq, the IP forwarding layer moves packets from ipintrq to the output ifqueue, and the transmit interrupt handler drains it]
- Packets are only discarded after considerable processing
  - I.e., copying (into a kernel buffer) and queueing (into ipintrq)

Modified Kernel
[Diagram: the modified receive interrupt handler defers to polled received-packet processing, which feeds the IP forwarding layer directly; polled transmit packet processing drains the output ifqueues through the modified transmit interrupt handler; the unmodified interrupt path through ipintrq is shown alongside the modified path]
- Where are packets dropped, and how?
- Why retain the transmit queue?

Performance w/o screend
[Figure: output vs. input packet rate for polling (no quota), polling (quota = 5), no polling, and the unmodified kernel]
- Why do we need quotas?
- Why is the livelock worse for the modified kernel?

Performance w/ screend
[Figure: output vs. input packet rate for polling with feedback, polling without feedback, and the unmodified kernel]
- Why is polling not enough?
- What additional change is made?

Packet Count Quotas
[Figures: output vs. input packet rate for quotas of infinity, 100, 20, 10, and 5 packets, without and with screend; the without-screend panel also plots the peak output rate (for any input rate) and the asymptotic output rate (for the peak input rate) as a function of the polling quota]
- What causes the difference?

What About Other Tasks?
- So far, they don't get any cycles. Why?
- Solution: track cycles spent in the polling thread
  - Disable input handling if over a threshold
[Figure: available CPU time vs. input packet rate for thresholds of 25%, 50%, 75%, and 100%]
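A minimal sketch of this feedback, assuming the clock-tick sampling variant mentioned earlier: each tick records whether it landed in the polling thread, and input handling is disabled for the next window whenever the poller's share exceeds the threshold. The window length and threshold are illustrative.

```c
/* Sketch of CPU-time feedback: track the fraction of recent clock
 * ticks spent in the polling thread and disable input handling while
 * it exceeds a threshold. Window and threshold are illustrative. */
#include <stdbool.h>
#include <stdio.h>

#define WINDOW_TICKS  10
#define THRESHOLD_PCT 50   /* cf. the 25/50/75/100% curves in the figure */

static int  poll_ticks = 0;    /* ticks that landed in the polling thread */
static int  total_ticks = 0;
static bool input_enabled = true;

/* Called from the clock tick; in_poller says where the tick landed. */
static void clock_tick(bool in_poller) {
    if (in_poller) poll_ticks++;
    if (++total_ticks == WINDOW_TICKS) {
        int pct = 100 * poll_ticks / total_ticks;
        input_enabled = (pct <= THRESHOLD_PCT);
        printf("poller used %d%%, input %s\n",
               pct, input_enabled ? "enabled" : "disabled");
        poll_ticks = total_ticks = 0;   /* start a new window */
    }
}

int main(void) {
    /* a window where the poller eats 80% of the CPU... */
    for (int i = 0; i < 10; i++) clock_tick(i < 8);
    /* ...then one where user tasks get most of it back */
    for (int i = 0; i < 10; i++) clock_tick(i < 2);
    return 0;
}
```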

Diggin' Real Deep
[Figure: kernel stack level over time (in usec), alternating input processing, ether_output calls, and a timer interrupt, with polling enabled vs. polling disabled]
- What's wrong with this picture?
- How might you fix the problem, and with what trade-off?

Network Monitoring
[Figure: tcpdump capture rate vs. input packet rate against the hypothetical loss-free rate, for /dev/null and disk-file output, each with and without feedback]
- What is the difference from the previous application?
- Where are the MLFRR and the saturation point?

What Do You Think?