

G22.3250-001: Receiver Livelock. Robert Grimm, New York University

Altogether Now: The Three Questions
- What is the problem?
- What is new or different?
- What are the contributions and limitations?

Motivation
- Interrupts work well when I/O events are rare (think disk I/O)
- In comparison, polling is expensive
  - After all, the CPU doesn't really do anything useful while polling
  - To achieve the same latency as with interrupts, you need to poll tens of thousands of times per second
- But the world has changed: it's all about networking
  - Multimedia, host-based routing, network monitoring, NFS, multicast, broadcast all lead to higher interrupt rates
- Once the interrupt rate is too high, the system becomes overloaded and eventually makes no progress
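The "tens of thousands of times per second" figure follows from simple arithmetic. A minimal sketch (my numbers, assuming a target worst-case latency of about 100 microseconds, which the slide does not state):

```python
# Back-of-the-envelope check of the polling-rate claim (assumed target
# latency, not a figure from the slides): to bound the added latency by
# T microseconds, a poller must run at least once every T microseconds.

def polls_per_second(max_latency_us: float) -> float:
    """Minimum polling frequency to keep worst-case added latency under max_latency_us."""
    return 1e6 / max_latency_us

# Matching a ~100 us interrupt-path latency already requires
# on the order of ten thousand polls per second:
print(polls_per_second(100))  # 10000.0
```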

Avoiding Receive Livelock
- Hybrid design
  - Poll when triggered by an interrupt
  - Interrupt only when polling is suspended
- Result
  - Low latency under low loads
  - High throughput under high loads
- Additional techniques
  - Drop packets early (those with the least investment)
  - Connect with the scheduler (give resources to user tasks)
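The hybrid design above can be sketched as a toy state machine. This is a simplification in Python, not the paper's kernel code; all class and method names are my own, and the polling thread is stood in for by a direct call:

```python
# Sketch of the interrupt/poll hybrid (assumed structure): the interrupt
# handler does almost nothing except mask further interrupts and kick the
# polling thread; the poller drains the device, then re-arms interrupts.

from collections import deque

class HybridNic:
    def __init__(self):
        self.rx_ring = deque()          # packets sitting on the device
        self.interrupts_enabled = True
        self.delivered = []

    def packet_arrives(self, pkt):
        self.rx_ring.append(pkt)
        if self.interrupts_enabled:
            self.interrupt()

    def interrupt(self):
        # Interrupt handler: do hardly any work at high IPL.
        self.interrupts_enabled = False  # suppress further interrupts
        self.poll()                      # stand-in for scheduling the polling thread

    def poll(self):
        # Polling thread: drain the ring, then fall back to interrupts.
        while self.rx_ring:
            self.delivered.append(self.rx_ring.popleft())
        self.interrupts_enabled = True

nic = HybridNic()
for i in range(3):
    nic.packet_arrives(i)
print(nic.delivered)            # [0, 1, 2]
print(nic.interrupts_enabled)   # True
```

Under low load each packet still gets the low latency of an interrupt; under high load, packets arriving while the poller runs are picked up in the same drain without generating new interrupts.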

Requirements for Scheduling Network Tasks
- Acceptable throughput
  - Keep up with the Maximum Loss-Free Receive Rate (MLFRR)
  - Keep transmitting as you keep receiving
- Reasonable latency, low jitter
  - Avoid long queues
- Fair allocation of resources
  - Continue to service management and control tasks
- Overall system stability
  - Do not impact other systems on the network
  - Livelock may look like a link failure and lead to more control traffic

Interrupt-Driven Scheduling: Packet Arrival
- Packet arrival is signaled through an interrupt
  - Associated with a fixed Interrupt Priority Level (IPL)
  - Handled by the device driver
  - The packet is placed into a queue, or dropped if the queue is full
- Protocol processing is initiated by a software interrupt
  - Associated with a lower IPL
- Packet processing may be batched
  - The driver processes many packets before returning
- This gives absolute priority to incoming packets
  - But modern systems have larger network card buffers and DMA

Interrupt-Driven Scheduling: Receive Livelock
- If packets arrive too fast, the system spends most of its time processing receive interrupts
  - After all, they have absolute priority
  - No resources are left to deliver packets to applications
- After reaching the MLFRR, throughput begins to fall again
  - Eventually it reaches 0 (!)
- But doesn't batching help?
  - It can increase the MLFRR
  - But it cannot, by itself, avoid livelock
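The throughput collapse can be illustrated with a toy analytical model. This is my own construction with made-up per-packet costs (ci for interrupt-level handling, cd for delivery), not the paper's measurements: interrupt work is served first, and goodput is whatever the leftover CPU can deliver.

```python
# Toy livelock model (assumed costs): with arrival rate r, per-packet
# interrupt cost ci and per-packet delivery cost cd, interrupts get
# absolute priority, so goodput = min(r, max(0, 1 - r*ci) / cd).
# The MLFRR in this model is 1 / (ci + cd) = 10,000 pkts/sec.

def goodput(r, ci=40e-6, cd=60e-6):
    cpu_left = max(0.0, 1.0 - r * ci)   # interrupt work is served first
    return min(r, cpu_left / cd)

for r in (5_000, 10_000, 15_000, 25_000):
    print(r, round(goodput(r)))
# 5000 5000      below the MLFRR: everything is delivered
# 10000 10000    at the MLFRR: the peak
# 15000 6667     past the peak: goodput falls
# 25000 0        interrupts eat the whole CPU: livelock
```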

Interrupt-Driven Scheduling: Impact of Overload
- Packet delivery latency increases
  - Packets arriving in bursts are processed in bursts
  - Link-level processing: copy into a kernel buffer and queue
  - Dispatch: queue for delivery to the user process
  - Deliver: schedule the user process
- Transmits may starve
  - Transmission is usually performed at a lower IPL than reception
    - Why do we need interrupts for transmission? Don't we just write the data to the interface and say "transmit"?
  - But the system is busy servicing packet arrivals

Better Scheduling
- Limit the interrupt arrival rate to prevent saturation
  - If the internal queue is full, disable receive interrupts
    - For the entire system?
  - Re-enable interrupts once buffer space becomes available or after a timeout
- Track time spent in the interrupt handler
  - If it exceeds a specified fraction of total time, disable interrupts
  - Alternatively, sample the CPU state on clock interrupts
    - When to use this alternative? Why does it work?
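The time-fraction technique above can be sketched as a small accounting object. This is an assumed shape, not kernel code; the class name, limits, and window are mine:

```python
# Sketch of interrupt-time accounting (assumed parameters): charge each
# handler's run time against a window, and mask receive interrupts once
# they consume more than a set fraction of the CPU.

class InterruptGovernor:
    def __init__(self, limit=0.5, window=1.0):
        self.limit = limit       # max fraction of CPU for receive interrupts
        self.window = window     # accounting window in seconds
        self.spent = 0.0         # handler time used so far in this window
        self.enabled = True

    def charge(self, handler_time):
        """Called when an interrupt handler finishes, with its run time."""
        self.spent += handler_time
        if self.spent / self.window > self.limit:
            self.enabled = False  # mask receive interrupts; polling takes over

    def new_window(self):
        self.spent = 0.0
        self.enabled = True       # unmask at the next window boundary

gov = InterruptGovernor(limit=0.5, window=1.0)
for _ in range(6):
    gov.charge(0.1)               # six handlers of 100 ms each in a 1 s window
print(gov.enabled)                # False: interrupts used over 50% of the window
```

The clock-sampling alternative mentioned on the slide trades exact accounting for cheapness: statistically sampling "was the CPU at high IPL?" on clock ticks estimates the same fraction without timestamping every handler.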

Better Scheduling (cont.)
- Use polling to provide fairness
  - Query all sources of packet events round-robin
- Integrate with interrupts
  - Reflects the duality of the two approaches
  - Polling works well for predictable behavior: high load or overload
  - Interrupts work well for unpredictable behavior: light or regular load
- Avoid preemption to ensure progress
  - Do hardly any work at high IPL (rather than most work)
  - Integrates better with the rest of the kernel
  - The handler sets a "service needed" flag and schedules the polling thread
    - Gets rid of what?
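Round-robin polling across event sources can be sketched as follows; the per-source quotas (which the deck asks about later) are what keep one busy interface from starving the others. This is my simplification, with hypothetical source names:

```python
# Sketch of round-robin polling with per-source quotas (assumed shape):
# each event source is serviced at most `quota` packets per pass, so
# service is interleaved in quota-sized batches rather than one source
# draining completely before the next gets a turn.

from collections import deque

def poll_round_robin(sources, quota):
    """sources: dict name -> deque of pending packets. Returns serviced packets."""
    serviced = []
    progress = True
    while progress:
        progress = False
        for name, q in sources.items():
            for _ in range(min(quota, len(q))):
                serviced.append((name, q.popleft()))
                progress = True
    return serviced

src = {"rx": deque(range(5)), "tx": deque(range(2))}
order = poll_round_robin(src, quota=2)
print([name for name, _ in order])
# ['rx', 'rx', 'tx', 'tx', 'rx', 'rx', 'rx']: tx is serviced before rx drains
```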

Livelock in BSD-Based Routers: Experimental Setup
- IP packet router built on Digital Unix (DEC OSF/1)
  - Bridges between two (otherwise unloaded) Ethernets
  - Runs on a DECstation 3000/300 under Digital Unix 3.2
    - The slowest available Alpha host (around 1996)
- Load generator sends 10,000 UDP packets
  - 4 bytes of data per packet

Livelock in BSD-Based Routers: Unmodified Kernel
- With screend, peak at 2,000 pkts/sec, livelock at 6,000 pkts/sec
- Without it, peak at 4,700 pkts/sec, livelock at 14,880 pkts/sec

Livelock in BSD-Based Routers: Unmodified Kernel in Detail
- Packets are only discarded after considerable processing
  - I.e., copying (into a kernel buffer) and queueing (onto ipintrq)

Livelock in BSD-Based Routers: The Modified Path
- Where are packets dropped and how?
- Why retain the transmit queue?

Forwarding Performance Without screend
- Why do we need quotas?
- Why is the livelock worse for the modified kernel than for the original version?

Forwarding Performance With screend
- Why is polling not enough?
- What additional change is made?

Effect of Packet-Count Quotas
- Without screend
- With screend
- What causes the difference?

What About Other User-Space Tasks?
- So far, they don't get any cycles
  - Why?
- Solution: track cycles spent in the polling thread
  - Disable input handling if over a threshold
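The effect of the fix above can be shown with a trivial CPU-share calculation (my numbers; the 75% cap is an assumption for illustration, not a value from the slides):

```python
# Toy schedule (assumed cap): throttle the polling thread at a fixed share
# of the CPU so user tasks always keep the remainder, even under overload.

def split_cpu(offered_poll_demand, poll_cap=0.75):
    """Returns (polling share, user-task share) of one CPU."""
    poll = min(offered_poll_demand, poll_cap)  # input handling is capped
    user = 1.0 - poll                          # leftover cycles for user tasks
    return poll, user

# Under overload the poller would happily take 100% of the CPU,
# but the cap guarantees user code still makes progress:
print(split_cpu(1.0))   # (0.75, 0.25)
```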

Diggin' Real Deep: Kernel Traces for a 3-Packet Burst
- What's wrong with this picture?
- How might they fix the problem?
- What is the trade-off here and why?

Another Application: Network Monitoring
- What is different from the previous application?
- Where are the MLFRR and the saturation point?

What Do You Think?