Outline. Overview. Linux-specific, since kernel 2.6.0

Transcription:

Outline

25 Alternative I/O Models
25.1 Overview
25.2 Signal-driven I/O
25.3 I/O multiplexing: poll()
25.4 Problems with poll() and select()
25.5 The epoll API
25.6 epoll events
25.7 epoll: further API details
25.8 Appendix: I/O multiplexing with select()

Overview

- Like select() and poll(), epoll can monitor multiple FDs
- epoll returns readiness information in similar manner to poll()
- Two main advantages:
    - epoll provides much better performance when monitoring large numbers of FDs
    - epoll provides two notification modes: level-triggered and edge-triggered
        - Default is level-triggered notification
        - select() and poll() provide only level-triggered notification
        - (Signal-driven I/O provides only edge-triggered notification)
- Linux-specific, since kernel 2.6.0

[TLPI 63.4]

Linux/UNIX System Programming, © 2015, Michael Kerrisk. Alternative I/O Models.
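The difference between the two notification modes can be seen with a pipe. The following is a minimal sketch, not code from the slides: the helper name second_wait_result() is hypothetical, and it uses epoll_create1(0) in place of epoll_create(). It writes one byte, calls epoll_wait() twice without reading, and returns what the second call reports.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register the read end of a fresh pipe with the given
   extra flags (0 for level-triggered, EPOLLET for edge-triggered), write one
   byte, then call epoll_wait() twice without reading the byte.
   Returns the result of the second epoll_wait() call, or -1 on error. */
int second_wait_result(unsigned int extra_flags)
{
    int pfd[2];
    struct epoll_event ev, out;
    int result = -1;

    if (pipe(pfd) == -1)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }

    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = pfd[0];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) == 0 &&
            write(pfd[1], "x", 1) == 1 &&
            epoll_wait(epfd, &out, 1, 0) == 1) {
        /* The first call reported the new input in either mode. The byte is
           still unread, so the second call shows how the modes differ. */
        result = epoll_wait(epfd, &out, 1, 0);
    }

    close(epfd);
    close(pfd[0]);
    close(pfd[1]);
    return result;
}
```

With level-triggered notification (extra_flags == 0) the second call should again report the FD, because input is still available. With EPOLLET it should report nothing, because no new input has arrived since the first notification.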

epoll instances

- Central data structure of epoll API is an epoll instance
- Persistent data structure maintained in kernel space
    - Referred to in user space via file descriptor
- Container for two information lists:
    - Interest list: a list of FDs that a process is interested in monitoring
    - Ready list: a list of FDs that are ready for I/O
        - Membership of ready list is a (dynamic) subset of interest list

epoll APIs

The key epoll APIs are:

- epoll_create(): create epoll instance and return FD referring to instance
    - FD is used in the calls below
- epoll_ctl(): modify interest list of epoll instance
    - Add FDs to/remove FDs from interest list
    - Modify events mask for FDs currently in interest list
- epoll_wait(): return items from ready list of epoll instance

epoll kernel data structures and APIs

[Figure: in user space, the file descriptor from epoll_create() refers to an epoll instance in kernel space. The epoll instance holds two lists, each with entries of the form (FD, events, data).]

- Interest list: populated/modified by calls to epoll_ctl()
- Ready list: populated by kernel based on interest list and I/O events; returned by calls to epoll_wait()

Creating an epoll instance: epoll_create()

    #include <sys/epoll.h>
    int epoll_create(int size);

- Creates an epoll instance; returns FD referring to instance
- size:
    - Since Linux 2.6.8: serves no purpose, but must be > 0
    - Before Linux 2.6.8: an estimate of number of FDs to be monitored via this epoll instance
- Returns file descriptor on success, or -1 on error
    - When FD is no longer required it should be closed via close()
- Since Linux 2.6.27, there is an improved API, epoll_create1()
    - See the man page

[TLPI 63.4.1]
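As a rough illustration of what epoll_create1() adds, the sketch below creates an epoll instance with the EPOLL_CLOEXEC flag, which sets the close-on-exec flag on the new FD. The helper names create_epoll_cloexec() and has_cloexec() are hypothetical and are not part of the slides or of the epoll API.

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll instance whose FD is closed
   automatically across exec(). Returns the epoll FD, or -1 on error. */
int create_epoll_cloexec(void)
{
    return epoll_create1(EPOLL_CLOEXEC);
}

/* Hypothetical helper: returns 1 if the close-on-exec flag is set on fd,
   0 if it is not set, or -1 on error. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

An FD obtained from plain epoll_create() does not have the flag set, and epoll_create(0) fails because size must be greater than 0. Either kind of FD is released with close() when no longer required.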

Modifying the epoll interest list: epoll_ctl()

    #include <sys/epoll.h>
    int epoll_ctl(int epfd, int op, int fd, struct epoll_event *ev);

- Modifies the interest list associated with epoll FD, epfd
- fd: identifies which FD in interest list is to have its settings modified
    - E.g., FD for pipe, FIFO, terminal, socket, POSIX MQ, or even another epoll FD
    - (Can't be FD for a regular file or directory)
- op: operation to perform on interest list
- ev: described later (see the epoll_event structure)
- Returns 0 on success, or -1 on error

[TLPI 63.4.2]

epoll_ctl() op argument

The epoll_ctl() op argument is one of:

- EPOLL_CTL_ADD: add fd to interest list of epfd
    - ev specifies events to be monitored for fd
    - If fd is already in interest list, the call fails with EEXIST
- EPOLL_CTL_MOD: modify settings of fd in interest list of epfd
    - ev specifies new settings to be associated with fd
    - If fd is not in interest list, the call fails with ENOENT
- EPOLL_CTL_DEL: remove fd from interest list of epfd
    - ev is ignored
    - If fd is not in interest list, the call fails with ENOENT
- Closing an FD automatically removes it from all epoll interest lists
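The error cases listed for each op can be checked with a small wrapper. This is a sketch only: ctl_result() is a hypothetical helper, and it always monitors for EPOLLIN.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: perform one epoll_ctl() operation on fd, monitoring
   for input. Returns 0 on success, or the errno value on failure. */
int ctl_result(int epfd, int op, int fd)
{
    struct epoll_event ev;

    ev.events = EPOLLIN;
    ev.data.fd = fd;    /* Stored so that epoll_wait() can hand it back */

    if (epoll_ctl(epfd, op, fd, &ev) == -1)
        return errno;
    return 0;
}
```

For the read end of a pipe that is not yet in the interest list, EPOLL_CTL_MOD and EPOLL_CTL_DEL should return ENOENT; after a successful EPOLL_CTL_ADD, a second EPOLL_CTL_ADD should return EEXIST. Passing an FD for a regular file fails with EPERM, which is how the "can't be a regular file or directory" restriction shows up in practice.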

The epoll_event structure

epoll_ctl() ev argument is pointer to an epoll_event structure:

    struct epoll_event {
        uint32_t     events;    /* epoll events (bit mask) */
        epoll_data_t data;      /* User data */
    };

    typedef union epoll_data {
        void     *ptr;          /* Pointer to user-defined data */
        int       fd;           /* File descriptor */
        uint32_t  u32;          /* 32-bit integer */
        uint64_t  u64;          /* 64-bit integer */
    } epoll_data_t;

- ev.events: bit mask of events to monitor for fd
    - (Similar to events mask given to poll())
- data: info to be passed back to caller of epoll_wait() when fd later becomes ready
    - Union field: value is specified in one of the members

Example: using epoll_create() and epoll_ctl()

    int epfd;
    struct epoll_event ev;

    epfd = epoll_create(5);

    ev.data.fd = fd;
    ev.events = EPOLLIN;        /* Monitor for input available */
    epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
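Because data is a union, an application that needs more than the FD can store a pointer to its own record in ev.data.ptr. The sketch below assumes a hypothetical struct conn and hypothetical helpers watch_conn(), next_ready_conn(), and drop_conn(); none of these names come from the slides.

```c
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-FD record; the field names are illustrative only */
struct conn {
    int fd;               /* The monitored FD, kept so we can recover it */
    const char *label;    /* Any application state we want handed back */
};

/* Register c->fd for input, asking epoll to hand back 'c' itself.
   Returns 0 on success, or -1 on error (as epoll_ctl() does). */
int watch_conn(int epfd, struct conn *c)
{
    struct epoll_event ev;

    ev.events = EPOLLIN;
    ev.data.ptr = c;      /* Use the 'ptr' member of the union, not 'fd' */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Wait up to timeout_ms for one event. Returns the record registered for
   the ready FD, or NULL on timeout or error. */
struct conn *next_ready_conn(int epfd, int timeout_ms)
{
    struct epoll_event out;

    if (epoll_wait(epfd, &out, 1, timeout_ms) != 1)
        return NULL;
    return out.data.ptr;
}

/* Stop monitoring c->fd and close it */
void drop_conn(int epfd, struct conn *c)
{
    epoll_ctl(epfd, EPOLL_CTL_DEL, c->fd, NULL);
    close(c->fd);
}
```

Only one member of the union can be in use for a given registration, so a record reached through data.ptr has to carry the FD itself, as the fd field does here.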

Waiting for events: epoll_wait()

    #include <sys/epoll.h>
    int epoll_wait(int epfd, struct epoll_event *evlist, int maxevents, int timeout);

- Returns info about ready FDs in interest list of epoll instance of epfd
- Info about ready FDs is returned in array evlist
    - (Caller allocates this array)
- maxevents: size of the evlist array
    - If more than maxevents events are available, successive epoll_wait() calls round-robin through events

[TLPI 63.4.3]

Waiting for events: epoll_wait() (continued)

- timeout specifies a timeout for call:
    - -1: block until an FD in interest list becomes ready
    - 0: perform a nonblocking poll to see if any FDs in interest list are ready
    - > 0: block for up to timeout milliseconds or until an FD in interest list becomes ready
- Return value:
    - > 0: number of items placed in evlist
    - 0: no FDs became ready within interval specified by timeout
    - -1: an error occurred

[TLPI 63.4.3]

- Info about multiple FDs can be returned in the array evlist
- Each element of evlist returns info about one file descriptor:
    - events is a bit mask of events that have occurred for FD
    - data is the ev.data value specified when the FD was registered with epoll_ctl()
- NB: the FD itself is not returned!
    - Instead, we put FD into ev.data.fd when calling epoll_ctl(), so that it is returned via epoll_wait()
    - (Or, put FD into a structure pointed to by ev.data.ptr)
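The timeout and return-value rules can be tried with a nonblocking poll. This is a minimal sketch: poll_ready_fds() is a hypothetical helper, MAX_EVENTS is an arbitrary size chosen for the example, and it assumes every FD was registered with ev.data.fd set.

```c
#include <sys/epoll.h>

#define MAX_EVENTS 4    /* Arbitrary evlist size for this sketch */

/* Hypothetical helper: check epfd without blocking (timeout == 0) and copy
   the FDs that are ready into ready_fds, which must have room for
   MAX_EVENTS entries. Assumes each FD was registered with ev.data.fd set.
   Returns the number of ready FDs, 0 if none, or -1 on error. */
int poll_ready_fds(int epfd, int *ready_fds)
{
    struct epoll_event evlist[MAX_EVENTS];
    int ready = epoll_wait(epfd, evlist, MAX_EVENTS, 0);

    /* The loop body is skipped when ready is 0 or -1 */
    for (int j = 0; j < ready; j++)
        ready_fds[j] = evlist[j].data.fd;   /* The value we stored earlier */

    return ready;
}
```

For an empty pipe whose read end is in the interest list, the helper should return 0; after one byte is written to the pipe, it should return 1 with the read end's FD in ready_fds[0].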

epoll events

ev.events value given to epoll_ctl() and evlist[].events fields returned by epoll_wait() are bit masks of events

    Bit            Input to       Returned by     Description
                   epoll_ctl()?   epoll_wait()?
    EPOLLIN        yes            yes             Normal-priority data can be read
    EPOLLPRI       yes            yes             High-priority data can be read
    EPOLLRDHUP     yes            yes             Shutdown on peer socket
    EPOLLOUT       yes            yes             Data can be written
    EPOLLONESHOT   yes            no              Disable monitoring after event notification
    EPOLLET        yes            no              Employ edge-triggered notification
    EPOLLERR       no             yes             An error has occurred
    EPOLLHUP       no             yes             A hangup occurred

With the exception of EPOLLONESHOT and EPOLLET, these bit flags have the same meaning as the similarly named poll() bit flags

[TLPI 63.4.3]

Example: altio/epoll_input.c

    ./epoll_input file...

- Monitors one or more files using epoll API to see if input is possible
- Suitable files to give as arguments are:
    - FIFOs
    - Terminal device names
        - (May need to run sleep command in FG on the other terminal, to prevent shell stealing input)
    - Standard input
        - /dev/stdin
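EPOLLONESHOT is one of the two flags with no poll() counterpart. The sketch below shows one way it might be used; watch_once() and rearm() are hypothetical helper names. After one notification the FD stays in the interest list but is disabled, so it is re-enabled with EPOLL_CTL_MOD rather than EPOLL_CTL_ADD.

```c
#include <sys/epoll.h>

/* Hypothetical helper: add fd to the interest list for a single input
   notification. Returns 0 on success, or -1 on error. */
int watch_once(int epfd, int fd)
{
    struct epoll_event ev;

    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: re-enable monitoring of fd after a one-shot
   notification. The FD is still in the interest list (only disabled),
   so EPOLL_CTL_MOD is the operation to use. */
int rearm(int epfd, int fd)
{
    struct epoll_event ev;

    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}
```

With unread data sitting in a pipe, the first epoll_wait() after watch_once() should report the FD, further calls should report nothing, and a call after rearm() should report it again.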

Example: altio/epoll_input.c (1)

    #define MAX_BUF     1000        /* Max. bytes for read() */
    #define MAX_EVENTS     5
                /* Max. number of events to be returned from
                   a single epoll_wait() call */

    int epfd, ready, fd, s, j, numopenfds;
    struct epoll_event ev;
    struct epoll_event evlist[MAX_EVENTS];
    char buf[MAX_BUF];

    epfd = epoll_create(argc - 1);

- Declarations for various variables
- Create an epoll instance, obtaining epoll FD

Example: altio/epoll_input.c (2)

    for (j = 1; j < argc; j++) {
        fd = open(argv[j], O_RDONLY);
        printf("Opened \"%s\" on fd %d\n", argv[j], fd);

        ev.events = EPOLLIN;
        ev.data.fd = fd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
    }

    numopenfds = argc - 1;

- Open each of the files named on command line
- Each file is monitored for input (EPOLLIN)
- fd placed in ev.data, so it is returned by epoll_wait()
- Add the FD to epoll interest list (epoll_ctl())
- Track the number of open FDs

Example: altio/epoll_input.c (3)

    while (numopenfds > 0) {
        printf("About to epoll_wait()\n");
        ready = epoll_wait(epfd, evlist, MAX_EVENTS, -1);
        if (ready == -1) {
            if (errno == EINTR)
                continue;           /* Restart if interrupted
                                       by signal */
            else
                errExit("epoll_wait");
        }
        printf("Ready: %d\n", ready);

- Loop, fetching epoll events and analyzing results
- Loop terminates when all FDs have been closed
- epoll_wait() call places up to MAX_EVENTS events in evlist
    - timeout == -1: infinite timeout
- Return value of epoll_wait() is number of ready FDs

Example: altio/epoll_input.c (4)

        for (j = 0; j < ready; j++) {
            printf("  fd=%d; events: %s%s%s\n", evlist[j].data.fd,
                    (evlist[j].events & EPOLLIN)  ? "EPOLLIN "  : "",
                    (evlist[j].events & EPOLLHUP) ? "EPOLLHUP " : "",
                    (evlist[j].events & EPOLLERR) ? "EPOLLERR " : "");
            if (evlist[j].events & EPOLLIN) {
                s = read(evlist[j].data.fd, buf, MAX_BUF);
                printf("    read %d bytes: %.*s\n", s, s, buf);
            } else if (evlist[j].events & (EPOLLHUP | EPOLLERR)) {
                printf("    closing fd %d\n", evlist[j].data.fd);
                close(evlist[j].data.fd);
                numopenfds--;
            }
        }
    }

- Scan up to ready items in evlist
- Display events bits
- If EPOLLIN event occurred, read some input and display it on stdout
- Otherwise, if error or hangup, close FD and decrement FD count
- Code correctly handles case where both EPOLLIN and EPOLLHUP are set in evlist[j].events
    - Because EPOLLIN is tested first, pending input is read before the FD is closed; the hangup is reported again by a later epoll_wait() call
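The case where both EPOLLIN and EPOLLHUP are set can be reproduced by writing to a pipe and then closing its write end before the data is read. The following sketch factors the if/else logic from the example into a hypothetical function, handle_event(); the enum action type and its values are likewise assumptions made for illustration.

```c
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch */
enum action { DID_NOTHING, DID_READ, DID_CLOSE };

/* Handle one event in the same order as the example: read if input is
   available; otherwise close the FD on hangup or error. On DID_READ,
   *nread holds the return value of read(). */
enum action handle_event(const struct epoll_event *e, char *buf, size_t size,
                         ssize_t *nread, int *numopenfds)
{
    if (e->events & EPOLLIN) {
        /* Input takes priority, even if EPOLLHUP is also set */
        *nread = read(e->data.fd, buf, size);
        return DID_READ;
    } else if (e->events & (EPOLLHUP | EPOLLERR)) {
        close(e->data.fd);
        (*numopenfds)--;
        return DID_CLOSE;
    }
    return DID_NOTHING;
}
```

When a pipe holds unread data and its write end has been closed, the first event for the read end should carry both bits and lead to a read. Once the data has been consumed, the next level-triggered notification should carry EPOLLHUP without EPOLLIN, and the FD is closed at that point.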