Parallel I/O Paradigms. Files and File Semantics. Non-Blocking I/O. Enabling Parallel Operations. Multi-Channel Poll/Select. Worker Threads 3/28/2017

Similar documents
File Systems: Allocation Issues, Naming, and Performance CS 111. Operating Systems Peter Reiher

CISC2200 Threads Spring 2015

CS 111. Operating Systems Peter Reiher

The POSIX AIO interface consists of the following functions: Enqueue a write request. This is the asynchronous analog of write(2).

Real-Time Systems. Real-Time Operating Systems

NPRG051 Advanced C++ Programming 2016/17 Assignment 2

Real-Time Systems Hermann Härtig Real-Time Operating Systems Brief Overview

ECE 650 Systems Programming & Engineering. Spring 2018

ISA 563: Fundamentals of Systems Programming

33. Event-based Concurrency

UNIVERSITY OF ENGINEERING AND TECHNOLOGY, TAXILA FACULTY OF TELECOMMUNICATION AND INFORMATION ENGINEERING SOFTWARE ENGINEERING DEPARTMENT

12: Filesystems: The User View

Why files? 1. Storing a large amount of data 2. Long-term data retention 3. Access to the various processes in parallel

I/O. CS 416: Operating Systems Design Department of Computer Science Rutgers University

Implementation of a Single Threaded User Level Asynchronous I/O Library using the Reactor Pattern

REAL-TIME OPERATING SYSTEMS SHORT OVERVIEW

Asynchronous Events on Linux

OS COMPONENTS OVERVIEW OF UNIX FILE I/O. CS124 Operating Systems Fall , Lecture 2

File Systems Overview. Jin-Soo Kim ( Computer Systems Laboratory Sungkyunkwan University

I/O Models. Kartik Gopalan

Overview. POSIX signals. Generation of signals. Handling signals. Some predefined signals. Real-time Systems D0003E 3/3/2009

we are here Page 1 Recall: How do we Hide I/O Latency? I/O & Storage Layers Recall: C Low level I/O

Fall 2017 :: CSE 306. File Systems Basics. Nima Honarmand

File Systems. What do we need to know?

FILE SYSTEMS. Jo, Heeseung

File Systems: Consistency Issues

CS 111. Operating Systems Peter Reiher

CSE 333 SECTION 3. POSIX I/O Functions

Announcement. Exercise #2 will be out today. Due date is next Monday

File Systems. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Motivation. Operating Systems. File Systems. Outline. Files: The User s Point of View. File System Concepts. Solution? Files!

CSCE 313 Introduction to Computer Systems. Instructor: Dezhen Song

Input / Output. Kevin Webb Swarthmore College April 12, 2018

Today: VM wrap-up Select (if time) Course wrap-up Final Evaluations

Operating Systems CS 217. Provides each process with a virtual machine. Promises each program the illusion of having whole machine to itself

PROCESSES AND THREADS THREADING MODELS. CS124 Operating Systems Winter , Lecture 8

Input/Output Systems

CSE 120 PRACTICE FINAL EXAM, WINTER 2013

Outline. OS Interface to Devices. System Input/Output. CSCI 4061 Introduction to Operating Systems. System I/O and Files. Instructor: Abhishek Chandra

Operating System: Chap13 I/O Systems. National Tsing-Hua University 2016, Fall Semester

File System CS170 Discussion Week 9. *Some slides taken from TextBook Author s Presentation

Assignment 2 Group 5 Simon Gerber Systems Group Dept. Computer Science ETH Zurich - Switzerland

Contents. Error Message Descriptions... 7

Basic OS Progamming Abstrac7ons

December 8, epoll: asynchronous I/O on Linux. Pierre-Marie de Rodat. Synchronous/asynch. epoll vs select and poll

Distributed File Systems

I/O Handling. ECE 650 Systems Programming & Engineering Duke University, Spring Based on Operating Systems Concepts, Silberschatz Chapter 13

Diagram of Process State Process Control Block (PCB)

File Management. Information Structure 11/5/2013. Why Programmers Need Files

The Operating System. Chapter 6

ADVANCED OPERATING SYSTEMS

Basic OS Progamming Abstrac2ons

EECS 3221 Operating System Fundamentals

EECS 3221 Operating System Fundamentals

ELEC 377 Operating Systems. Week 1 Class 2

Lecture 15: I/O Devices & Drivers

Operating Systems CMPSCI 377 Spring Mark Corner University of Massachusetts Amherst

RCU. ò Walk through two system calls in some detail. ò Open and read. ò Too much code to cover all FS system calls. ò 3 Cases for a dentry:

we are here I/O & Storage Layers Recall: C Low level I/O Recall: C Low Level Operations CS162 Operating Systems and Systems Programming Lecture 18

Operating Systems. Operating Systems Professor Sina Meraji U of T

VFS, Continued. Don Porter CSE 506

File Systems. Todays Plan. Vera Goebel Thomas Plagemann. Department of Informatics University of Oslo

UNIX File System. UNIX File System. The UNIX file system has a hierarchical tree structure with the top in root.

CPSC/ECE 3220 Fall 2017 Exam Give the definition (note: not the roles) for an operating system as stated in the textbook. (2 pts.

CSE 120 Principles of Operating Systems

Chapter 3 Processes. Process Concept. Process Concept. Process Concept (Cont.) Process Concept (Cont.) Process Concept (Cont.)

CS370 Operating Systems

SOFTWARE ARCHITECTURE 2. FILE SYSTEM. Tatsuya Hagino lecture URL.

Page 1. Analogy: Problems: Operating Systems Lecture 7. Operating Systems Lecture 7

COMP/ELEC 429/556 Introduction to Computer Networks

Exercise (could be a quiz) Solution. Concurrent Programming. Roadmap. Tevfik Koşar. CSE 421/521 - Operating Systems Fall Lecture - IV Threads

File. File System Implementation. File Metadata. File System Implementation. Direct Memory Access Cont. Hardware background: Direct Memory Access

Real-Time Operating Systems

1 / 23. CS 137: File Systems. General Filesystem Design

Lecture 7: Signals and Events. CSC 469H1F Fall 2006 Angela Demke Brown

Processes. Johan Montelius KTH

Operating Systems 2010/2011

A process. the stack

File Systems. Before We Begin. So Far, We Have Considered. Motivation for File Systems. CSE 120: Principles of Operating Systems.

Section 3: File I/O, JSON, Generics. Meghan Cowan

File. File System Implementation. Operations. Permissions and Data Layout. Storing and Accessing File Data. Opening a File

CS370: System Architecture & Software [Fall 2014] Dept. Of Computer Science, Colorado State University

File Systems. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

IPC, Threads, Races, Critical Sections. Operating Systems Principles. IPC, Threads, Races, Critical Sections. What is a thread?

File Management. Chapter 12

Lecture 8: Other IPC Mechanisms. CSC 469H1F Fall 2006 Angela Demke Brown

Sistemi in Tempo Reale

Inter-Process Communication. IPC, Threads, Races, Critical Sections

Topics. Lecture 8: Other IPC Mechanisms. Socket IPC. Unix Communication

CSE 120: Principles of Operating Systems. Lecture 10. File Systems. February 22, Prof. Joe Pasquale

CLOUD-SCALE FILE SYSTEMS

Device-Functionality Progression

Chapter 12: I/O Systems. I/O Hardware

Group-A Assignment No. 6

The Big Picture So Far. Chapter 4: Processes

Motivation: I/O is Important. CS 537 Lecture 12 File System Interface. File systems

What is a file system

I/O Systems. Jo, Heeseung

CS 5460/6460 Operating Systems

Da-Wei Chang CSIE.NCKU. Professor Hao-Ren Ke, National Chiao Tung University Professor Hsung-Pin Chang, National Chung Hsing University

Transcription:

10I. Polled and Non-Blocking I/O 10J. User-Mode Asynchronous I/O 11A. File Semantics 11B. Namespace Semantics Parallel I/O Paradigms Busy, but periodic checking just in case new input might cause a change of plans Multiple semi-independent streams each requiring relatively simple processing Multiple semi-independent operations each requiring multiple, potentially blocking steps Many balls in the air at all times numerous parallel requests from many clients keeping I/O queues full to improve throughput 1 2 Enabling Parallel Operations Threads are an obvious solution one thread per-stream or per-request streams or requests are handled in parallel when one thread blocks, others can continue There are other parallel I/O mechanisms non-blocking I/O multi-channel poll/select operations asynchronous I/O Non-Blocking I/O check to see if data/room is available but do not block to wait for it this enables parallelism, prevents deadlocks a file can be opened in a non-blocking mode open(name, flags O_NONBLOCK) fcntl(fd, F_SETFL, flags O_NONBLOCK) if data is available, read(2)will return it otherwise it fails with EWOULDBLOCK can also be used with write(2)and open(2) 3 4 Multi-Channel Poll/Select there are multiple possible input sources parallel streams (e.g. ssh input/output) multiple request generating clients poll(2)/select(2) wait for first interesting event a list of file descriptors to be checked a list of interesting events (input, output, error) a maximum time to wait a signal mask (to use while waiting) do read/write on indicated file descriptor(s) Worker Threads Consider a web or remote file system server it receives thousands of requests/second each requires multiple (blocking) operations create a thread to serve each new request Thread creation is relatively expensive continuous creation/destruction seems wasteful solution: recycle the worker threads thread blocks when its operation finishes it is awakened when a new operation needs servicing we still have switching and synchronization 5 6 1

NBIO vs. Poll/Select vs. Threads NBIO very simple occasional checks for unlikely input cost of wasted spins is not a concern Poll/Select efficient multi-stream processing multiple sources of interesting input/event wait for the first available, serve one at a time Parallel Threads for complex operations all can operations proceed in parallel, not just I/O blocking operation does not block other threads None are practical for massive parallelism 7 Asynchronous I/O Huge numbers of parallel I/O operations many parallel clients w/many parallel requests deep I/O queues to improve throughput make sure completions processed correctly thread per operation is too expensive we want to queue many parallel operations receive asynchronous completion notifications OS has always handled high traffic I/O this way increasingly many applications now do as well 8 Scheduling Asynchronous I/O int aio_read( struct aiocb*) struct aiocb{ int aio_filedes; // file descriptor off_t aio_offset; // file offset void *aio_buf; // local buffer int aio_nbytes; // byte count int aio_reqprio; // request priority sigevent aio_sigevent; // notification method if successful, operation has been queued it will complete at some time in the future a very large number of ops can be outstanding 9 Asynchronous Completion we can poll the status of any operation int aio_error( struct aiocb* ) int aio_return( struct aiocb*) returns 0, EINPROGRESS, or completion error we can await completion of some opeations int aio_suspend( struct aiocb*, items, timeout) returns when one or more complete (or timeout) we can cancel or force any operations int aio_cancel( struct aiocb* ) int aio_fsync( fd) 10 Completion Notifications struct sigevent{ int sigev_notify; // by signal or thread int sigev_signo; // notification signal # int sigev_value; // param to signal handler void (*handler)(int);// handler to invoke user-mode analog of completion interrupts completion generates a specified signal completion creates a specified thread 11 Signals for Event Notifications Signals were originally designed for exceptions infrequent events, most often fatal multiple race conditions in handling/disabling bad semantics were good enough Now they are used for event notifications continuous events in normal operation loss of even a single event is unacceptable they need to be safe and reliable We have long known how to do this properly make them more like h/w interrupts 12 2

sigaction(2) int sigaction(int signum, sigaction*new, sigaction*old) struct sigaction{ void (*handler)(int); // handler void (*action)(int, siginfo);// handler sigset_t mask; // signals to block int flags; // handling options mask eliminates reentrancy races siginfopasses much info about cause of signal sigreturn(2) controls return from handler Asynchronous I/O: Back to the Future OS I/O always asynchronous, interrupt driven necessary to achieve throughput and efficiency apps were given comforting synchronous illusion until they needed major throughput and efficiency simpler, more s/w-like mechanisms were tried they were much less efficient they proved race-prone under heavy use h/w interrupt model is refined, well proven if there was a simpler way, we would be using it the same model works well for s/w too 13 14 What is a File a file is a named collection of information primary roles of file system: to store and retrieve data to manage the media/space where data is stored typical operations: where is the first block of this file where is the next block of this file where is block 35 of this file allocate a new block to the end of this file free all blocks associated with this file Data and Metadata File systems deal with two kinds of information Data the contents of the file e.g. instructions of the program, words in the letter Metadata Information about the file e.g. how many bytes are there, when was it created sometimes called attributes both must be persisted and protected stored and connected by the file system 15 16 Sequential Byte Stream Access int infd = open( abc, O_RDONLY); int outfd = open( xyz, O_WRONLY+O_CREATE, 0666); if (infd>= 0 && outfd>= 0) { int count = read(infd, buf, sizeof buf); while( count > 0 ) { write(outfd, buf, count); count = read(infd, inbuf, BUFSIZE); close(infd); close(outfd); Random Access void *readsection(int fd, struct hdr *index, int section) { struct hdr *head = &hdr[section]; off_t offset = head->section_offset; size_t len = head->section_length; void *buf = malloc(len); if (buf!= NULL) { lseek(fd, offset, SEEK_SET); if (read(fd, buf, len) <= 0) { free(buf); buf= NULL; return(buf); 17 18 3

Consistency Model When do new readers see results of a write? read-after-write as soon as possible, data-base semantics this commonly called POSIX consistency read-after-close (or sync/commit) only after writes are committed to storage open-after-close (or sync/commit) each open sees a consistent snapshot explicitly versioned files each open sees a named, consistent snapshot File Attributes basic properties thus far we have focused on a simple model a file is a "named collection of data blocks" in most OS files have more state than this file type (regular file, directory, device, IPC port,...) file length (may be excess space at end of last block)) ownership and protection information system attributes (e.g. hidden, archive) creation time, modification time, last accessed time typically stored in file descriptor structure 19 20 Extended File Types and Attributes extended protection information e.g. access control lists resource forks e.g. configuration data, fonts, related objects application defined types e.g. load modules, HTML, e-mail, MPEG,... application defined properties e.g. compression scheme, encryption algorithm,... Databases a tool managing business critical data table is equivalent of a file system data organized in rows and columns row indexed by unique key columns are named fields within each row support a rich set of operations multi-object, read/modify/write transactions SQL searches return consistent snapshots insert/delete row/column operations 21 22 Object Stores simplified file systems, cloud storage optimized for large but infrequent transfers bucketis equivalent of a file system a bucket contains named, versioned objects objectshave long names in a flat name space objectnames are unique within a bucket an objectis a blob of immutable bytes get all or part of the object put new version, there is no append/update delete Key-Value Stores smaller and faster than an SQL database optimized for frequent small transfers tableis equivalent of a file system a tableis a collection of key/valuepairs keyshave long names in a flat name space keynames are unique within a table value is a (typically 64-64MB) string get/put (entire value) delete 23 24 4

File Names and Name Binding file system knows files by internal descriptors users know files by names names more easily remembered than disk addresses names can be structured to organize millions of files file system responsible for name-to-file mapping associating names with new files changing names associated with existing files allowing users to search the name space there are many ways to structure a name space What is in a Name? directory /home/mark/todo.txt separator base name suffix suffixes and file types file-to-application binding often based on suffix defined by system configuration registry configured per user, or per directory suffix may define the file type (e.g. Windows) suffix may only be a hint (magic # defines type) 25 26 Flat Name Spaces there is one naming context per file system all file names must be unique within that context all files have exactly one true name these names are probably very long file names may have some structure e.g. CAC101.CS111.SECTION1.SLIDES.LECTURE_13 this structure may be used to optimize searches the structure is very useful to users the structure has no meaning to the file system Hierarchical Namespaces directory a file containing references to other files it can be used as a naming context each process has a current working directory names are interpreted relative to directory nested directories can form a tree file name is a path through that tree directory tree expands from a rootnode fully qualified names begin from the root may actually form a directed graph 27 28 A rooted directory tree Hard Links: example root user_1 user_2 user_3 user_1 root user_3 Note that we now associate names with links rather than with files. file_a dir_a (/user_1/file_a) (/user_1/dir_a) (/user_2/) file_c (/user_3/file_c) dir_a (/user_3/dir_a) /user_1/file_a and /user_3/dir_a/ file_a dir_a file_c file_a (/user_1/dir_a/file_a) (/user_3/dir_a/) are both links to the same I-node 29 30 5

Unix-style Hard Links Symbolic Links: example all protection information is stored in the file file owner sets file protection (e.g. read-only) all links provide the same access to the file anyone with read access to file can create new link but directories are protected files too not everyone has read or search access to every directory all links are equal there is nothing special about the owner s link file is not deleted until no links remain to file reference count keeps track of references file_a user_1 root user_3 dir_a (/user_1/file_a) file_c 31 32 Symbolic Links Generalized Directories: Issues another type of special file an indirect reference to some other file contents is a path name to another file Operating System recognizes symbolic links automatically opens associated file instead if file is inaccessible or non-existent, the open fails symbolic link is not a reference to the I-node symbolic links will not prevent deletion do not guarantee ability to follow the specified path Internet URLs are similar to symbolic links Can there be multiple links to a single file? who can create new references to a file? if one reference is deleted, does the file go away? can the file's owner cancel other peoples' access? Can clients work their way back up the tree? Is the namespace truly a tree or is it an a-cyclic directed graph or an arbitrary directed graph Does namespace span multiple file systems? 33 34 select(2) Supplementary Slides int pselect( int nfds, fd_set*readfds, fd_set *writefds, fd_set*exceptfds, struct timeval *timeout, sigset_t*sigmask) fd_set is a bit-map of interesting file descriptors returns when event, timeout, or signal parameters updated to reflect what happened Created in 4.2BSD (1983) older, more widely adopted 35 36 6

poll(2) int ppoll( stuct pollfd*fds, int nfds, struct timespec *, sigset_t*) struct pollfd{ int fd; short events; // requested events short revents;// returned events returns when event, timeout, or signal revents reflect what happened Created in UNIX SVR3 (1986) newer, perhaps a little better thought out 37 Indexed Sequential Files record structured files for database like use records may be fixed or variable length all records have one or more keys records can be accessed by their key sample key: student ID number new file update operations insert record, delete record performance challenges efficient insertion, key searches 38 What is an Index a table of contents for another file lists keys, and pointers to associated records built by examining the real data file enables much faster access to desired records search for records in index rather than file index is much shorter, and sorted by key(s) index is sometimes called an inverted file files are accessed by record location, to find the data index is accessed by data key, to find record location Name Space Structure how many files can have the same name? one per file system... flat name spaces one per directory... hierarchical name spaces how many different names can one file have? a single true name only one true name, but aliases are allowed arbitrarily many do different names have different privileges? does deleting one name make others disappear too? do all names see the same access permissions? 39 40 7