
Memory Mapped I/O
Michael Jantz, Prasad Kulkarni
EECS 678 Memory Mapped I/O Lab

Introduction

This lab discusses techniques user-level programmers can use to control how their process' logical address space is mapped to physical memory and to elements of the file system in Linux. It is not directly related to the third project assignment, but it should firm up your understanding of virtual memory, since it uses the virtual memory subsystem to do file I/O.

Our tar file this week is slightly bigger than normal. Go ahead and build the starter code for this lab:

bash> cd eecs678-mmio-lab/; make

Logical Address Space

Loading and storing data between disk and memory is essential to almost all user programs. However, memory management can be complicated: programmers have to manage concurrent access to a shared resource and cope with limited memory. For these reasons, many operating systems abstract main memory into an extremely large, uniform array of storage, separating logical memory as viewed by each user process from physical memory.

Today, we will look at some of the system calls Linux provides to give application programmers simple and efficient access to their process' logical address space.

read_write

This program copies a file using the read and write system calls we've already seen. It takes three command line arguments:
- Source file to copy
- Destination file to copy to
- Buffer size per read and write

Recall that read and write use a fixed-size buffer: data read from the source file is stored in the buffer, then written from that buffer to the destination file. This program uses malloc to allocate memory for this buffer. When a process performs malloc(n), the operating system increases its logical address space by (at least) n bytes and returns a pointer to the first byte of the newly allocated region. A sketch of such a copy loop appears below.
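The starter program itself is not reproduced in this transcription; the following is a minimal, self-contained sketch of a read/write copy of this shape. The variable names, usage message, and error handling are assumptions for illustration, not the actual starter code.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s <src> <dst> <bufsize>\n", argv[0]);
        return 1;
    }
    int fdin  = open(argv[1], O_RDONLY);
    int fdout = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    size_t bufsize = (size_t)atoi(argv[3]);
    char *buf = malloc(bufsize);                     /* fixed-size user-space buffer */
    if (fdin < 0 || fdout < 0 || buf == NULL) {
        perror("setup");
        return 1;
    }

    ssize_t nread;
    while ((nread = read(fdin, buf, bufsize)) > 0) { /* read up to bufsize bytes     */
        if (write(fdout, buf, (size_t)nread) != nread) {
            perror("write");                         /* write exactly what was read  */
            return 1;
        }
    }
    if (nread < 0) {
        perror("read");
        return 1;
    }

    free(buf);
    close(fdin);
    close(fdout);
    return 0;
}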

Running read_write

To run read_write on the provided sample.ogg video with a buffer size of 5 bytes, type:

-bash-3.2$ ./read_write sample.ogg copy.ogg 5

This may take some time. You should verify that the generated copy.ogg is the same as the original sample.ogg. Run:

-bash-3.2$ totem copy.ogg

sample.ogg is only a 12MB file, and copies of this size do not normally take so long. The reason the operation is so slow is that we only allowed our process to allocate 5 additional bytes of memory to perform the copy: every iteration of the read/write loop in the program copies only 5 bytes.

A Faster read_write

We can tweak the buffer-size command line argument to see how the system reacts to larger buffer sizes. We can also use the Unix 'time' command to measure how the process performs with different read/write buffer sizes. time reports the total (real) time needed to execute the program, the time spent executing in user mode, and the time spent executing in system mode. Its resolution is milliseconds:

-bash-3.2$ time ./read_write sample.ogg copy.ogg 10
real    0m4.417s
user    0m0.383s
sys     0m2.949s

After a few buffer size increases, you should see that the speed of the copy stops improving. Increasing the buffer size means the main copy loop does not have to iterate as often to copy the file, but eventually the overhead of the read and write system calls is no longer the copy operation's bottleneck. No matter how the copy is done, 12MB of file data has to be read from disk into memory, and a single disk read is orders of magnitude slower than a system call. Data must also be written out to the new file. Disk I/O eventually becomes the bottleneck.

Segmentation Faults Explained

Now, let's switch to the memmap program. If you run the program in its current state, you will see:

-bash-3.2$ ./memmap sample.ogg copy.ogg
Segmentation fault

Why is this happening? The program executes the statement:

*dst = *src;

dst and src are pointers; the space for the pointers themselves is that of local variables on the stack. However, the memory they point to has not been allocated. In fact, they both point to NULL (0x0), a logical address that is designated as illegal on most systems. When a user process attempts to dereference a logical address that is not in its address space (in C this is done by applying the * operator to a pointer), the kernel sends the process a SIGSEGV signal to indicate it has performed an operation resulting in a segmentation fault.
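To make the failure concrete, here is a stand-alone sketch (not the memmap starter code itself) that reproduces the same situation:

#include <stddef.h>

int main(void)
{
    char *src = NULL;   /* the pointers themselves live on the stack...    */
    char *dst = NULL;   /* ...but both point to the illegal address 0x0    */

    *dst = *src;        /* dereferencing NULL: the kernel delivers SIGSEGV */
    return 0;
}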

Copying in Memory

In this lab, we will copy the sample.ogg file using only direct memory operations: we will dereference a pointer to every byte in the source file and copy the data into space reserved for the destination file. To avoid segmentation faults, we could create a memory buffer for the whole file using malloc, perform a small number of large (or a large number of small) read operations to get the data into the buffer, then write this buffer out to a file. However, Linux provides a way of creating a direct byte-to-byte mapping of a file into your user process' logical address space. Copying data from file to file can then be done by copying it from one region of the program's logical address space to another; the OS takes care of updating the file contents on disk. This approach is called memory mapping and is done using the mmap system call.

mmap

The mmap system call is described below:

void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);

- start: The address at which to create the mapping. If the value passed in is NULL, the kernel will choose the address for you.
- prot: The desired memory protection of the mapping.
- flags: Determines whether updates to the mapping are visible to other processes mapping the same region, and whether updates are carried through to the underlying file.

The contents of the mapping are initialized with length bytes starting at offset in the file referred to by the file descriptor fd. On success, this call returns a pointer to the memory mapped area. In the case of an error, it returns MAP_FAILED ((void *) -1).
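As a quick stand-alone illustration of the call (not part of the lab code), the sketch below maps the lab's sample.ogg read-only and prints its first byte; everything except the file name is an assumed example:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("sample.ogg", O_RDONLY);          /* file name taken from the lab */
    struct stat sb;
    if (fd < 0 || fstat(fd, &sb) < 0) { perror("open/fstat"); return 1; }

    /* Map the whole file read-only; passing NULL lets the kernel pick the address. */
    char *p = mmap(NULL, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    printf("first byte: 0x%02x\n", (unsigned char)p[0]);

    munmap(p, sb.st_size);
    close(fd);
    return 0;
}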

Preparing for the mmap

There are a few preparations we need to make before mapping both the source and destination files into regions of logical memory. Most obviously, mmap takes a length argument, so we should find the size of the input file before calling mmap to create a memory image of its contents. We can use the fstat system call to obtain the size of an open file:

int fstat(int fd, struct stat *buf);

This call stores several statistics about the open file corresponding to fd into the stat structure pointed to by buf. The stat structure is shown below. For our purposes, we will use the st_size field of this structure as the length of the file we wish to copy. For error checking, you should capture the return value of this function, which is 0 on success and -1 on failure, and call err_sys with an appropriate error string if fstat fails. A short sketch of this step appears below.
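A minimal sketch of this step, assuming an already-open source descriptor fdin, the err_sys helper mentioned above, and a variable name (filesize) chosen here only for illustration:

struct stat statbuf;                     /* requires <sys/stat.h> */

/* Obtain the size of the open source file; bail out via err_sys on failure. */
if (fstat(fdin, &statbuf) == -1)
    err_sys("fstat of source file failed");

off_t filesize = statbuf.st_size;        /* used as the length argument for mmap */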

struct stat

struct stat {
    dev_t     st_dev;     /* ID of device containing file */
    ino_t     st_ino;     /* inode number */
    mode_t    st_mode;    /* protection */
    nlink_t   st_nlink;   /* number of hard links */
    uid_t     st_uid;     /* user ID of owner */
    gid_t     st_gid;     /* group ID of owner */
    dev_t     st_rdev;    /* device ID (if special file) */
    off_t     st_size;    /* total size, in bytes */
    blksize_t st_blksize; /* blocksize for filesystem I/O */
    blkcnt_t  st_blocks;  /* number of blocks allocated */
    time_t    st_atime;   /* time of last access */
    time_t    st_mtime;   /* time of last modification */
    time_t    st_ctime;   /* time of last status change */
};

Allocating Destination File Space

Less obviously, notice that we have an open file descriptor for the destination file, but, as of yet, this file descriptor does not correspond to any file contents in blocks on the physical disk. If we map the destination file into memory before writing to that file, we create a situation where we have allocated physical memory that does not correspond to any file contents of the destination file in allocated space on the physical disk. In this state, if the program writes to the mapped memory, the operating system does not know where to store the data when the memory image is written out to the destination file. The result is an error, delivered to our process in the form of a SIGBUS signal.

Allocating Destination File Space

We can get around this by using file operations to force the allocation of space for the file on the file system. We will do this by first seeking to the last byte we may want to write for our copy operation, using the source file size we obtained earlier, and then writing a dummy byte at that location in the file. In effect, writing the last byte requires that all previous bytes exist.

The lseek system call takes a file descriptor and moves its cursor to the position specified by its arguments:

off_t lseek(int fd, off_t offset, int whence);

fd is the file descriptor we wish to seek on, and offset is how far to seek according to the directive whence. You may use (size of the source file - 1) for offset and the SEEK_SET macro to set the cursor to an offset in the destination file corresponding to the size of the source file. Once the cursor has been set, you can simply write a dummy byte to the file using the write system call. This forces the operating system to allocate disk space for the entire file. Remember to check for error conditions on all your system calls; lseek and write both return -1 on error. A sketch of this step is shown below.
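A minimal sketch of this step, under the assumption that fdout is the open destination descriptor, filesize holds the source file's size from fstat, and err_sys is the lab's error helper:

/* Force allocation of destination-file space: seek to what will be the
   last byte of the copy and write a single dummy byte there.            */
if (lseek(fdout, filesize - 1, SEEK_SET) == -1)
    err_sys("lseek on destination file failed");

if (write(fdout, "", 1) != 1)        /* writes one '\0' byte */
    err_sys("write of dummy byte failed");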

Doing the mmap

Now we can go ahead and perform the mmap for both our source and destination files. One call to mmap will use the fdin FD to map the source file; the other will use the fdout FD to map the destination file. Refer to the mmap prototype above for the type and order of the mmap arguments. For both mmap calls we can use the following arguments:
- NULL for the start address, to tell the OS to choose the address at which to create the mapping.
- MAP_SHARED for the flags argument, to tell the OS to carry changes to the mapping through to the underlying file.
- 0 for the offset, to tell the OS we want the mapping to start at the beginning of the file.

The file mapped using fdin should be mapped with read access only (PROT_READ), and the file mapped using fdout should be mapped with both read and write access (PROT_READ | PROT_WRITE). On success, each call returns a pointer to the memory mapped area; as noted earlier, mmap returns MAP_FAILED on error. A sketch of the two calls is shown below.
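A minimal sketch of the two mappings, assuming fdin, fdout, filesize, and err_sys as before (the pointer names src and dst are only illustrative):

/* Requires <sys/mman.h>.  MAP_SHARED carries stores to the destination
   mapping through to the underlying file on disk.                       */
char *src = mmap(NULL, filesize, PROT_READ, MAP_SHARED, fdin, 0);
if (src == MAP_FAILED)
    err_sys("mmap of source file failed");

char *dst = mmap(NULL, filesize, PROT_READ | PROT_WRITE, MAP_SHARED, fdout, 0);
if (dst == MAP_FAILED)
    err_sys("mmap of destination file failed");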

Copying the File

For this part of the lab, we leave it to you to determine how you wish to copy the file, now that you have memory mapped both the source and destination files into your logical address space. One possibility is to start at the pointers returned by mmap and copy each byte of source memory into destination memory, one at a time, in a loop using pointer arithmetic. Another is to use the memcpy library function; see the memcpy man page for details. Both approaches are sketched below.
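For illustration only (the lab leaves the choice to you), both approaches under the same assumed names src, dst, and filesize:

/* Option 1: byte-by-byte copy between the two mappings using pointer arithmetic. */
for (off_t i = 0; i < filesize; i++)
    dst[i] = src[i];

/* Option 2: a single bulk copy with the memcpy library function (<string.h>). */
memcpy(dst, src, filesize);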

Testing

When you are through, you should test your implementation again with the 'time' command, and also make sure your copied video plays. You should find that this implementation is faster than the read_write program with a small buffer, but about the same as read_write with a larger buffer.

Conclusion

In theory, the mmap solution should outperform the read/write solution. The read/write solution copies the contents of the source file (held in a kernel-space buffer) into a user-space buffer, and then copies those contents back into a kernel-space buffer corresponding to the destination file. The mmap solution avoids this extra copy by mapping the kernel-space buffers allocated for the source and destination files directly into our user-level process' virtual address space. However, both methods require that data on the physical disk first be pulled into physical memory before it can be copied, and that the contents of the destination file be present in physical memory before they can be written out to disk. The overhead of these disk read/write operations is the main performance bottleneck in many situations, including this one.