INTRODUCTION TO THE UNIX FILE SYSTEM 1)

Similar documents
Files and Directories Filesystems from a user s perspective

Last Week: ! Efficiency read/write. ! The File. ! File pointer. ! File control/access. This Week: ! How to program with directories

Files and Directories

Fall 2017 :: CSE 306. File Systems Basics. Nima Honarmand

The bigger picture. File systems. User space operations. What s a file. A file system is the user space implementation of persistent storage.

Overview. Unix System Programming. Outline. Directory Implementation. Directory Implementation. Directory Structure. Directories & Continuation

Virtual File System. Don Porter CSE 306

Outline. File Systems. File System Structure. CSCI 4061 Introduction to Operating Systems

The link() System Call. Preview. The link() System Call. The link() System Call. The unlink() System Call 9/25/2017

File and Directories. Advanced Programming in the UNIX Environment

Files and File Systems

File Systems 1. File Systems

File Systems 1. File Systems

Virtual File System. Don Porter CSE 506

Chapter Two. Lesson A. Objectives. Exploring the UNIX File System and File Security. Understanding Files and Directories

Unix Filesystem. January 26 th, 2004 Class Meeting 2

CMPS 105 Systems Programming. Prof. Darrell Long E2.371

we are here Page 1 Recall: How do we Hide I/O Latency? I/O & Storage Layers Recall: C Low level I/O

Advanced Unix/Linux System Program. Instructor: William W.Y. Hsu

Operating System Labs. Yuanbin Wu

Logical disks. Bach 2.2.1

we are here I/O & Storage Layers Recall: C Low level I/O Recall: C Low Level Operations CS162 Operating Systems and Systems Programming Lecture 18

CSI 402 Lecture 11 (Unix Discussion on Files continued) 11 1 / 19

File Systems: Naming

Files and File Systems

Data and File Structures Chapter 2. Basic File Processing Operations

Files and the Filesystems. Linux Files

Physics REU Unix Tutorial

CptS 360 (System Programming) Unit 6: Files and Directories

I/O OPERATIONS. UNIX Programming 2014 Fall by Euiseong Seo

Unix File System. Class Meeting 2. * Notes adapted by Joy Mukherjee from previous work by other members of the CS faculty at Virginia Tech

I/O OPERATIONS. UNIX Programming 2014 Fall by Euiseong Seo

Unix System Architecture, File System, and Shell Commands

Files. Eric McCreath

Chapter 4. File Systems. Part 1

File Systems. What do we need to know?

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

Lecture 3. Introduction to Unix Systems Programming: Unix File I/O System Calls

Lecture 24: Filesystems: FAT, FFS, NTFS

Systems Programming. COSC Software Tools. Systems Programming. High-Level vs. Low-Level. High-Level vs. Low-Level.

The UNIX File System

The UNIX File System

Filesystem Hierarchy and Permissions

Filesystem Hierarchy and Permissions

Getting your department account

CS354/CS350 Operating Systems Winter 2004

Lecture files in /home/hwang/cs375/lecture05 on csserver.

UNIX File System. UNIX File System. The UNIX file system has a hierarchical tree structure with the top in root.

Motivation: I/O is Important. CS 537 Lecture 12 File System Interface. File systems

CS140 Operating Systems Final December 12, 2007 OPEN BOOK, OPEN NOTES

Typical File Extensions File Structure

OPERATING SYSTEMS: Lesson 12: Directories

Motivation. Operating Systems. File Systems. Outline. Files: The User s Point of View. File System Concepts. Solution? Files!

read(2) There can be several cases where read returns less than the number of bytes requested:

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission

How do we define pointers? Memory allocation. Syntax. Notes. Pointers to variables. Pointers to structures. Pointers to functions. Notes.

RCU. ò Dozens of supported file systems. ò Independent layer from backing storage. ò And, of course, networked file system support

Files and Directories

CS 1550 Project 3: File Systems Directories Due: Sunday, July 22, 2012, 11:59pm Completed Due: Sunday, July 29, 2012, 11:59pm

CSE 410: Systems Programming

Unix File and I/O. Outline. Storing Information. File Systems. (USP Chapters 4 and 5) Instructor: Dr. Tongping Liu

File Management. Ezio Bartocci.

To understand this, let's build a layered model from the bottom up. Layers include: device driver filesystem file

File System. Minsoo Ryu. Real-Time Computing and Communications Lab. Hanyang University.

CSci 4061 Introduction to Operating Systems. File Systems: Basics

commandname flags arguments

Chp1 Introduction. Introduction. Objective. Logging In. Shell. Briefly describe services provided by various versions of the UNIX operating system.

Operating System Labs. Yuanbin Wu

File Systems. Todays Plan. Vera Goebel Thomas Plagemann. Department of Informatics University of Oslo

File Systems. CS170 Fall 2018

UNIX File Hierarchy: Structure and Commands

The EXT2FS Library. The EXT2FS Library Version 1.37 January by Theodore Ts o

Filesystems Overview

File System. yihshih

CS333 Intro to Operating Systems. Jonathan Walpole

Preview. System Call. System Call. System Call. System Call. Library Functions 9/20/2018. System Call

Summer June 15, 2010

Directory. File. Chunk. Disk

Hyo-bong Son Computer Systems Laboratory Sungkyunkwan University

CS2028 -UNIX INTERNALS

Chapter 6. File Systems

Chapter 4 - Files and Directories. Information about files and directories Management of files and directories

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

5/8/2012. Creating and Changing Directories Chapter 7

Files

FILE SYSTEMS. Tanzir Ahmed CSCE 313 Fall 2018

PESIT Bangalore South Campus

File Systems. Information Server 1. Content. Motivation. Motivation. File Systems. File Systems. Files

Chapter 1 - Introduction. September 8, 2016

CSC 271 Software I: Utilities and Internals

File Systems: Consistency Issues

Operating Systems. Lecture 06. System Calls (Exec, Open, Read, Write) Inter-process Communication in Unix/Linux (PIPE), Use of PIPE on command line

Input & Output 1: File systems

UNIX File Systems. How UNIX Organizes and Accesses Files on Disk

Chapter 7 File Operations

Files & I/O. Today. Comp 104: Operating Systems Concepts. Operating System An Abstract View. Files and Filestore Allocation

Operating systems. Lecture 7

File Management 1/34

CS 471 Operating Systems. Yue Cheng. George Mason University Fall 2017

Today: File System Functionality. File System Abstraction

Transcription:

INTRODUCTION TO THE UNIX FILE SYSTEM 1) 1 FILE SHARING Unix supports the sharing of open files between different processes. We'll examine the data structures used by the kernel for all I/0. Three data structures are used by the kernel, and the relationships among them determines the effect one process has on another with regard to file sharing. Every process has an entry in the process table. Within each process table entry is a table of open file descriptors, which we can think of as a vector, with one entry per descriptor. Associated with each file descriptor are - The file descriptor flags, -Apointertoafiletableentry. The kernel maintains a file table for all open files, Each file table entry contains - The file status flags for the fie (read, write, append, sync, nonblocking, etc.), - The current file offset, - A pointer to the v-node table entry for the file. Each open file (or device) has a v-node structure. The v-node contains information about the type of file and pointers to functions that operate on the file. For most files the v-node also contains the i-node for the file. This information is read from disk when the file is opened, so that all the pertinent information about the file is readily available. For example, the i-node contains the owner of the file, the size of the file, the device the file is located on, pointers to where the actual data blocks for the file are located on disk, and so on. (We talk more about i-nodes in Section 2 when we describe the typical Unix filesystem in more detail.) Figure 1 shows a pictorial arrangement of these three tables for a single process that has two different files open - one file is open on standard input (file descriptor 0) and the other is open on standard output (file descriptor 1). 1) Selectively from "Advanced Programming in the UNIX Environment," written by W. Richard Stevens 1

Figure 1. Kernel Data Structures for Open Files If two independent processes have the same file open we could have the arrangement shown in Figure 2. We assume here that the first process has the file open on descriptor 3, and the second process has that same file open on descriptor 4. Each process that opens the file gets its own file table entry, but only a single v-node table entry is required for a given file. One reason each process gets its own file table entry is so that each process has its own current offset for the file. Figure 2. Two Independent Process with he Same File Open Given these data structures we now need to be more specific about what happens with certain operations that we've already described. 2

After each write is complete, the current file offset in the file table entry is incremented by the number of bytes written. If this causes the current file offset to exceed the current file size, the current file size in the i-node table entry is set to the current file offset (e.g., the file is extended). If a file is opened with the O_APPEND flag, a corresponding flag is set in the file status flags of the file table entry. Each time a write is performed for a file with this append flag set, the current file offset in the file table entry is first set to the current file size from the i-node table entry. This forces every write to be appended to the current end of file. The lseek function only modifies the current file offset in the file table entry, No I/0 takes place, If a file is positioned to its current end of file using lseek, all that happens is the current file offset in the file table entry is set to the current file size from the i-node table entry. It is possible for more than one file descriptor entry to point to the same file table entry. Note the difference in scope between the file descriptor flags and the file status flags. The former apply only to a single descriptor in a single process, while the latter apply to all descriptors in any process that point to the given file table entry. Everything that we've described so far in this section works fine for multiple processes that are reading the same file. Each process has its own file table entry with its own current file offset. 2Filesystems To appreciate the concept of links to a file, we need a conceptual understanding of the structure of the Unix filesystem. Understanding the difference between an i-node and a directory entry that points to an i-node is also useful. We can think of a disk drive being divided into one or more partitions, Each partition can contain a filesystem, as shown in Figure 3. Figure 3. Disk Drive, Partitions, and a Filesystem 3

The i-nodes are fixed-length entries that contain most of the information about the file. If we examine the filesystem in more detail, ignoring the boot blocks and super block, we could have what is shown in Figure 4. Figure 4. Filesystem in More Detail Note the following points from Figure 4. We show two directory entries that point to the same i-node entry, Every i-node has a link count that contains the number of directory entries that point to the i-node. Only when the link count goes to 0 can the file be deleted (i.e., can the data blocks associated with the file be released). This is why the operation of "unlinking a file" does not always mean "deleting the blocks associated with the file." This is why the function that removes a directory entry is called unlink and not delete. The other type of link is called a symbolic link. With a symbolic link, the actual contents of the file (the data blocks) contains the name of the file that the symbolic link points to. In the example lrwxrwxrwx 1 root 7 Sep 25 07:14 lib -> usr/lib the filename in the directory entry is the three-character string lib and the 7 bytes of data in the file are usr/lib. The file type in the i-node would be S_IFLINK so that the system knows that this is a symbolic link. The i-node contains all the information about the file: the file type, the file's access permission bits, the size of the file, pointers to the data blocks for the file, and so on. Most of the information in the stat structure is obtained from the i-node. Only two items are stored in the directory entry: the filename and the i-node number. 4

Since the i-node number in the directory entry points to an i-node in the same filesystem, we cannot have a directory entry point to an i-node in a different filesystem. This is why the ln command (make a new directory entry that points to an existing file) can't cross filesystems. When renaming a file without changing filesystems, the actual contents of the file need not be moved - all that needs to be done is to have a new directory entry point to the existing i-node and have the old directory entry removed. For example, to rename the file /usr/lib/foo to /usr/foo, if the directories /usr/lib and /usr are on the same filesystem, the contents of the file foo need not be moved. This is how the mv command usually operates. We've talked about the concept of a link count for a regular file, but what about the link count field for a directory? Assume that we make a new directory in the working directory, as in $ mkdir testdir Figure 5 shows the result. Note in this figure we explicitly show the entries for dot and dot-dot. Figure 5. Sample Filesystem after Creating the Directory testdir The i-node whose number 2549 has a type field of "directory" and a link count equal to 2. Any leaf directory (a directory that does not contain any other directories) always has a link count of 2. The value of 2 is from the directory entry that names the directory (testdir) and from the entry for dot in that directory, The i-node whose number is 1267 has a type field of "directory" and a link count that is greater than or equal to 3. The reason we know the link count is greater than or equal to 3 is because minimally it is pointed to from the directory entry 5

that names it (which we don't show in Figure 5), from dot, and from dot-dot in the testdir directory, Notice that every directory in the working directory causes the working directory's link count to be increased by 1. 3 link and unlink Functions As we saw in the previous section, any file can have multiple directory entries pointing to its i-node. The way we create a link to an existing file is with the link function. int link (const char *existingpath, const char *newpath); Returns: 0 if OK, -1 on error This function creates a new directory entry, newpath, that references the existing file existingpath. If the newpath already exists an error is returned. The creation of the new directory entry and the increment of the link count must be an atomic operation 2). To remove an existing directory entry we call the unlink function. int unlink (const char *pathname); Returns: 0 if OK, -1 on error This function removes the directory entry and decrements the link count of the file referenced by pathname. If there are other links to the file, the data in the file is still accessible through the other links. The file is not changed if an error occurs. To unlink a file we must have write permission and execute permission in the directory containing the directory entry since it is the directory entry that we may be removing. Only when the link count reaches 0 can the contents of the file be deleted. One other condition prevents the contents of a file from being deleted - as long as some process has the file open, its contents will not be deleted. When a file is closed the kernel first checks the count of the number of processes that have the file open. If this count has reached 0 then the kernel checks the link count, and if it is 0, the file's contents are deleted. 2) An atomic operation, or atomicity, implies an operation that must be performed entirely or notatall 6

Example Figure 6 opens a file and then unlinks it. It then goes to sleep for 15 second before terminating. #include <fcntl.h> int main (void) { if (open("tempfile", O_RDWR) < 0) printf("open error\n"); if (unlink("tempfile") < 0) printf("unlink error\n"); printf("file unlinked\n"); sleep(15); printf("done\n"); exit(0); } Figure 6. Open a File and Then unlink It This property of unlink is often used by a program to assure that a temporary file it creates won't be left around in case the program crashes. The process creates a file using either open or creat and then immediately calls unlink. The file is not deleted, however, because it is still open. Only when the process either closes the file or terminates (which causes the kernel to close all its open files) is the file deleted. If pathname is a symbolic link, unlink references the symbolic link, not the file referenced by the link. The superuser can call unlink with pathname specifying a directory, but the function rmdir should be used instead to unlink a directory. 7