The Sun Network File System

Martin Herbort, Martin Karlsch
Communication Networks Seminar WS 03/04
Hasso-Plattner-Institute for Software Systems Engineering, Postbox , Potsdam, Germany
{martin.herbort,

Contents
1 Introduction
2 The Sun Network File System
2.1 Participating Protocols
2.2 Implementation Details
2.2.1 Implementing NFS through VFS
2.2.2 NFS Operation Example
2.3 Statelessness and Performance
2.4 Alternatives
    RFS
    AFS
    OSF/DFS
    SMB
2.5 Outlook on NFS v4
2.6 Evaluation

Abstract

We will introduce you to the Sun Network File System (NFS). Before getting into the implementation details, we outline the benefits of network transparency, explain the possibilities that open up to the user and discuss how NFS enables this idiom. We then classify the NFS protocol with respect to the common ISO/OSI reference model and look at both the enabling technologies and important NFS-based services. Originally, NFS was developed for UNIX-based machines, so we provide more detailed information on the implementation in UNIX operating systems. As NFS is a stateless protocol, a number of issues arise that are also discussed, including security and performance questions. Additionally, we briefly characterise and evaluate some alternatives to NFS. (mh, mk)

Keywords: AFS, DFS, Mount, network transparency, NFS, NIS, NLM, RFS, RPC, SMB, VFS.

1 Introduction

Working in a networked environment creates the need for everyone to access data from everywhere within that particular network. Typically, a networked environment consists of at least tens (if not hundreds) of computers that run different operating systems and that differ in their technical equipment and in the purposes they are intended to be used for. Clearly, the probability of changes within such a system grows with its size. But the user expects the system as a whole to appear unchanged every time he uses it. This also means that the file system structure he accesses must depend neither on where the data is really located nor on the computer architecture. In other words, he wants the network to be transparent.

The Sun Network File System (NFS) is a distributed file system that provides the functionality for transparent access to remote computers [2, p.93pp].

It lets the user mount remote file systems as if they were stored on the local machine. NFS is a file access protocol: instead of transferring a required file to a local disk, as is necessary with file transfer protocols like FTP, data can be viewed and manipulated in place, i.e. on the remote machine that effectively contains the file. NFS mainly relies on the Sun Remote Procedure Call (RPC) protocol [1, p.109]. This implies a client-server relationship between the participating computers. Machines that export their file systems to the network are called NFS servers; machines that mount (import) file systems exported by NFS servers are called NFS clients [3, p.5]. A machine can be NFS server and NFS client at the same time. A computer running NFS cannot distinguish remote file systems from local ones [1, p.1]. Every operation on a file system mounted via NFS is resolved into a set of RPC calls. We will introduce RPC and explain the other underlying protocols in a later paragraph (2.1).

The operations defined by the NFS protocol can be divided into file, directory, symbolic link and special device operations [2, p.120]. Due to different file system semantics on different computer architectures, most NFS implementations only work with a subset of these operations. On the other hand, interoperability between any two implementations is a must. See below, but also 2.1 and 2.5.

A considerable part of this paper deals with the implementation of NFS (2.2). Although version 3 (from June 1995) is not the most recently developed version, it is the base for this elaboration. Version 4 from April 2003 is still rarely used (e.g. its implementation state is experimental in the upcoming Linux kernel 2.6).

For client-server scenarios it is important whether clients and/or servers have to keep state when using a particular protocol. NFS is called a stateless protocol because a particular NFS operation in general does not depend on preceding or succeeding NFS operations [2, p.122]. The term "in general" indicates that there are some stateful operations; in fact, caching of recently accessed data and locking of files are such stateful operations [3, p.3, p.9]. But all operations that build the core functionality (i.e. READ and WRITE) are kept stateless. A paragraph is dedicated to the statelessness of NFS (2.3); it covers the problems that arise as well as how the NFS protocol deals with them.

As described more precisely in the evaluation paragraph (2.6), we conclude that NFS is a good and easy-to-use solution to achieve network transparency within Local Area Networks (LANs) whose machines do not differ much with respect to the system architecture used. In highly heterogeneous networks, it is difficult to reach interoperability. This is because different client implementations of a stateless protocol like NFS may make different assumptions about server capabilities. Furthermore, some operations that are not supported natively may need to be emulated. The broadly used NFS version 3 suffers from bad performance when operating via the internet; higher overall performance is one of the design goals of NFS version 4 [4, p.9]. Although mass storage is cheap today, it is worth noting that NFS can also be used to realise clients without built-in mass storage (diskless clients).

Some figures show aspects of NFS operation using FMC modelling techniques. Although it is assumed that the reader of this paper is familiar with the fundamental concepts of FMC, we refer to [19] for further information. (mh)

2 The Sun Network File System

2.1 Participating Protocols

To classify the protocols described below, the following figure shows the NFS protocol stack related to the ISO/OSI layers 1 to 7.

Figure 1: NFS protocol stack

The Sun Network File System itself is an application layer protocol. As mentioned above, NFS mainly relies on the Sun RPC (Sun Microsystems Remote Procedure Call) protocol, which is also referred to as ONC/RPC [2, p.150]. RPC is used as the carrier of requests to a server, where they are resolved to procedures (for details on RPC technology see the appropriate elaboration). NFS is implemented as a set of RPC procedures, each of which works on either an object of type file (e.g. reading and writing files, changing file attributes) or of type file system (e.g. file lookups, static information on the file system). As NFS depends on high throughput, the Sun designers decided to create NFS as an RPC protocol, because RPC was considered simple and highly performant [5, p.6]. It corresponds to the communication layer within the ISO/OSI reference model.

Interoperability across operating system borders was a requirement, too. Different architectures potentially use different byte orders (big/little endian), and architecture-specific compilers may behave differently when assigning data to memory segments. Hence, there was a need for a standardized data presentation format. XDR (eXternal Data Representation) is such a format. As the name already indicates, it corresponds to the presentation layer within the ISO/OSI reference model: it standardizes the way data is encoded. XDR was initially developed by Sun Microsystems and is used by NFS on the presentation layer [2, p.13]. XDR data formats are specified in their own description language, the Remote Procedure Call Language (RPCL). All data types used by NFS and all procedures it provides are described in RPCL. This description language resembles the C programming language, which facilitates the comprehension of the NFS procedures, their parameters and their return values. (mh)
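To make the byte-order problem concrete, the following minimal C sketch (not taken from any NFS source) shows a 32-bit value being put into the big-endian canonical form that XDR prescribes for integers, using the standard htonl/ntohl helpers:

    #include <stdio.h>
    #include <stdint.h>
    #include <arpa/inet.h>   /* htonl(), ntohl() */

    int main(void)
    {
        uint32_t host_value = 0x12345678;          /* value in host byte order    */
        uint32_t wire_value = htonl(host_value);   /* big-endian, as sent on wire */
        uint32_t decoded    = ntohl(wire_value);   /* receiver converts back      */

        /* On a little-endian machine host_value and wire_value differ in their
           memory layout; on a big-endian machine they are identical. */
        printf("host: 0x%08lx  decoded: 0x%08lx\n",
               (unsigned long)host_value, (unsigned long)decoded);
        return 0;
    }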

After we have described how RPC and XDR work, we will now discuss the most important functions defined by the NFS core protocol. They have to be offered and made callable through RPC by a server that implements the NFS protocol. (We use the terms procedure and function interchangeably and do not differentiate between the meanings as is, for example, done in the context of programming languages.) For an in-detail description refer to [3].

NULL: The simplest function. It does nothing and is only used for testing server response and measuring the time needed for that.
GETATTR/SETATTR: Used for getting/setting attributes of a file system object.
LOOKUP: Searches a directory for a given filename and returns a corresponding filehandle when found.
ACCESS: Determines the access privileges a user has to a specified file system object. (Introduced in NFS v3 to ease the handling of systems that implement file security through Access Control Lists [6, p.2].)
CREATE: Creates a regular file.
MKDIR: Creates a directory.
RMDIR: The counterpart to MKDIR; it removes the directory that is passed as an argument.
READ/WRITE: Used when a client wants to read from or write to a file. They have to be called with a valid filehandle. For a read operation, the offset in the file and the number of bytes to read have to be passed to the function; as a result, the read data is returned. The write operation additionally needs the actual data to write as an argument and returns whether the operation succeeded or failed.
COMMIT: Commits data that is still in the server cache to stable storage. (Also introduced in NFS v3, to support asynchronous file writes.)
REMOVE: If the client wants to delete a file on the server, it invokes the REMOVE procedure with the name of the file to delete.
RENAME: As the name already implies, this function is used for renaming files and directories.
FSSTAT: Used when the client needs to retrieve volatile information about the file system state.
FSINFO: For non-volatile information about the file system state or general information about the NFS v3 server implementation, a client can call this procedure.

Additionally, there exist functions for creating and removing hard and symbolic (soft) links, as well as functions like READDIRPLUS that combine other operations. (READDIRPLUS was introduced in NFS v3 to eliminate the LOOKUP calls needed when scanning a directory; it returns attributes and filehandles at the same time. See [3].)

Every operation has to return an appropriate error message if the operation fails. E.g. imagine a REMOVE call issued on a non-existent file: the server would return a "file not found" constant as the result of the RPC.
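To illustrate the request/response style of these procedures, the following C sketch shows a client obtaining a filehandle via LOOKUP and then issuing a READ. The types and stub functions (nfs_fh, nfs_lookup, nfs_read) are hypothetical placeholders for the client stubs an implementation would generate from the RPCL description; the dummies below only fake the behaviour so that the sketch is self-contained.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical types; a real client would use generated RPC stubs. */
    typedef struct { unsigned char data[64]; } nfs_fh;   /* opaque filehandle */
    typedef enum { NFS_OK = 0, NFS_ERR_NOENT = 2 } nfs_status;

    /* Dummy LOOKUP: a real stub would send an RPC with (directory handle, name). */
    static nfs_status nfs_lookup(const nfs_fh *dir, const char *name, nfs_fh *out)
    {
        (void)dir;
        memset(out, 0, sizeof *out);
        strncpy((char *)out->data, name, sizeof out->data - 1);  /* fake handle */
        return NFS_OK;
    }

    /* Dummy READ: a real stub would send an RPC with (filehandle, offset, count). */
    static nfs_status nfs_read(const nfs_fh *file, unsigned long offset,
                               unsigned long count, char *buf, unsigned long *nread)
    {
        (void)file; (void)offset;
        memset(buf, 'x', count);                                 /* fake content */
        *nread = count;
        return NFS_OK;
    }

    int main(void)
    {
        nfs_fh dir_fh = {{0}}, file_fh;
        char buf[4096];
        unsigned long nread = 0;

        /* LOOKUP resolves a name to a filehandle, READ then uses that handle.
           Each call is a self-contained request; no open/close is involved. */
        if (nfs_lookup(&dir_fh, "data", &file_fh) != NFS_OK)
            return 1;
        if (nfs_read(&file_fh, 0, sizeof buf, buf, &nread) != NFS_OK)
            return 1;

        printf("read %lu bytes\n", nread);
        return 0;
    }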

If a server cannot implement a function, it is also possible to return a "not implemented" message. This could happen, for example, when SYMLINK is called on a file system that does not support symbolic links. Another important criterion which has to be met by the functions is that they are idempotent. That means an RPC can be repeated as many times as the client wants without causing any harm (e.g. damaging the file system), and the return values should be the same. Idempotency will be explained in detail in 2.3. (mk)

As we will show in the next part, accessing data over NFS requires the client to obtain an initial filehandle. For this, the Mount protocol is used. E.g. on UNIX systems, the command mount jurassic:/export/test is used on clients to obtain an NFS filehandle for the directory /export/test on the computer jurassic. The Mount protocol is one of many services that may run on a UNIX system and provide certain file system functionality to their clients. Every service is assigned to a port. This mapping typically is dynamic, that is, a portmapper service assigns a free port number to a service just when it starts up. This allows a more flexible use of the fixed set of ports. In that case, the portmapper is the only well-known service; it has port number 111. The following figure shows the process of obtaining the initial filehandle described above. Time flows from top to bottom. The directory /export/data is the exported NFS file system; hence, the initial filehandle is the filehandle representing this directory.

Figure 2: Obtaining the initial filehandle. Source: [5, p.256].

For bigger networks, it is difficult to keep access rights consistent, because several clients from different domains may have the same name. The NIS (Network Information Service) protocol, also known as the YP (Yellow Pages) protocol, is another application layer protocol; it is used for UID/GID mapping consistency [5, p.231] and for naming of files. When using NIS-based security, many clients connect to a single database to gain access rights for NFS file systems.

A fundamental feature of nearly every file system is concurrent access to a single file. To avoid data inconsistencies, the idea of file locking is used: file locking algorithms manage multiple concurrent accesses to single files. NFS version 3 does not implement file locking itself. Instead, file locking within NFS is based on another application layer protocol, the NFS Lock Manager protocol (NLM) [1, p.194]. Like the NFS protocol itself, NLM is defined as a set of RPC procedures. NLM supports advisory locking, that is, any reader or writer of a file should acquire a lock on that file, but this is not enforced. An application following this rule is called cooperating, otherwise it is called noncooperating.
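On UNIX clients, such advisory locks are typically requested through the POSIX fcntl interface; on an NFS mount the client's lock manager forwards them to the server via NLM. A minimal sketch of a cooperating writer (the path is just an example and assumed to lie on an NFS mount):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/mnt/shared/data.txt", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        struct flock lk = {0};
        lk.l_type   = F_WRLCK;   /* exclusive (write) lock        */
        lk.l_whence = SEEK_SET;
        lk.l_start  = 0;
        lk.l_len    = 0;         /* 0 means: lock the whole file  */

        /* F_SETLKW blocks until the lock is granted. The lock is advisory:
           a noncooperating process could still write without locking. */
        if (fcntl(fd, F_SETLKW, &lk) < 0) { perror("fcntl"); close(fd); return 1; }

        /* ... read and modify the file here ... */

        lk.l_type = F_UNLCK;     /* release the lock              */
        fcntl(fd, F_SETLK, &lk);
        close(fd);
        return 0;
    }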

Windows operating systems typically support only mandatory locking, i.e. every write or read system call implicitly results in a sequence of lock, read/write, unlock. That is, Windows applications are always cooperating. The UNIX operating system has a tradition of supporting only advisory locking [5, p.281]. As NFS was originally designed for UNIX systems, NLM implements an advisory locking rule. Hence, there is the possibility of data corruption, e.g. if a file locked by a (cooperating) Windows application is updated by a (noncooperating) UNIX application [5, p.281]. There are exclusive locks, also called write locks, preventing any other access to the locked file, and nonexclusive locks, also called read locks, that permit several clients to hold their own nonexclusive lock on the same file at once. Locking, like caching, is a stateful mechanism. For more detailed information about NFS and statelessness see 2.3.

To provide correct file system semantics, NFS servers must know which clients hold which locks. On the other hand, the clients must know on which servers they hold locks for which files. There are two typical scenarios a file locking protocol must provide solutions for [5, p.278].

The first one is loss of server state. An NFS server running the NLM protocol maintains a data structure of records representing the locks that have been granted to clients. For performance reasons, this state is generally maintained in volatile storage, that is, the server's memory [5, p.278]; see 2.3 for details. The following remarks require the relevant state to be kept on stable storage. In case of a server-side crash or reboot, all clients that held locks on that particular server have to be notified about the crash or reboot. As the server has a table of associated clients (monitored hosts), and this table is stored on stable storage, it can easily inform them about the new situation. This notification is done by the Network Status Monitor.

The second scenario is loss of client state. When a client crashes or reboots, all locks it held (possibly on multiple servers) have to be freed, so that no file stays locked unnecessarily. Therefore, when the crashed client recovers, its status monitor notifies every server in its table of associated servers (monitored hosts).

After a loss of server state, there is a defined time period called the grace period during which only requests to reestablish locks are granted [1, p.194]. As requests for new locks are never granted during this period, the requests for reestablishing locks implicitly have a higher priority.
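A minimal sketch of that server-side rule, with hypothetical names and an assumed grace-period length (not taken from any NLM implementation); the reason for the rule is described right below.

    #include <stdbool.h>
    #include <stdio.h>
    #include <time.h>

    #define GRACE_SECONDS 45          /* assumed length of the grace period */

    /* Hypothetical lock-request descriptor (file, range etc. omitted). */
    struct lock_request { bool reclaim; };

    /* New locks are denied while the grace period is running; reclaim
       requests, which re-establish locks held before the crash, pass. */
    static bool admissible(const struct lock_request *req,
                           time_t server_start, time_t now)
    {
        bool in_grace = (now - server_start) < GRACE_SECONDS;
        return !in_grace || req->reclaim;
    }

    int main(void)
    {
        time_t boot = time(NULL);                   /* pretend we just rebooted */
        struct lock_request fresh   = { .reclaim = false };
        struct lock_request reclaim = { .reclaim = true  };

        printf("new lock granted during grace period: %d\n",
               admissible(&fresh, boot, time(NULL)));
        printf("reclaim granted during grace period:  %d\n",
               admissible(&reclaim, boot, time(NULL)));
        return 0;
    }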

The reason for this is the following: assume client A holds an exclusive lock on file x at server S. S crashes and recovers. S's status monitor notifies all clients in the list of monitored hosts about the reboot, but meanwhile client B successfully requests a nonexclusive lock on file x. Now A is notified about the reboot and tries to reclaim its lock on file x. This fails due to B's read lock. Although A originally possessed the lock earlier than B, it cannot reclaim it. To avoid this unfair behavior, a grace period is used. (mh, mk)

2.2 Implementation Details

We will now focus on the implementation of NFS and the issues that arise. When you try to implement a protocol like NFS, you have to consider a certain number of issues, e.g.: How is compatibility with older versions achieved? How does integration into the file system of the operating system work? How are server crashes handled on the client? These are by far not all questions that need to be answered.

The easiest problem to solve is multiple version support. The RPC protocol has support for versioning a service [3]. Server and client will use the highest version both sides support. The implementer should provide full backwards compatibility when possible. Because NFS v3 was designed with good compatibility to NFS v2 in mind, multiple version support can be added at very low cost [6]. ("NFS Version 3 only defines a revision to NFS Version 2; it does not provide a new model. Because of this, NFS Version 3 resembles NFS Version 2 in design assumptions, file system and consistency model, and method of recovering from server crashes." [6, chapter 4])

The next problem is permission checking: it has to be checked whether the client has the right to access a file or directory or to perform other operations. NFS does not provide its own security scheme; it has to rely on standard operating system protection mechanisms (AUTH_UNIX, AUTH_DES, AUTH_KERB, ...). The main problem with UNIX-style credential checking is that server and client have to share a common database of usernames and groups (technically: usernames on server and client have to map to the same UID, and groups to the same GID), or the server has to offer another way of username mapping. Also, with a stateless service like NFS, whether the user is allowed to read or write a file must be checked on every RPC; every RPC contains the user's UID and a list of the GIDs the user belongs to. This does not preserve the standard UNIX semantics, because normally a check is only performed once, on opening. The implementers have to work around these problems. On most operating systems, a particular user (on UNIX, the UID 0) has access to all files, no matter what permission and ownership they have. This superuser permission may not be allowed on the server, since anyone who can become superuser on his client could gain access to all remote files [3, p.99]. Extra mappings must be provided here. Therefore, NFS servers typically map UID 0 to UID -2, which turns the superuser into a nobody user [5, p.232]. Typically, there is a file /etc/exports that contains information on access rights for the clients. The RPCs themselves also have some security checking: every RPC is validated before it is executed. This is done in the RPC layer and cannot be influenced by NFS. In [3] and [8], the different available security flavours are explained.
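A hypothetical sketch of the UID mapping (root squashing) just described; the function name and the decision to pass all other UIDs through unchanged are assumptions for illustration, and real servers make this behaviour configurable (e.g. via /etc/exports):

    #include <stdio.h>
    #include <sys/types.h>

    #define NOBODY_UID ((uid_t)-2)   /* the "nobody" user mentioned above */

    /* Root squashing: a client-side superuser (UID 0) must not become
       superuser on the server, so UID 0 is mapped to the nobody user.
       All other UIDs are passed through unchanged in this sketch. */
    static uid_t map_client_uid(uid_t client_uid)
    {
        return (client_uid == 0) ? NOBODY_UID : client_uid;
    }

    int main(void)
    {
        printf("uid 0    -> %ld\n", (long)map_client_uid(0));
        printf("uid 1000 -> %ld\n", (long)map_client_uid(1000));
        return 0;
    }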

One of the main reasons for the development of NFS v3 was the need for better performance. The major bottleneck was caused by the inefficient write procedures. Because of the statelessness of the protocol (see 2.3) and the definition of the WRITE RPC in NFS v2, the client had to wait until the data was written to stable storage before it could perform other actions (the synchronous write throughput problem [6]). Normally, it is only allowed to answer a client request after it is fulfilled: consider a WRITE RPC. After the server has received the call, it has to flush the data from the client to stable storage before answering the request.

To improve performance, NFS v3 introduces asynchronous writes accompanied by the COMMIT procedure. The COMMIT call is needed to retain the statelessness. After an asynchronous write call, the server returns control to the client immediately and caches the data to write. If the client decides to close the file, it has to send a COMMIT. (As the protocol is inherently stateless and no close operation exists on the server, "close" here refers only to the state kept on the client side.) This ensures that everything is stored safely; the server must not reply until a safe state is reached. "Asynchronous writes are most effective for large files. A client can send many WRITE requests, and then send a single COMMIT to flush the entire file to disk when it closes the file. This allows the server to do a single large write, which most file systems handle much more efficiently than series of small writes. For very large files, the server can flush data in the background so that most of it will already be on disk when the COMMIT request arrives." [6, chapter 4.3.2]

It is important to mention that the server and, respectively, the clients are free to support asynchronous writes or not, for example in the case that the client is not able to buffer enough data to support server crash recovery with asynchronous writes (this can happen because of memory constraints). The client can always enforce synchronous writes with a special flag passed to the WRITE requests, and the server is allowed to flush the data immediately; the behaviour depends on the implementation. The concept of committed writes is also consistent with the design philosophy of simple servers and powerful clients behind NFS. As the protocol is stateless, the client has to keep an in-memory image of the uncommitted data. If no such image were kept, the data would be lost if the server crashed before a successful commit.

To enforce data consistency, each reply to WRITE and COMMIT has a write verifier attached [6]. The verifier is a simple number which has to be checked by the client on a COMMIT request. (The number is a unique value which has to be changed by the server after a crash.) All numbers returned by the WRITE calls have to be equal to the number returned by the COMMIT request. If that is not the case, the client has to assume that the server crashed during one of the last operations, and all uncommitted data has to be resent; the client retains an image of all uncommitted data, which makes resending possible. The number is normally generated at boot time of the server: servers normally use their boot time as the write verifier, which is unique on every boot. (mk)
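A hedged sketch of the client-side bookkeeping described above; the structure and names are invented for illustration. The client remembers each uncommitted write together with the verifier its WRITE reply carried and compares them against the verifier returned by COMMIT:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdbool.h>

    /* One buffered, not-yet-committed write (hypothetical bookkeeping).
       The data itself would also be retained so it can be resent. */
    struct pending_write {
        uint64_t offset;
        uint64_t length;
        uint64_t verifier;        /* verifier returned in the WRITE reply */
    };

    /* After COMMIT returns, compare its verifier with the ones seen in the
       WRITE replies. A mismatch means the server rebooted in between and
       its cached data may be gone, so everything must be resent. */
    static bool commit_ok(const struct pending_write *w, size_t n,
                          uint64_t commit_verifier)
    {
        for (size_t i = 0; i < n; i++)
            if (w[i].verifier != commit_verifier)
                return false;     /* server crashed: resend uncommitted data */
        return true;              /* data is on stable storage: drop buffers */
    }

    int main(void)
    {
        struct pending_write w[] = {
            { 0,    8192, 0x5e01 },
            { 8192, 8192, 0x5e01 },
        };
        printf("commit accepted: %d\n", commit_ok(w, 2, 0x5e01));
        printf("after a crash:   %d\n", commit_ok(w, 2, 0x7a02));
        return 0;
    }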

Another problem to consider is that a single computer can be NFS client and server at the same time. This may lead to a problem called mountpoint crossing. When a client walks along the hierarchy of a mounted NFS file system (exported by server A) using LOOKUP requests, it could pass a mountpoint on which server A has mounted another file system (e.g. one exported by server B). When the client needs a filehandle for a file within that file system, server A could simply transmit the filehandle (relating to server B) that it uses itself to access this file. But the client would then continue contacting A for file access, although the filehandle it obtained from A relates to server B, and an NFS server cannot reliably distinguish between these two types of filehandles [5, p.228]. The filehandle passed to the client cannot be tagged with additional data, because NFS version 3 filehandles, although flexible, are limited in size; therefore, such tagging will not work indefinitely. Alternatively, server A could maintain a filehandle mapping table that assigns the appropriate server to each filehandle appearing in a client's request. This strategy implies additional state information that has to be stored, recovered and transferred in particular scenarios (e.g. a server crash). Additionally, the client itself can easily mount directories from server B, provided it has sufficient access rights. Therefore, the problem of mountpoint crossing is solved by mounting the exported file system of server B into the client's local file system.

When a client exports a file system to a server and that server in return exports a file system to the client, the exported file hierarchies can be nested one into the other. That is called a namespace cycle. To avoid infinite file hierarchy traversal, NFS servers hide those parts of an exported file system from a client that point back to the local file system of that particular client.

Not directly an implementation detail, but worth mentioning, is the possibility of creating networks with diskless clients. On some operating systems (including SunOS 4.0 and Solaris), diskless clients are supported through NFS [1, p.132pp]. Diskless clients are computers without a built-in mass storage device. To provide this functionality, the operating system's memory manager must do its work in a way that allows a mapping from the content of the swap device to files. Furthermore, an NFS file system must be mountable as the root directory. Setting up and configuring a diskless client requires several steps that depend on the processor and platform architecture. Diskless clients need a server connection to be able to work. That is, the server must contain a copy of the appropriate operating system for every CPU type of the clients. Additionally, platform-specific executables must be held on the server for every platform the clients are based on. (mh)

2.2.1 Implementing NFS through VFS

On UNIX-based operating systems, the NFS protocol is implemented with the help of VFS (Virtual File System, sometimes also called the Virtual File System Switch [11]). It was introduced by Sun around 1985 [12] and first implemented in Sun UNIX and BSD [13]. Today a VFS-like layer is implemented by most UNIX kernels and also by the most popular UNIX clone, Linux [13]. The purpose of VFS is to make different file systems (including ones non-native to the system) appear as a single one [1, p.113]: "A single interface is offered to accommodate diverse file system types cleanly." [14, p.2] All file-system-specific functions and dependencies are hidden. On UNIX systems there is a single interface for operating on files, the syscall layer, and it serves as the hook point for VFS, which translates the system calls into the appropriate calls for the underlying file system. The most important fact is that the UNIX file system semantics are preserved whether you operate on ext3, NFS or even FAT. (On primitive file systems like FAT, not all operations can be supported natively, e.g. symbolic links, but it is still possible to use FAT via VFS.) Every system which implements the VFS interface can easily be plugged into the UNIX file hierarchy.

With the help of VFS, two levels of transparency are added. Level one: to the user and to the system, everything appears to be attached to the local file system; the real nature and location of the data are hidden. As mentioned before, this is achieved by the generic set of operations which are offered and described later. Level two: abstraction from the underlying architecture of the NFS server type, achieved by a special file system object, the vnode [12], which hides the actual physical representation of the data.

VFS distinguishes two types of operations. Actions that operate on the file system, such as getting the amount of free space, are called VFS operations; actions that operate on files or directories, e.g. creating a directory or writing to a file, are called vnode operations. The vnode interface translates the operation into the underlying file system call; the implementation is responsible for this translation, such as returning the timestamp of a file in the VFS data format. For every open file or directory a vnode is created; "open" means in this case "in active use". For example, the vnode of an open local file points to an inode. (Inodes are the base entity of the UNIX file system. The difference between a vnode and an inode is where it is located and when it is valid: inodes are located on disk and are always valid because they contain information that is always needed, such as ownership and protection; vnodes are located in the operating system's memory and only exist while a file is open. However, just one vnode exists for every physical file that is opened.) Every vnode contains a set of attributes and operations which can be applied to the node. The most important attributes and operations are:

Generic operations:
vop_getattr: retrieves an attribute.
vop_setattr: sets an attribute.

Directory-only operations:
vop_mkdir: creates a directory.
vop_rmdir: removes a directory.
vop_create: creates a file.
vop_remove: removes a file.
vop_symlink: creates a symbolic link.

File-only operations:
vop_getpages: reads bytes from a file.
vop_putpages: writes bytes to a file.

Attributes:
owneruserid: identification number of the owning user.
ownergroupid: identification number of the owning group.
filesize: size of the file (for directories, the filesize is zero).
accesstime: time at which the last access to the file/directory occurred.
modifytime: time at which the file/directory was changed the last time.
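As a rough illustration of how such an interface can be expressed in C, the following sketch models a vnode as a structure carrying a pointer to a table of function pointers plus file-system-private data. The names follow the operations and attributes listed above, but the types and layout are simplified assumptions, not the actual SunOS, BSD or Linux definitions.

    #include <stddef.h>
    #include <sys/types.h>
    #include <time.h>

    /* Simplified attribute block, mirroring the attributes listed above. */
    struct vattr {
        uid_t  owneruserid;
        gid_t  ownergroupid;
        off_t  filesize;               /* zero for directories                 */
        time_t accesstime;
        time_t modifytime;
    };

    struct vnode;                      /* forward declaration                  */

    /* Simplified operations table. Each file system type (local file system,
       NFS, FAT, ...) supplies its own implementations of these entries.      */
    struct vnode_ops {
        int (*vop_getattr)(struct vnode *vn, struct vattr *va);
        int (*vop_setattr)(struct vnode *vn, const struct vattr *va);
        int (*vop_mkdir)   (struct vnode *dir, const char *name);
        int (*vop_rmdir)   (struct vnode *dir, const char *name);
        int (*vop_create)  (struct vnode *dir, const char *name);
        int (*vop_remove)  (struct vnode *dir, const char *name);
        int (*vop_symlink) (struct vnode *dir, const char *name, const char *target);
        int (*vop_getpages)(struct vnode *vn, off_t off, size_t len, char *buf);
        int (*vop_putpages)(struct vnode *vn, off_t off, size_t len, const char *buf);
    };

    /* One vnode per file or directory in active use. */
    struct vnode {
        const struct vnode_ops *ops;   /* table of the owning file system      */
        void *fs_private;              /* e.g. inode pointer or NFS filehandle */
    };

    /* The syscall layer stays file-system independent by dispatching: */
    int vfs_getattr(struct vnode *vn, struct vattr *va)
    {
        return vn->ops->vop_getattr(vn, va);
    }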

A valid NFS implementation that is to be used as a VFS plug-in has to implement most of the vnode operations. Figure 3 shows the relationship between NFS and VFS. There are many existing implementations of the vnode/VFS interface [12, 13]. From the preceding descriptions, it is fairly clear how the basic UNIX system calls map onto NFS RPC calls. It is important to note that the NFS RPC protocol and the vnode interface are two different things. The vnode interface defines a set of operating system services that are used to access all file systems, NFS or local. Vnodes simply generalize the interface to file objects. "There are many routines in the vnode interface that correspond directly to procedures in the NFS protocol, but the vnode interface also contains implementations of operating system services such as mapping file blocks and buffer cache management." [1, p.111] The NFS RPC protocol is a specific realization of the vnode interface.

Figure 3: VFS-NFS relationship on a UNIX system.

Because VFS is specific to UNIX, we have to mention that a similar way exists to integrate NFS into Windows NT or higher. Most NFS implementations for Windows are able to map an NFS directory to a drive letter selected by the user. Because the NTFS file system supports soft and hard symbolic links, it is also possible to integrate a remote directory seamlessly into the Windows file system [15].

To integrate NFS into Windows transparently, a mapping of UNIX user IDs to Windows user accounts also has to occur. Windows uses Access Control Lists (ACLs) for permission checking. ACLs allow more fine-grained permission control than the UNIX rwx scheme; for a good explanation we refer to [15]. With ACLs, the rwx scheme can easily be emulated on Windows NFS servers and clients. As the NFS v3 protocol does not support ACLs, Windows users are restricted to an emulated UNIX rwx scheme. An annoying fact is that only commercial NFS implementations for Windows exist [16]. Because these implementations are proprietary, we cannot provide a discussion of their internal functionality. (mk)

2.2.2 NFS Operation Example

We will now provide an example of how a dialog between an NFS server and a client may look. After having covered the NFS protocol and some implementation details, this should give you a good understanding of how NFS works. Assume we are on a UNIX system and execute the following sequence of commands:

    mount ourserver:/export/vol0 /mnt
    dd if=/mnt/home/data bs=32k count=1 of=/dev/null

The mount command mounts a remote file system located on the server ourserver. Then the first 32 KByte of the remote file data are read. This results in the following RPC sequence:

    PORTMAP C GETPORT (MOUNT)
    PORTMAP R GETPORT
    MOUNT   C Null
    MOUNT   R Null
    MOUNT   C Mount /export/vol0
    MOUNT   R Mount OK
    PORTMAP C GETPORT (NFS)
    PORTMAP R GETPORT port=2049
    NULL
    NULL
    FSINFO  FH=0222
    FSINFO  OK
    GETATTR FH=0222
    GETATTR OK
    LOOKUP  FH=0222 home
    LOOKUP  OK FH=ED4B
    LOOKUP  FH=ED4B data
    LOOKUP  OK FH=0223
    ACCESS  FH=0223 (read)
    ACCESS  OK (read)
    READ    FH=0223 at 0 for 32768
    READ    OK (32768 bytes)

At the beginning, the portmapper negotiates a port for the Mount protocol. After the port is obtained, the Mount protocol mounts /export/vol0. Then a port for the core NFS protocol is retrieved. Over this communication port, the first procedure called is NULL. The file system info and the attributes are ascertained next. Finally, the directory home is looked up, and then the file data. In the following ACCESS call the access rights are checked, and with READ the first 32 KByte are read. (mk)

2.3 Statelessness and Performance

Statelessness in a client-server architecture means that each request from a client to a server is stand-alone, that is, no information about the client's state has to be kept by the server [5, p.87pp]. The NFS protocol can be called stateless insofar as it does not require keeping any state. But for performance reasons, most client and server implementations keep some state information. One of the benefits of the statelessness of the NFS protocol is simple server recovery: if the server crashes while communicating with a client, the client just has to retransmit the last request. If NFS were a completely stateful protocol, the server's state would have to be rebuilt, or operations requested by clients would simply have to fail.

Statelessness becomes difficult because NFS, for example, does not specify OPEN or CLOSE semantics on the server. As it is possible for POSIX-compliant clients to remove open files, there is the possibility of a certain type of data inconsistency called the Last Close problem [5, p.227]. This is solved as follows: assume there are two processes, represented by the process IDs 1234 and 3421, working on the file a.txt on one NFS client. Both 1234 and 3421 have opened a.txt. Now 1234 removes a.txt. On the client, this remove is effectively turned into a rename operation that follows certain naming conventions, so 3421 can continue working on a.txt because the filehandle for a.txt is still valid. After the last reference to a.txt has been dropped, the NFS client code physically removes the renamed version of a.txt. Figure 4 illustrates the Last Close problem.

Figure 4: The Last Close problem: removing opened files may be allowed by POSIX-compliant clients. Source: [5, p.227].

Another aspect of statelessness is the idea of idempotent operations. An idempotent operation is one that returns the same result if it is repeated [5, p.88]. Consider a READ request that is repeated with the same arguments: it always returns the same result, which is clear because no data is changed. But what about a REMOVE request? Assume that one such request was executed correctly by the server, but its response to the client does not arrive due to packet loss. The response times out on the client, so the client retransmits the REMOVE request. Now everything works correctly, but the client gets an "operation failed" message, because the file mentioned in the initial request no longer exists. To avoid such behavior, NFS servers implement a duplicate request cache [5, p.235pp]. (Strictly speaking, such caching is not compatible with the notion of statelessness.) Every RPC call has a unique transaction ID. The cache's entries are such RPC calls, indexed by their transaction IDs; the entries also include the initially generated reply. The server has to follow a specific caching policy: if the cache entry indicates a request currently being processed, nothing particular has to be done. If the cache entry indicates that the server replied to the request recently, the request is ignored; it is assumed that the request and the reply crossed each other on the network, and the client will retransmit the request again if that assumption was wrong. If the cache entry indicates that the server replied, but not recently, the cached reply is sent. When using a duplicate request cache, even REMOVE requests are idempotent. As mentioned earlier, the NFS protocol does not require the server to keep that form of state, but in doing so, client implementations are kept less complicated than they would be if they had to handle duplicate requests themselves.
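A hedged sketch of that caching policy; the cache layout, the names and the concrete meaning of "recently" are assumptions for illustration, not taken from any NFS server implementation.

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    #define RECENT_SECONDS 2          /* assumed meaning of "recently" */

    enum entry_state { IN_PROGRESS, REPLIED };

    /* One cache entry: transaction ID, state, time of reply and the saved reply. */
    struct drc_entry {
        uint32_t xid;
        enum entry_state state;
        time_t replied_at;
        char reply[512];              /* initially generated reply (simplified) */
    };

    enum drc_action { PROCESS_NEW, DROP_REQUEST, RESEND_CACHED };

    /* Decide what to do with an incoming request, following the policy above. */
    enum drc_action drc_decide(const struct drc_entry *cache, size_t n,
                               uint32_t xid, time_t now)
    {
        for (size_t i = 0; i < n; i++) {
            if (cache[i].xid != xid)
                continue;
            if (cache[i].state == IN_PROGRESS)
                return DROP_REQUEST;                   /* still being processed    */
            if (now - cache[i].replied_at < RECENT_SECONDS)
                return DROP_REQUEST;                   /* reply probably in flight */
            return RESEND_CACHED;                      /* resend the stored reply  */
        }
        return PROCESS_NEW;                            /* unseen request: execute  */
    }

    int main(void)
    {
        struct drc_entry cache[] = {
            { 17, REPLIED, time(NULL) - 10, "OK" },    /* replied a while ago */
        };
        printf("retransmitted request 17: %d (2 = resend cached reply)\n",
               drc_decide(cache, 1, 17, time(NULL)));
        printf("new request 18:           %d (0 = process normally)\n",
               drc_decide(cache, 1, 18, time(NULL)));
        return 0;
    }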

As NFS servers have to keep only little state on their clients, it is easy to set up a highly available network of NFS servers (HA-NFS, high-availability NFS [5, p.89pp]). Every server can resume another's work, provided all state information (caches, locks) can be transferred quickly. This requires the duplicate request cache and the file locking information to be stored on stable (secondary) storage. As mentioned in 2.1, the NFS Lock Manager generally holds that information in volatile memory, which does not meet the requirements of HA-NFS servers. But "rather than rely on stable storage for recording lock state (and affecting server performance), a more practical scheme is to handle the failover as a fast crash and recovery [to the standby server] so that clients will reclaim their locks" [5, p.90].

The overall performance of NFS is influenced by a number of factors. Of course, the hardware components (e.g. memory, CPU, network interfaces) of NFS servers and clients play a considerable role. But when benchmarking NFS, the most important factor is the operation mix (workload) of the clients. Figure 5 shows typical workloads. The type and size of the caches have a great impact on the performance.

Figure 5: Typical NFS workloads. Source: [5, p.418].

User B's workload shows that the ratio of READ requests is much smaller than in A's workload. This may indicate a bigger cache on B's machine. There are several benchmarks for NFS file systems. SFS 2.0 (System File Server) is the official benchmark for NFS version 3 [5, p.430]. It was developed by SPEC (Standard Performance Evaluation Corporation) and is also used to determine the performance of NFS cluster servers. In general, the most common NFS benchmarks are based on a specified set of NFS operations to test e.g. server load vs. response time.

NFS was originally designed for high-bandwidth and low-latency networks like LANs. As UDP (User Datagram Protocol, connectionless) performs well within such an environment, it was chosen as the transport layer protocol instead of TCP (Transmission Control Protocol, connection-oriented). WANs (Wide Area Networks) like the Internet have low bandwidth and high latency; that is one of the main reasons why the Internet is based on TCP. Typically, TCP clients and servers monitor the link quality and adjust the transmission rate, whereas UDP communication partners send packets of the same size even if the network is congested (see the elaboration "Congestion Control Algorithms" for details on dealing with congested networks). TCP increases the overall performance if it is used for wide area networks, which are typically lossy, i.e. single packets often get lost. If single packets get lost on a TCP connection, only these specific packets need to be retransmitted rather than the whole operation, which means a performance improvement. The impact on NFS of using TCP at the transport layer highly depends on the workload, that is, the frequency of operations of different types; thus, a disadvantageous workload may be responsible for bad performance when NFS is used over TCP. TCP is an inherently stateful protocol, so using stateless NFS over stateful TCP may seem to be a contradiction. But remember that TCP's need to keep state is hidden from higher-level protocols like NFS by the layered (OSI) architecture. For WANs it is recommended to use NFS over TCP, see e.g. [1, p.384]. As TCP implies some overhead if NFS is mainly used in LANs (Local Area Networks), and because specific NFS client or server implementations may not support TCP, NFS over UDP is still state of the art.

The requirement for high-speed networks is typical for all file access protocols. Therefore, using NFS over a 56 Kbps modem connection is not very effective; the usage of a file transfer protocol would be more appropriate. For clients with a very limited network connection, it is easier to transfer a file once, modify it and transfer it back to the file transfer server. The reason for this is that the overhead implied by the file access protocol requests is no longer negligible due to the small bandwidth.

WebNFS is an extension to the NFS protocol that facilitates using NFS file systems over the internet and improves performance. Instead of connecting to a portmapper to determine the port for the Mount daemon, which would then transmit an initial filehandle, WebNFS servers provide a public filehandle.

Servers that implement the WebNFS extensions mark directories as public, associating them with the public filehandle. Clients specify an NFS URL to connect to exported resources. E.g. nfs://gamma:2010/x/y indicates the resource y in the directory x of server gamma, whose NFS server application listens to port 2010. WebNFS implementations typically use TCP at the transport layer because the transmission is more reliable compared to UDP and because TCP is handled more kindly by network firewalls [5, p.440]. Transmission overhead is reduced because the WebNFS protocol defines a multicomponent LOOKUP request, that is, a LOOKUP request can contain a complete pathname instead of just one component. But even WebNFS cannot completely solve the performance issues when using NFS over TCP: NFS depends on other protocols that may not have well-known ports, making their usage difficult because of a firewall's port policy. Figure 6 illustrates this situation. In NFS version 4, the NFS port is the only port clients need to know when they want to access remote file systems via NFS. The reason for this is that NFS version 4 itself implements features like locking and mounting. This is shown in the figure by the separate client-server system at the bottom.

Figure 6: Connecting to an NFS server through a firewall (FMC static structure).

NFS version 4 integrates much functionality that, in version 3, resides in other protocols. Therefore, NFS version 4 will also increase performance when used over the Internet (see 2.5 for more details). Other steps taken to speed up NFS version 3 follow the read-ahead and write-behind strategies; these strategies rely on asynchronous requests. Clients that begin to read a file tend to read the whole file. To speed up server response time, a read-ahead strategy is used: the server assumes that the client will later request another part of the file and caches the whole file content, so that succeeding read requests can be answered more quickly. Correspondingly, a write-behind strategy is used to speed up server response time for write requests. In NFS version 3, write requests are synchronous. This causes the client to block until the data is written to the server and the operation returns. Waiting for the disk to complete the write operation takes long, and therefore write requests are expensive.

When using write-behind, those requests are asynchronous, i.e. the operation returns to the client immediately and the server flushes the data to disk later. The disadvantage is that, when the client sends more NFS requests than the server can handle, requests dropped by the server must be retransmitted. This implies lower throughput in heavy-load situations.

2.4 Alternatives

RFS

In the 1980s, the Remote File Sharing (RFS) protocol was a competitor to NFS [5, p.361pp]. It was part of AT&T's UNIX System V Release 3 and aimed to preserve full UNIX file system semantics for remote files [5, p.363]. Therefore, it only supports UNIX clients, limiting the operation of RFS to pure UNIX networks. This limitation, and the fact that RFS is a stateful protocol, is the reason why problems like the Last Close problem (see 2.3), which NFS cannot solve cleanly, do not occur. RFS, like NFS version 4, directly supports file locking. Remote access is extended to devices like tape drives. The main difference to NFS is that RFS is based on a remote system call paradigm: every system call on the client (open, close, read, write, ...) is sent via RPC to the server's system call interface; there is no set of specified procedures as in NFS. As UNIX system calls depend on the context of the user process, a context corresponding to the original context on the client has to be set up on the server, which results in poor performance [5, p.4]. When mounting an NFS-exported file system, the server name and pathname must be provided. RFS uses a name server to advertise a file system under a resource name; therefore, when mounting an RFS file system, only the resource name is necessary. An example:

    NFS: mount server:/home/jane /mnt
    RFS: mount -d HOMEJANE /mnt

Traversing file system structures may cross into an RFS file system. Then, the remaining part of the pathname is sent to the RFS server as a whole. The evaluation of such a multi-component pathname is difficult because, e.g., the usage of ".." may lead back into the client's local file system. Therefore, NFS evaluates pathnames one component at a time. This requires a new NFS request for each component of the pathname, but local paths are detected more reliably. RFS uses only rudimentary security mechanisms: there are tables for UID/GID mapping and password-protected server connects. NFS makes use of the Network Information Service (NIS, see 2.1) and, with version 4, integrates extended security flavors like Kerberos version 5. (mh)

AFS

AFS (Andrew File System) is another alternative to NFS [5, p.364pp]. It was originally designed to distribute file systems across the campus of Carnegie Mellon University. It is based on a proprietary RPC mechanism called Rx. AFS, like NFS, supports UNIX file system semantics. Clients access files in a global namespace, as in RFS; within NFS, global namespaces require the use of an additional naming service, typically NIS (see above). One of AFS's main features is its aggressive caching policy: all data a client accesses is transferred to a cache on the client, including attributes and symbolic links. Therefore, AFS servers scale well when the number of clients increases heavily; "the Andrew [file] system is targeted to span over 5000 workstations" [18, p.550]. When data changes on the server, a callback message is sent to all clients that hold a cached copy of that data, informing them of this inconsistency. A client that writes its modified cached data back to the server overwrites the changes of any other client before it; this is similar to the lost update problem known from database systems. AFS always uses the integrated Kerberos security system to authenticate users and ACLs (Access Control Lists) to authorize file access for authenticated users. NFS version 3 hardly provides any security mechanisms itself, but instead implements standardized interfaces that allow different existing security systems, such as Kerberos, to be used [17, p.644]. AFS is the origin of the Coda (Constant Data Availability) file system, which allows clients to be temporarily disconnected from the file servers without losing access to the files [17, p.605]. This functionality is provided by the caching strategies described above; especially mobile users can benefit from this support [17, p.575]. (mh)

OSF/DFS

The DFS file system, a commercial implementation of AFS, is an enhancement to AFS [5, p.369]. It has been developed by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE). DFS addresses the problem of cache inconsistency with a server-based token scheme. A token manager grants tokens to clients that want to perform file operations of different types. Tokens represent access modes to data and are defined by classes and types: e.g. a token of the class "data" can be of type "write" or "read". Tokens of different classes do not conflict; tokens of the same class do. When a client wants to write to a file, it tries to acquire a data write token. If another client already holds such a token, there is a conflict. The token manager detects those conflicts and notifies all clients with conflicting tokens (revocation request); they will flush their data back to the server before responding to the server's message. Tokens are associated with a lifetime value. When it expires, the server does not have to send a revocation request to the owning client, because the client knows that it has to write the data back in time; this avoids unnecessary network traffic.

As in NFS, there exists a grace period after a server crash during which only requests for reclaiming tokens are allowed; any request for new tokens is denied. As DFS is directly derived from AFS, the advantages and disadvantages of AFS persist within DFS. (mh)

SMB

SMB (Server Message Block) is a file access protocol native to DOS and Windows operating systems. One of its design goals was the preservation of DOS/FAT file system semantics. There is an enhanced version of SMB called CIFS (Common Internet File System). CIFS is native to most operating systems of the Windows series; therefore, setting up and administering file sharing in a Windows network is much easier with CIFS than with NFS. The Samba server is a program suite that includes CIFS-based software for UNIX servers. It provides file and print services to PC clients. To achieve this, it is sufficient to install Samba on a UNIX server, rather than to install software on multiple clients, as is necessary with PCNFSD. PCNFSD is a client-side extension to NFS that provides authentication and printing functionality, especially to DOS clients. (mh)

2.5 Outlook on NFS v4

In this paper we have mainly described NFS up to version 3. The reason is that version 3 is far more widespread and more heavily used than the new version 4: implementations of version 3 exist for every major operating system, which is not the case for the new version. Currently, only an experimental implementation for FreeBSD is available, which is based on RFC 3530 [4]. A working draft of the version 4 protocol specification has also been submitted to the Internet Engineering Steering Group for consideration as a Proposed Standard [7]. Version 4 is inherently different from all previous versions. The problems addressed are: the improvement of functionality over the internet (the handling of firewalls should be improved, the performance in case of low bandwidth and high latency should be improved, and servers should scale better to large numbers of clients); stronger security checking (it should be ensured that the client fulfils at least a server-defined minimal security contract); and better multi-platform support (NFS v3 was tied relatively closely to UNIX semantics). How are these goals achieved? The biggest similarities with the previous versions are that version 4 is still based on Remote Procedure Calls and on the External Data Representation mechanism. Everything else has drastically changed. To satisfy the ...


More information

OPERATING SYSTEMS II DPL. ING. CIPRIAN PUNGILĂ, PHD.

OPERATING SYSTEMS II DPL. ING. CIPRIAN PUNGILĂ, PHD. OPERATING SYSTEMS II DPL. ING. CIPRIAN PUNGILĂ, PHD. File System Implementation FILES. DIRECTORIES (FOLDERS). FILE SYSTEM PROTECTION. B I B L I O G R A P H Y 1. S I L B E R S C H AT Z, G A L V I N, A N

More information

File-System Structure

File-System Structure Chapter 12: File System Implementation File System Structure File System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Log-Structured

More information

COS 318: Operating Systems. Journaling, NFS and WAFL

COS 318: Operating Systems. Journaling, NFS and WAFL COS 318: Operating Systems Journaling, NFS and WAFL Jaswinder Pal Singh Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) Topics Journaling and LFS Network

More information

Chapter 12: File System Implementation. Operating System Concepts 9 th Edition

Chapter 12: File System Implementation. Operating System Concepts 9 th Edition Chapter 12: File System Implementation Silberschatz, Galvin and Gagne 2013 Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods

More information

Operating Systems Design 16. Networking: Remote File Systems

Operating Systems Design 16. Networking: Remote File Systems Operating Systems Design 16. Networking: Remote File Systems Paul Krzyzanowski pxk@cs.rutgers.edu 4/11/2011 1 Accessing files FTP, telnet: Explicit access User-directed connection to access remote resources

More information

Chapter 12: File System Implementation

Chapter 12: File System Implementation Chapter 12: File System Implementation Silberschatz, Galvin and Gagne 2013 Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods

More information

Filesystems Lecture 11

Filesystems Lecture 11 Filesystems Lecture 11 Credit: Uses some slides by Jehan-Francois Paris, Mark Claypool and Jeff Chase DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM R. Sandberg, D. Goldberg S. Kleinman, D. Walsh,

More information

416 Distributed Systems. Distributed File Systems 2 Jan 20, 2016

416 Distributed Systems. Distributed File Systems 2 Jan 20, 2016 416 Distributed Systems Distributed File Systems 2 Jan 20, 2016 1 Outline Why Distributed File Systems? Basic mechanisms for building DFSs Using NFS and AFS as examples NFS: network file system AFS: andrew

More information

DFS Case Studies, Part 1

DFS Case Studies, Part 1 DFS Case Studies, Part 1 An abstract "ideal" model and Sun's NFS An Abstract Model File Service Architecture an abstract architectural model that is designed to enable a stateless implementation of the

More information

Chapter 17: Distributed-File Systems. Operating System Concepts 8 th Edition,

Chapter 17: Distributed-File Systems. Operating System Concepts 8 th Edition, Chapter 17: Distributed-File Systems, Silberschatz, Galvin and Gagne 2009 Chapter 17 Distributed-File Systems Background Naming and Transparency Remote File Access Stateful versus Stateless Service File

More information

Category: Informational October 1996

Category: Informational October 1996 Network Working Group B. Callaghan Request for Comments: 2054 Sun Microsystems, Inc. Category: Informational October 1996 Status of this Memo WebNFS Client Specification This memo provides information

More information

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9b: Distributed File Systems INTRODUCTION. Transparency: Flexibility: Slide 1. Slide 3.

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9b: Distributed File Systems INTRODUCTION. Transparency: Flexibility: Slide 1. Slide 3. CHALLENGES Transparency: Slide 1 DISTRIBUTED SYSTEMS [COMP9243] Lecture 9b: Distributed File Systems ➀ Introduction ➁ NFS (Network File System) ➂ AFS (Andrew File System) & Coda ➃ GFS (Google File System)

More information

Module 17: Distributed-File Systems

Module 17: Distributed-File Systems Module 17: Distributed-File Systems Background Naming and Transparency Remote File Access Stateful versus Stateless Service File Replication Example Systems Operating System Concepts 17.1 Silberschatz

More information

Chapter 11: File-System Interface

Chapter 11: File-System Interface Chapter 11: File-System Interface Silberschatz, Galvin and Gagne 2013 Chapter 11: File-System Interface File Concept Access Methods Disk and Directory Structure File-System Mounting File Sharing Protection

More information

Category: Informational October 1996

Category: Informational October 1996 Network Working Group B. Callaghan Request for Comments: 2055 Sun Microsystems, Inc. Category: Informational October 1996 Status of this Memo WebNFS Server Specification This memo provides information

More information

Distributed Objects and Remote Invocation. Programming Models for Distributed Applications

Distributed Objects and Remote Invocation. Programming Models for Distributed Applications Distributed Objects and Remote Invocation Programming Models for Distributed Applications Extending Conventional Techniques The remote procedure call model is an extension of the conventional procedure

More information

CS307: Operating Systems

CS307: Operating Systems CS307: Operating Systems Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building 3-513 wuct@cs.sjtu.edu.cn Download Lectures ftp://public.sjtu.edu.cn

More information

ò Server can crash or be disconnected ò Client can crash or be disconnected ò How to coordinate multiple clients accessing same file?

ò Server can crash or be disconnected ò Client can crash or be disconnected ò How to coordinate multiple clients accessing same file? Big picture (from Sandberg et al.) NFS Don Porter CSE 506 Intuition Challenges Instead of translating VFS requests into hard drive accesses, translate them into remote procedure calls to a server Simple,

More information

NFS. Don Porter CSE 506

NFS. Don Porter CSE 506 NFS Don Porter CSE 506 Big picture (from Sandberg et al.) Intuition ò Instead of translating VFS requests into hard drive accesses, translate them into remote procedure calls to a server ò Simple, right?

More information

What is a file system

What is a file system COSC 6397 Big Data Analytics Distributed File Systems Edgar Gabriel Spring 2017 What is a file system A clearly defined method that the OS uses to store, catalog and retrieve files Manage the bits that

More information

Chapter 11: Implementing File Systems

Chapter 11: Implementing File Systems Silberschatz 1 Chapter 11: Implementing File Systems Thursday, November 08, 2007 9:55 PM File system = a system stores files on secondary storage. A disk may have more than one file system. Disk are divided

More information

Chapter 11: File System Implementation. Objectives

Chapter 11: File System Implementation. Objectives Chapter 11: File System Implementation Objectives To describe the details of implementing local file systems and directory structures To describe the implementation of remote file systems To discuss block

More information

Distributed Systems - III

Distributed Systems - III CSE 421/521 - Operating Systems Fall 2012 Lecture - XXIV Distributed Systems - III Tevfik Koşar University at Buffalo November 29th, 2012 1 Distributed File Systems Distributed file system (DFS) a distributed

More information

Introduction to the Network File System (NFS)

Introduction to the Network File System (NFS) Introduction to the Network File System (NFS) What was life like before NFS? Introduction to the Network File System (NFS) NFS is built on top of: UDP - User Datagram Protocol (unreliable delivery) Introduction

More information

AN OVERVIEW OF DISTRIBUTED FILE SYSTEM Aditi Khazanchi, Akshay Kanwar, Lovenish Saluja

AN OVERVIEW OF DISTRIBUTED FILE SYSTEM Aditi Khazanchi, Akshay Kanwar, Lovenish Saluja www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 2 Issue 10 October, 2013 Page No. 2958-2965 Abstract AN OVERVIEW OF DISTRIBUTED FILE SYSTEM Aditi Khazanchi,

More information

Network File System (NFS)

Network File System (NFS) Network File System (NFS) Nima Honarmand User A Typical Storage Stack (Linux) Kernel VFS (Virtual File System) ext4 btrfs fat32 nfs Page Cache Block Device Layer Network IO Scheduler Disk Driver Disk NFS

More information

Today: Distributed File Systems. File System Basics

Today: Distributed File Systems. File System Basics Today: Distributed File Systems Overview of stand-alone (UNIX) file systems Issues in distributed file systems Next two classes: case studies of distributed file systems NFS Coda xfs Log-structured file

More information

Status of the Linux NFS client

Status of the Linux NFS client Status of the Linux NFS client Introduction - aims of the Linux NFS client General description of the current status NFS meets the Linux VFS Peculiarities of the Linux VFS vs. requirements of NFS Linux

More information

Today: Distributed File Systems

Today: Distributed File Systems Today: Distributed File Systems Overview of stand-alone (UNIX) file systems Issues in distributed file systems Next two classes: case studies of distributed file systems NFS Coda xfs Log-structured file

More information

Network File Systems

Network File Systems Network File Systems CS 240: Computing Systems and Concurrency Lecture 4 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. Abstraction, abstraction, abstraction!

More information

Module 17: Distributed-File Systems

Module 17: Distributed-File Systems Module 17: Distributed-File Systems Background Naming and Transparency Remote File Access Stateful versus Stateless Service File Replication Example Systems 17.1 Background Distributed file system (DFS)

More information

Week 12: File System Implementation

Week 12: File System Implementation Week 12: File System Implementation Sherif Khattab http://www.cs.pitt.edu/~skhattab/cs1550 (slides are from Silberschatz, Galvin and Gagne 2013) Outline File-System Structure File-System Implementation

More information

CSE 486/586: Distributed Systems

CSE 486/586: Distributed Systems CSE 486/586: Distributed Systems Distributed Filesystems Ethan Blanton Department of Computer Science and Engineering University at Buffalo Distributed Filesystems This lecture will explore network and

More information

Distributed File Systems. CS 537 Lecture 15. Distributed File Systems. Transfer Model. Naming transparency 3/27/09

Distributed File Systems. CS 537 Lecture 15. Distributed File Systems. Transfer Model. Naming transparency 3/27/09 Distributed File Systems CS 537 Lecture 15 Distributed File Systems Michael Swift Goal: view a distributed system as a file system Storage is distributed Web tries to make world a collection of hyperlinked

More information

Introduction to the Network File System (NFS)

Introduction to the Network File System (NFS) Introduction to the Network File System (NFS) What was life like before NFS? Introduction to the Network File System (NFS) NFS is built on top of: UDP - User Datagram Protocol (unreliable delivery) XDR

More information

V. File System. SGG9: chapter 11. Files, directories, sharing FS layers, partitions, allocations, free space. TDIU11: Operating Systems

V. File System. SGG9: chapter 11. Files, directories, sharing FS layers, partitions, allocations, free space. TDIU11: Operating Systems V. File System SGG9: chapter 11 Files, directories, sharing FS layers, partitions, allocations, free space TDIU11: Operating Systems Ahmed Rezine, Linköping University Copyright Notice: The lecture notes

More information

Samba in a cross protocol environment

Samba in a cross protocol environment Mathias Dietz IBM Research and Development, Mainz Samba in a cross protocol environment aka SMB semantics vs NFS semantics Introduction Mathias Dietz (IBM) IBM Research and Development in Mainz, Germany

More information

Chapter 11: File-System Interface

Chapter 11: File-System Interface Chapter 11: File-System Interface Chapter 11: File-System Interface File Concept Access Methods Disk and Directory Structure File-System Mounting File Sharing Protection Objectives To explain the function

More information

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 18: Naming, Directories, and File Caching

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 18: Naming, Directories, and File Caching CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring 2004 Lecture 18: Naming, Directories, and File Caching 18.0 Main Points How do users name files? What is a name? Lookup:

More information

Chapter 17: Distributed Systems (DS)

Chapter 17: Distributed Systems (DS) Chapter 17: Distributed Systems (DS) Silberschatz, Galvin and Gagne 2013 Chapter 17: Distributed Systems Advantages of Distributed Systems Types of Network-Based Operating Systems Network Structure Communication

More information

Middleware. Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004

Middleware. Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004 Middleware Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004 Outline Web Services Goals Where do they come from? Understanding middleware Middleware as infrastructure Communication

More information

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 18: Naming, Directories, and File Caching

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 18: Naming, Directories, and File Caching CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring 2002 Lecture 18: Naming, Directories, and File Caching 18.0 Main Points How do users name files? What is a name? Lookup:

More information

Network File System (NFS)

Network File System (NFS) Network File System (NFS) Brad Karp UCL Computer Science CS GZ03 / M030 14 th October 2015 NFS Is Relevant Original paper from 1985 Very successful, still widely used today Early result; much subsequent

More information

DATA STRUCTURES USING C

DATA STRUCTURES USING C DATA STRUCTURES USING C File Management Chapter 9 2 File Concept Contiguous logical address space Types: Data numeric character binary Program 3 File Attributes Name the only information kept in human-readable

More information

Network File System (NFS)

Network File System (NFS) Network File System (NFS) Brad Karp UCL Computer Science CS GZ03 / M030 19 th October, 2009 NFS Is Relevant Original paper from 1985 Very successful, still widely used today Early result; much subsequent

More information

Today: Distributed File Systems. Naming and Transparency

Today: Distributed File Systems. Naming and Transparency Last Class: Distributed Systems and RPCs Today: Distributed File Systems Servers export procedures for some set of clients to call To use the server, the client does a procedure call OS manages the communication

More information

MODELS OF DISTRIBUTED SYSTEMS

MODELS OF DISTRIBUTED SYSTEMS Distributed Systems Fö 2/3-1 Distributed Systems Fö 2/3-2 MODELS OF DISTRIBUTED SYSTEMS Basic Elements 1. Architectural Models 2. Interaction Models Resources in a distributed system are shared between

More information

SMD149 - Operating Systems - File systems

SMD149 - Operating Systems - File systems SMD149 - Operating Systems - File systems Roland Parviainen November 21, 2005 1 / 59 Outline Overview Files, directories Data integrity Transaction based file systems 2 / 59 Files Overview Named collection

More information

Lecture 19. NFS: Big Picture. File Lookup. File Positioning. Stateful Approach. Version 4. NFS March 4, 2005

Lecture 19. NFS: Big Picture. File Lookup. File Positioning. Stateful Approach. Version 4. NFS March 4, 2005 NFS: Big Picture Lecture 19 NFS March 4, 2005 File Lookup File Positioning client request root handle handle Hr lookup a in Hr handle Ha lookup b in Ha handle Hb lookup c in Hb handle Hc server time Server

More information

Network File System (NFS) Hard State Revisited: Network Filesystems. File Handles. NFS Vnodes. NFS as a Stateless Service

Network File System (NFS) Hard State Revisited: Network Filesystems. File Handles. NFS Vnodes. NFS as a Stateless Service Network File System () Hard State Revisited: Network Filesystems client user programs Jeff Chase CPS 212, Fall 2000 client RPC over UDP or TCP Vnodes File Handles The protocol has an operation type for

More information

Today: Distributed File Systems!

Today: Distributed File Systems! Last Class: Distributed Systems and RPCs! Servers export procedures for some set of clients to call To use the server, the client does a procedure call OS manages the communication Lecture 25, page 1 Today:

More information

Chapter 11: File-System Interface. Operating System Concepts 9 th Edition

Chapter 11: File-System Interface. Operating System Concepts 9 th Edition Chapter 11: File-System Interface Silberschatz, Galvin and Gagne 2013 Chapter 11: File-System Interface File Concept Access Methods Disk and Directory Structure File-System Mounting File Sharing Protection

More information

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture) EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture) Dept. of Computer Science & Engineering Chentao Wu wuct@cs.sjtu.edu.cn Download lectures ftp://public.sjtu.edu.cn User:

More information

Operating Systems 2010/2011

Operating Systems 2010/2011 Operating Systems 2010/2011 File Systems part 2 (ch11, ch17) Shudong Chen 1 Recap Tasks, requirements for filesystems Two views: User view File type / attribute / access modes Directory structure OS designers

More information

CS454/654 Midterm Exam Fall 2004

CS454/654 Midterm Exam Fall 2004 CS454/654 Midterm Exam Fall 2004 (3 November 2004) Question 1: Distributed System Models (18 pts) (a) [4 pts] Explain two benefits of middleware to distributed system programmers, providing an example

More information

Chapter 12 Distributed File Systems. Copyright 2015 Prof. Amr El-Kadi

Chapter 12 Distributed File Systems. Copyright 2015 Prof. Amr El-Kadi Chapter 12 Distributed File Systems Copyright 2015 Prof. Amr El-Kadi Outline Introduction File Service Architecture Sun Network File System Recent Advances Copyright 2015 Prof. Amr El-Kadi 2 Introduction

More information

SMB. / / 80-. /,,,, /scalability/ mainframe. / . ",,!. # $ " fail sharing,,. % ,,. " 90-, 12, /.! database.! /DBMS/.

SMB. / / 80-. /,,,, /scalability/ mainframe. / . ,,!. # $  fail sharing,,. % ,,.  90-, 12, /.! database.! /DBMS/. / 1980 / 80- / /scalability/ mainframe /! "! # $ " fail sharing %! " 90-!! 12! /! database! /DBMS/ /!! RPC SQL "!/file sharing/!-!- "!! - / SMB SMB Server Message Block!! named pipes /& ! / mailslots /

More information

Today: Distributed File Systems

Today: Distributed File Systems Last Class: Distributed Systems and RPCs Servers export procedures for some set of clients to call To use the server, the client does a procedure call OS manages the communication Lecture 22, page 1 Today:

More information

Distributed File System

Distributed File System Distributed File System Project Report Surabhi Ghaisas (07305005) Rakhi Agrawal (07305024) Election Reddy (07305054) Mugdha Bapat (07305916) Mahendra Chavan(08305043) Mathew Kuriakose (08305062) 1 Introduction

More information

NFS Version 4.1. Spencer Shepler, Storspeed Mike Eisler, NetApp Dave Noveck, NetApp

NFS Version 4.1. Spencer Shepler, Storspeed Mike Eisler, NetApp Dave Noveck, NetApp NFS Version 4.1 Spencer Shepler, Storspeed Mike Eisler, NetApp Dave Noveck, NetApp Contents Comparison of NFSv3 and NFSv4.0 NFSv4.1 Fixes and Improvements ACLs Delegation Management Opens Asynchronous

More information

Remote Procedure Call (RPC) and Transparency

Remote Procedure Call (RPC) and Transparency Remote Procedure Call (RPC) and Transparency Brad Karp UCL Computer Science CS GZ03 / M030 10 th October 2014 Transparency in Distributed Systems Programmers accustomed to writing code for a single box

More information

DFS Case Studies, Part 2. The Andrew File System (from CMU)

DFS Case Studies, Part 2. The Andrew File System (from CMU) DFS Case Studies, Part 2 The Andrew File System (from CMU) Case Study Andrew File System Designed to support information sharing on a large scale by minimizing client server communications Makes heavy

More information

Lecture 10 File Systems - Interface (chapter 10)

Lecture 10 File Systems - Interface (chapter 10) Bilkent University Department of Computer Engineering CS342 Operating Systems Lecture 10 File Systems - Interface (chapter 10) Dr. İbrahim Körpeoğlu http://www.cs.bilkent.edu.tr/~korpe 1 References The

More information

NFS Version 4 17/06/05. Thimo Langbehn

NFS Version 4 17/06/05. Thimo Langbehn NFS Version 4 17/06/05 Thimo Langbehn Operating System Services and Administration Seminar 2005 Hasso-Plattner-Institute for Software Systems Engineering thimo.langbehn@student.hpi.uni-potsdam.de Abstract

More information

Lecture 7: Distributed File Systems

Lecture 7: Distributed File Systems 06-06798 Distributed Systems Lecture 7: Distributed File Systems 5 February, 2002 1 Overview Requirements for distributed file systems transparency, performance, fault-tolerance,... Design issues possible

More information

File Systems: Interface and Implementation

File Systems: Interface and Implementation File Systems: Interface and Implementation CSCI 315 Operating Systems Design Department of Computer Science File System Topics File Concept Access Methods Directory Structure File System Mounting File

More information

File Systems: Interface and Implementation

File Systems: Interface and Implementation File Systems: Interface and Implementation CSCI 315 Operating Systems Design Department of Computer Science Notice: The slides for this lecture have been largely based on those from an earlier edition

More information

Distributed File Systems II

Distributed File Systems II Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation

More information

Modulo V Sistema de Arquivos

Modulo V Sistema de Arquivos April 05 Prof. Ismael H. F. Santos - ismael@tecgraf.puc-rio.br 1 Modulo V Sistema de Arquivos Prof. Ismael H F Santos Ementa File-System Interface File Concept Directory Structure File Sharing Protection

More information

Operating Systems. Operating Systems Professor Sina Meraji U of T

Operating Systems. Operating Systems Professor Sina Meraji U of T Operating Systems Operating Systems Professor Sina Meraji U of T How are file systems implemented? File system implementation Files and directories live on secondary storage Anything outside of primary

More information

File systems: management 1

File systems: management 1 File systems: management 1 Disk quotas for users Quotas for keeping track of each user s disk use Soft limit and hard limit 2 Backup 3 File System Backup Replacing hardware is easy, but not the data Backups

More information

An NFS Replication Hierarchy

An NFS Replication Hierarchy W3C Push Technologies Workshop An NFS Replication Hierarchy Slide 1 of 18 An NFS Replication Hierarchy Sun Microsystems, Inc W3C Push Technologies Workshop An NFS Replication Hierarchy Slide 2 of 18 The

More information