Distributed File Systems

Similar documents
Distributed File Systems

Introduction. Distributed file system. File system modules. UNIX file system operations. File attribute record structure

Chapter 8 Distributed File Systems

Chapter 8: Distributed File Systems. Introduction File Service Architecture Sun Network File System The Andrew File System Recent advances Summary

Introduction. Chapter 8: Distributed File Systems

DFS Case Studies, Part 1

Distributed File Systems. File Systems

Chapter 12 Distributed File Systems. Copyright 2015 Prof. Amr El-Kadi

UNIT V Distributed File System

Lecture 7: Distributed File Systems

Module 7 File Systems & Replication CS755! 7-1!

DFS Case Studies, Part 2. The Andrew File System (from CMU)

Chapter 17: Distributed-File Systems. Operating System Concepts 8 th Edition,

Lecture 14: Distributed File Systems. Contents. Basic File Service Architecture. CDK: Chapter 8 TVS: Chapter 11

Ch. 7 Distributed File Systems

Distributed Systems Distributed Objects and Distributed File Systems

Distributed File Systems. CS432: Distributed Systems Spring 2017

Course Objectives. Course Outline. Course Outline. Distributed Systems High-Performance Networks Clusters and Computational Grids

Distributed File Systems. Distributed Computing Systems. Outline. Concepts of Distributed File System. Concurrent Updates. Transparency 2/12/2016

Distributed File Systems I

SOFTWARE ARCHITECTURE 11. DISTRIBUTED FILE SYSTEM.

Distributed Systems. Hajussüsteemid MTAT Distributed File Systems. (slides: adopted from Meelis Roos DS12 course) 1/25

CS 425 / ECE 428 Distributed Systems Fall Indranil Gupta (Indy) Nov 28, 2017 Lecture 25: Distributed File Systems All slides IG

NFS: Naming indirection, abstraction. Abstraction, abstraction, abstraction! Network File Systems: Naming, cache control, consistency


DISTRIBUTED SYSTEMS [COMP9243] Lecture 9b: Distributed File Systems INTRODUCTION. Transparency: Flexibility: Slide 1. Slide 3.

DISTRIBUTED FILE SYSTEMS & NFS

Distributed file systems

UNIT III. cache/replicas maintenance Main memory No No No 1 RAM File system n No Yes No 1 UNIX file system Distributed file

Network File Systems

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review

COS 318: Operating Systems. Journaling, NFS and WAFL

Distributed File Systems. Case Studies: Sprite Coda

Distributed File Systems

Filesystems Lecture 11

Distributed File Systems. CS 537 Lecture 15. Distributed File Systems. Transfer Model. Naming transparency 3/27/09

Operating Systems Design 16. Networking: Remote File Systems

Today s Objec2ves. AWS/MR Review Final Projects Distributed File Systems. Nov 3, 2017 Sprenkle - CSCI325

Lecture 19. NFS: Big Picture. File Lookup. File Positioning. Stateful Approach. Version 4. NFS March 4, 2005

Filesystems Lecture 13

Introduction to Distributed Computing

Chapter 2 System Models

Today: Distributed File Systems. File System Basics

Today: Distributed File Systems

CSE 486/586: Distributed Systems

C 1. Recap. CSE 486/586 Distributed Systems Distributed File Systems. Traditional Distributed File Systems. Local File Systems.

Background. 20: Distributed File Systems. DFS Structure. Naming and Transparency. Naming Structures. Naming Schemes Three Main Approaches

NFS Design Goals. Network File System - NFS

File Management. Information Structure 11/5/2013. Why Programmers Need Files

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

Chapter 7: File-System

CS454/654 Midterm Exam Fall 2004

File-System Structure

Today: Coda, xfs! Brief overview of other file systems. Distributed File System Requirements!

Chapter 18 Distributed Systems and Web Services

Today: Distributed File Systems

Distributed File Systems

Chapter 11: File System Implementation

Today: Distributed File Systems. Naming and Transparency

Chapter 11: File System Implementation

Network File System (NFS)

Network File System (NFS)

Chapter 17: Distributed-File Systems. Operating System Concepts 8 th Edition,

Module 17: Distributed-File Systems

Today: Distributed File Systems!

Chapter 11: Implementing File Systems

Chapter 11: Implementing File-Systems

Distributed Systems. Distributed File Systems. Paul Krzyzanowski

PEER TO PEER SERVICES AND FILE SYSTEM

File-System Interface

UNIX File System. UNIX File System. The UNIX file system has a hierarchical tree structure with the top in root.

System that permanently stores data Usually layered on top of a lower-level physical storage medium Divided into logical units called files

CS 470 Spring Distributed Web and File Systems. Mike Lam, Professor. Content taken from the following:

Chapter 11: File-System Interface

Chapter 11 DISTRIBUTED FILE SYSTEMS

CS 470 Spring Distributed Web and File Systems. Mike Lam, Professor. Content taken from the following:

What is a file system

Distributed Objects and Remote Invocation. Programming Models for Distributed Applications

Advanced Operating Systems

SMD149 - Operating Systems - File systems

Distributed File Systems. Jonathan Walpole CSE515 Distributed Computing Systems

Dr. Robert N. M. Watson

416 Distributed Systems. Distributed File Systems 1: NFS Sep 18, 2018

Module 17: Distributed-File Systems

Distributed File Systems. Distributed Systems IT332

INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous) Dundigal, Hyderabad

Distributed Systems 8L for Part IB. Additional Material (Case Studies) Dr. Steven Hand

Last Class: Memory management. Per-process Replacement

Operating Systems 2010/2011

Chapter 1: Distributed Systems: What is a distributed system? Fall 2013

File Systems. What do we need to know?

Chapter 11: Implementing File Systems

Current Topics in OS Research. So, what s hot?

System Models. 2.1 Introduction 2.2 Architectural Models 2.3 Fundamental Models. Nicola Dragoni Embedded Systems Engineering DTU Informatics

Operating Systems Design Exam 3 Review: Spring Paul Krzyzanowski

Chapter 12 File-System Implementation

Chapter 11: File-System Interface

Today CSCI Coda. Naming: Volumes. Coda GFS PAST. Instructor: Abhishek Chandra. Main Goals: Volume is a subtree in the naming space

3/4/14. Review of Last Lecture Distributed Systems. Topic 2: File Access Consistency. Today's Lecture. Session Semantics in AFS v2

Introduction to the Network File System (NFS)

Transcription:

Distributed File Systems Distributed Systems Introduction File service architecture Sun Network File System (NFS) Andrew File System (AFS) Recent advances Summary Learning objectives Understand the requirements that affect the design of distributed services NFS: understand how a relatively simple, widely-used service is designed o Obtain a knowledge of file systems, both local and networked o Caching as an essential design technique o Remote interfaces are not the same as APIs o Security requires special consideration Recent advances: appreciate the ongoing research that often leads to major advances Distributed Systems DoCS 2002 2

Revision: Failure model Class of failure Affects Description Fail-stop Process Process halts and remains halted. Other processes may detect this state. Crash Process Process halts and remains halted. Other processes may not be able to detect this state. Omission Channel A message inserted in an outgoing message buffer never arrives at the other end s incoming message buffer. Send-omission Process A process completes a send but the message is not put in its outgoing message buffer. Receive-omissionProcess A message is put in a process s incoming message buffer, but that process does not receive it. Arbitrary (Byzantine) Process Process/channel exhibits arbitrary behaviour: it may or channel send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step. Distributed Systems DoCS 2002 3 Storage systems and their properties In first generation of distributed systems (1974-95), file systems (e.g. NFS) were the only networked storage systems. With the advent of distributed object systems (CORBA, Java) and the web, the picture has become more complex. Distributed Systems DoCS 2002 4

Storage systems and their properties Sharing Persistence Types of consistency between copies: 1 - strict one-copy consistency - approximate consistency X - no automatic consistency Distributed Consistency Example cache/replicas maintenance Main memory File system Distributed file system Web Distributed shared memory 1 1 RAM UNIX file system Sun NFS Web server Ivy Remote objects (RMI/ORB) 1 CORBA Persistent object store 1 CORBA Persistent Object Service Persistent distributed object store PerDiS, Khazana Distributed Systems DoCS 2002 5 What is a file system? Persistent stored data sets Hierarchic name space visible to all processes API with the following characteristics: o access and update operations on persistently stored data sets o Sequential access model (with additional random facilities) Sharing of data between users, with access control Concurrent access: o certainly for read-only access o what about updates? Other features: o mountable file stores o more?... Distributed Systems DoCS 2002 6

What is a file system? UNIX file system operations filedes = open(name, mode) filedes = creat(name, mode) status = close(filedes) count = read(filedes, buffer, n) count = write(filedes, buffer, n) pos = lseek(filedes, offset, whence) status = unlink(name) status = link(name1, name2) status = stat(name, buffer) Opens an existing file with the given name. Creates a new file with the given name. Both operations deliver a file descriptor referencing the open file. The mode is read, write or both. Closes the open file filedes. Transfers n bytes from the file referenced by filedes to buffer. Transfers n bytes to the file referenced by filedes from buffer. Both operations deliver the number of bytes actually transferred and advance the read-write pointer. Moves the read-write pointer to offset (relative or absolute, depending on whence). Removes the file name from the directory structure. If the file has no other names, it is deleted. Adds a new name (name2) for a file (name1). Gets the file attributes for file name into buffer. Distributed Systems DoCS 2002 7 What is a file system? File system modules Directory module: File module: Access control module: File access module: Block module: Device module: relates file names to file IDs relates file IDs to particular files checks permission for operation requested reads or writes file data or attributes accesses and allocates disk blocks disk I/O and buffering Distributed Systems DoCS 2002 8

What is a file system? File attribute record structure updated by system: updated by owner: File length Creation timestamp Read timestamp Write timestamp Attribute timestamp Reference count Owner File type Access control list E.g. for UNIX: rw-rw-r-- Distributed Systems DoCS 2002 9 File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Distributed Systems DoCS 2002 10

File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Transparencies Access: Same operations Location: Same name space after relocation of files or processes Mobility: Automatic relocation of files is possible Performance: Satisfactory performance across a specified range of system loads Scaling: Service can be expanded to meet additional loads Distributed Systems DoCS 2002 11 File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Concurrency properties Isolation File-level or record-level locking Other forms of concurrency control to minimise contention Distributed Systems DoCS 2002 12

File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Replication properties File service maintains multiple identical copies of files Load-sharing between servers makes service more scalable Local access has better response (lower latency) Fault tolerance Full replication is difficult to implement. Caching (of all or part of a file) gives most of the benefits (except fault tolerance) Distributed Systems DoCS 2002 13 File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Heterogeneity properties Service can be accessed by clients running on (almost) any OS or hardware platform. Design must be compatible with the file systems of different OSes Service interfaces must be open -precise specifications of APIs are published. Distributed Systems DoCS 2002 14

File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Fault tolerance Service must continue to operate even when clients make errors or crash. at-most-once semantics at-least-once semantics requires idempotent operations Service must resume after a server machine crashes. If the service is replicated, it can continue to operate even during a server crash. Distributed Systems DoCS 2002 15 File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Consistency Unix offers one-copy update semantics for operations on local files - caching is completely transparent. Difficult to achieve the same for distributed file systems while maintaining good performance and scalability. Distributed Systems DoCS 2002 16

File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Security Must maintain access control and privacy as for local files. based on identity of user making request identities of remote users must be authenticated privacy requires secure communication Service interfaces are open to all processes not excluded by a firewall. vulnerable to impersonation and other attacks Distributed Systems DoCS 2002 17 File service requirements Transparency Concurrency Replication Heterogeneity Fault tolerance Consistency Security Efficiency.. Efficiency Goal for distributed file systems is usually performance comparable to local file system. Distributed Systems DoCS 2002 18 *

Model file service architecture Application program Client computer Application program Lookup AddName UnName GetNames Server computer Directory service Client module Flat file service Read Write Create Delete GetAttributes SetAttributes Distributed Systems DoCS 2002 19 Server operations for the model file service Flat file service Read(FileId, i, n) -> Data position of first byte Write(FileId, i, Data) position of first byte Create() -> FileId Delete(FileId) GetAttributes(FileId) -> Attr SetAttributes(FileId, Attr) Directory service Lookup(Dir, Name) -> FileId AddName(Dir, Name, FileId) UnName(Dir, Name) GetNames(Dir, Pattern) -> NameSeq FileId A unique identifier for files anywhere in the network. Pathname lookup Pathnames such as '/usr/bin/tar' are resolved by iterative calls to lookup(), one call for each component of the path, starting with the ID of the root directory '/' which is known in every client. Distributed Systems DoCS 2002 20

File Group A collection of files that can be located on any server or moved between servers while maintaining the same names. o Similar to a UNIX filesystem o Helps with distributing the load of file serving between several servers. o File groups have identifiers which are unique throughout the system (and hence for an open system, they must be globally unique). Used to refer to file groups and files To construct a globally unique ID we use some unique attribute of the machine on which it is created, e.g. IP number, even though the file group may move subsequently. File Group ID: 32 bits 16 bits IP address date Distributed Systems DoCS 2002 21 Case Study: Sun NFS An industry standard for file sharing on local networks since the 1980s An open standard with clear and simple interfaces Closely follows the abstract file service model defined above Supports many of the design requirements already mentioned: o transparency o heterogeneity o efficiency o fault tolerance Limited achievement of: o concurrency o replication o consistency o security Distributed Systems DoCS 2002 22

NFS architecture Client computer Client computer NFS Application program Client Server computer UNIX system calls Application program Application program Kernel Application program UNIX kernel Operations on local files Virtual file system UNIX file system Other file system NFS client Operations on remote files Virtual file system NFS server NFS Client UNIX file system NFS protocol (remote operations) Distributed Systems DoCS 2002 23 NFS architecture: does the implementation have to be in the system kernel? No: o there are examples of NFS clients and servers that run at application-level as libraries or processes (e.g. early Windows and MacOS implementations, current PocketPC, etc.) But, for a Unix implementation there are advantages: o Binary code compatible - no need to recompile applications Standard system calls that access remote files can be routed through the NFS client module by the kernel o Shared cache of recently-used blocks at client o Kernel-level server can access i-nodes and file blocks directly but a privileged (root) application program could do almost the same. o Security of the encryption key used for authentication. Distributed Systems DoCS 2002 24

NFS server operations (simplified) read(fh, offset, count) -> attr, data write(fh, offset, count, data) -> attr create(dirfh, name, attr) -> newfh, attr remove(dirfh, name) status getattr(fh) -> attr setattr(fh, attr) -> attr lookup(dirfh, name) -> fh, attr rename(dirfh, name, todirfh, toname) link(newdirfh, newname, dirfh, name) readdir(dirfh, cookie, count) -> entries symlink(newdirfh, newname, string) -> status readlink(fh) -> string mkdir(dirfh, name, attr) -> newfh, attr rmdir(dirfh, name) -> status statfs(fh) -> fsstats Model flat file service Read(FileId, i, n) -> Data Write(FileId, i, Data) Create() -> FileId Delete(FileId) GetAttributes(FileId) -> Attr SetAttributes(FileId, Attr) Model directory service Lookup(Dir, Name) -> FileId AddName(Dir, Name, File) UnName(Dir, Name) GetNames(Dir, Pattern) ->NameSeq Distributed Systems DoCS 2002 25 NFS access control and authentication Stateless server, so the user's identity and access rights must be checked by the server on each request. o In the local file system they are checked only on open() Every client request is accompanied by the userid and groupid o not shown in the because they are inserted by the RPC system Server is exposed to imposter attacks unless the userid and groupid are protected by encryption Kerberos has been integrated with NFS to provide a stronger and more comprehensive security solution Distributed Systems DoCS 2002 26

Mount service Mount operation: mount(remotehost, remotedirectory, localdirectory) Server maintains a table of clients who have mounted filesystems at that server Each client maintains a table of mounted file systems holding: < IP address, port number, file handle> Hard versus soft mounts Distributed Systems DoCS 2002 27 Local and remote file systems accessible on an NFS client Server 1 (root) Client Server 2 (root) (root) export... vmunix usr nfs people Remote mount students x staff Remote mount users big jon bob... jim ann jane joe Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2. Distributed Systems DoCS 2002 28

Automounter NFS client catches attempts to access 'empty' mount points and routes them to the Automounter o Automounter has a table of mount points and multiple candidate serves for each o it sends a probe message to each candidate server and then uses the mount service to mount the filesystem at the first server to respond Keeps the mount table small Provides a simple form of replication for read-only filesystems o E.g. if there are several servers with identical copies of /usr/lib then each server will have a chance of being mounted at some clients. Distributed Systems DoCS 2002 29 Kerberized NFS Kerberos protocol is too costly to apply on each file access request Kerberos is used in the mount service: o to authenticate the user's identity o User's UserID and GroupID are stored at the server with the client's IP address For each file request: o The UserID and GroupID sent must match those stored at the server o IP addresses must also match This approach has some problems o can't accommodate multiple users sharing the same client computer o all remote filestores must be mounted each time a user logs in Distributed Systems DoCS 2002 30

NFS optimization - server caching Similar to UNIX file caching for local files: o pages (blocks) from disk are held in a main memory buffer cache until the space is required for newer pages. Read-ahead and delayed-write optimizations. o For local files, writes are deferred to next sync event (30 second intervals) o Works well in local context, where files are always accessed through the local cache, but in the remote case it doesn't offer necessary synchronization guarantees to clients. NFS v3 servers offers two strategies for updating the disk: o write-through - altered pages are written to disk as soon as they are received at the server. When a write() RPC returns, the NFS client knows that the page is on the disk. o delayed commit - pages are held only in the cache until a commit() call is received for the relevant file. This is the default mode used by NFS v3 clients. A commit() is issued by the client whenever a file is closed. Distributed Systems DoCS 2002 31 NFS optimization - client caching Server caching does nothing to reduce RPC traffic between client and server o further optimization is essential to reduce server load in large networks o NFS client module caches the results of read, write, getattr, lookup and readdir operations o synchronization of file contents (one-copy semantics) is not guaranteed when two or more clients are sharing the same file. Timestamp-based validity check o reduces inconsistency, but doesn't eliminate it o validity condition for cache entries at the client: (T - Tc < t) v (Tm client = Tm server ) t freshness guarantee o t is configurable (per file) but is typically set to 3 seconds for files and 30 secs. Tc time when cache entry was last validated for directories Tm time when block was last o it remains difficult to write distributed updated at server applications that share files with NFS T current time Distributed Systems DoCS 2002 32

Other NFS optimizations Sun RPC runs over UDP by default (can use TCP if required) Uses UNIX BSD Fast File System with 8-kbyte blocks reads() and writes() can be of any size (negotiated between client and server) the guaranteed freshness interval t is set adaptively for individual files to reduce gettattr() calls needed to update Tm file attribute information (including Tm) is piggybacked in replies to all file requests Distributed Systems DoCS 2002 33 NFS performance Early measurements (1987) established that: o o write() operations are responsible for only 5% of server calls in typical UNIX environments hence write-through at server is acceptable lookup() accounts for 50% of operations -due to step-by-step pathname resolution necessitated by the naming and mounting semantics. More recent measurements (1993) show high performance: 1 x 450 MHz Pentium III: > 5000 server ops/sec, < 4 ms average latency 24 x 450 MHz IBM RS64: > 29,000 server ops/sec, < 4 ms average latency see www.spec.org for more recent measurements Provides a good solution for many environments including: o o large networks of UNIX and PC clients multiple web server installations sharing a single file store Distributed Systems DoCS 2002 34

NFS summary An excellent example of a simple, robust, high-performance distributed service. Achievement of transparencies: Access: Excellent; the API is the UNIX system call interface for both local and remote files. Location: Not guaranteed but normally achieved; naming of filesystems is controlled by client mount operations, but transparency can be ensured by an appropriate system configuration. Concurrency: Limited but adequate for most purposes; when read-write files are shared concurrently between clients, consistency is not perfect. Replication: Limited to read-only file systems; for writable files, the SUN Network Information Service (NIS) runs over NFS and is used to replicate essential system files, Distributed Systems DoCS 2002 35 NFS summary Achievement of transparencies (continued): Failure: Limited but effective; service is suspended if a server fails. Recovery from failures is aided by the simple stateless design. Mobility: Hardly achieved; relocation of files is not possible, relocation of filesystems is possible, but requires updates to client configurations. Performance: Good; multiprocessor servers achieve very high performance, but for a single filesystem it's not possible to go beyond the throughput of a multiprocessor server. Scaling: Good; filesystems (file groups) may be subdivided and allocated to separate servers. Ultimately, the performance limit is determined by the load on the server holding the most heavily-used filesystem (file group). Distributed Systems DoCS 2002 36

Distribution of processes in the Andrew File System Workstations User Venus program UNIX kernel Servers Vice UNIX kernel Venus User program UNIX kernel Network Vice Venus User program UNIX kernel UNIX kernel Distributed Systems DoCS 2002 37 File name space seen by clients of AFS Local / (root) Shared tmp bin... vmunix cmu bin Symbolic links Distributed Systems DoCS 2002 38

System call interception in AFS Workstation User program UNIX file system calls Non-local file operations Venus UNIX kernel UNIX file system Local disk Distributed Systems DoCS 2002 39 Implementation of filesystem calls in AFS User process UNIX kernel Venus Net Vice open(filename, mode) read(filedescriptor, Buffer, length) If FileNamerefers to a file in shared file space, pass the request to Check list of files in local cache. If not Venus. present or there is no validcallback promise, send a request for the file to the Vice server that is custodian of the volume containing the file. Open the local file and return the file descriptor to the application. Perform a normal UNIX read operation on the local copy. write(filedescriptor, Perform a normal Buffer, length) UNIX write operation on the local copy. Place the copy of the file in the local file system, enter its local name in the local cache list and return the local name to UNIX. Transfer a copy of the file and acallback promiseto the workstation. Log the callback promise. close(filedescriptor) Close the local copy and notify Venus that the file has been closed. If the local copy has been changed, send a copy to the Vice server Replace the file that is the custodian of contents and send a the file. callback to all other clients holdingcallback promiseson the file. Distributed Systems DoCS 2002 40

The main components of the Vice service interface Fetch(fid) -> attr, data Returns the attributes (status) and, optionally, the contents of file identified by the fid and records a callback promise on it. Store(fid, attr, data) Create() -> fid Remove(fid) SetLock(fid, mode) ReleaseLock(fid) RemoveCallback(fid) BreakCallback(fid) Updates the attributes and (optionally) the contents of a specified file. Creates a new file and records a callback promise on it. Deletes the specified file. Sets a lock on the specified file or directory. The mode of the lock may be shared or exclusive. Locks that are not removed expire after 30 minutes. Unlocks the specified file or directory. Informs server that a Venus process has flushed a file from its cache. This call is made by a Vice server to a Venus process. It cancels the callback promise on the relevant file. Distributed Systems DoCS 2002 41 Recent advances in file services NFS enhancements WebNFS - NFS server implements a web-like service on a well-known port. Requests use a 'public file handle' and a pathname-capable variant of lookup(). Enables applications to access NFS servers directly, e.g. to read a portion of a large file. One-copy update semantics (Spritely NFS, NQNFS) - Include an open() operation and maintain tables of open files at servers, which are used to prevent multiple writers and to generate callbacks to clients notifying them of updates. Performance was improved by reduction in gettattr() traffic. Improvements in disk storage organisation RAID - improves performance and reliability by striping data redundantly across several disk drives Log-structured file storage - updated pages are stored contiguously in memory and committed to disk in large contiguous blocks (~ 1 Mbyte). File maps are modified whenever an update occurs. Garbage collection to recover disk space. Distributed Systems DoCS 2002 42

New design approaches Distribute file data across several servers o Exploits high-speed networks (ATM, Gigabit Ethernet) o Layered approach, lowest level is like a 'distributed virtual disk' o Achieves scalability even for a single heavily-used file 'Serverless' architecture o Exploits processing and disk resources in all available network nodes o Service is distributed at the level of individual files Examples: xfs: Experimental implementation demonstrated a substantial performance gain over NFS and AFS Frangipani: Performance similar to local UNIX file access Tiger Video File System Peer-to-peer systems: Napster, OceanStore (UCB), Farsite (MSR), Publius (AT&T research) - see web for documentation on these very recent systems Distributed Systems DoCS 2002 43 New design approaches Replicated read-write files o High availability o Disconnected working re-integration after disconnection is a major problem if conflicting updates have occurred o Examples: Bayou system Coda system Distributed Systems DoCS 2002 44

Summary Sun NFS is an excellent example of a distributed service designed to meet many important design requirements Effective client caching can produce file service performance equal to or better than local file systems Consistency versus update semantics versus fault tolerance remains an issue Most client and server failures can be masked Superior scalability can be achieved with whole-file serving (Andrew FS) or the distributed virtual disk approach Future requirements: support for mobile users, disconnected operation, automatic reintegration support for data streaming and quality of service (Cf. Tiger file system,) Distributed Systems DoCS 2002 45