Today: Distributed File Systems!

Similar documents
Today: Distributed File Systems. Naming and Transparency

Today: Distributed File Systems

Distributed File Systems. CS 537 Lecture 15. Distributed File Systems. Transfer Model. Naming transparency 3/27/09

Distributed Systems - III

Background. 20: Distributed File Systems. DFS Structure. Naming and Transparency. Naming Structures. Naming Schemes Three Main Approaches

DISTRIBUTED FILE SYSTEMS & NFS

NFS: Naming indirection, abstraction. Abstraction, abstraction, abstraction! Network File Systems: Naming, cache control, consistency

Lecture 7: Distributed File Systems

Distributed File Systems

Chapter 17: Distributed-File Systems. Operating System Concepts 8 th Edition,

Advanced Operating Systems

Module 17: Distributed-File Systems

Chapter 11: Implementing File-Systems

Chapter 11: File System Implementation

Chapter 11: File System Implementation

Chapter 11: Implementing File Systems

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

Module 17: Distributed-File Systems

Lecture 14: Distributed File Systems. Contents. Basic File Service Architecture. CDK: Chapter 8 TVS: Chapter 11

416 Distributed Systems. Distributed File Systems 2 Jan 20, 2016

Chapter 17: Distributed Systems (DS)

Distributed File Systems. Directory Hierarchy. Transfer Model

Chapter 17: Distributed-File Systems. Operating System Concepts 8 th Edition,

Chapter 11: Implementing File Systems

ELEC 377 Operating Systems. Week 9 Class 3

Lecture 16/17: Distributed Shared Memory. CSC 469H1F Fall 2006 Angela Demke Brown

Chapter 12: File System Implementation

OPERATING SYSTEM. Chapter 12: File System Implementation

File-System Structure

Chapter 12: File System Implementation. Operating System Concepts 9 th Edition

Chapter 18 Distributed Systems and Web Services

Distributed Systems. Lec 9: Distributed File Systems NFS, AFS. Slide acks: Dave Andersen

Chapter 12: File System Implementation

Chapter 11: Implementing File Systems

Chapter 12 File-System Implementation

CS307: Operating Systems

Chapter 10: File System Implementation

OPERATING SYSTEMS II DPL. ING. CIPRIAN PUNGILĂ, PHD.

Network File Systems

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9b: Distributed File Systems INTRODUCTION. Transparency: Flexibility: Slide 1. Slide 3.

CS 425 / ECE 428 Distributed Systems Fall Indranil Gupta (Indy) Nov 28, 2017 Lecture 25: Distributed File Systems All slides IG

Bhaavyaa Kapoor Person #

Distributed File Systems. Distributed Systems IT332

Chapter 11: Implementing File

Chapter 11: Implementing File Systems. Operating System Concepts 9 9h Edition

Lecture 19. NFS: Big Picture. File Lookup. File Positioning. Stateful Approach. Version 4. NFS March 4, 2005

Distributed File Systems. Case Studies: Sprite Coda

CSE 486/586: Distributed Systems

Week 12: File System Implementation

Today: Distributed File Systems. File System Basics

Today: Distributed File Systems

Distributed Systems. Distributed File Systems. Paul Krzyzanowski

Distributed File Systems. CS432: Distributed Systems Spring 2017

Lecture 15: Network File Systems

Operating Systems Design 16. Networking: Remote File Systems

Operating Systems 2010/2011

Cloud Computing CS

Service and Cloud Computing Lecture 10: DFS2 Prof. George Baciu PQ838

Distributed File Systems I

AN OVERVIEW OF DISTRIBUTED FILE SYSTEM Aditi Khazanchi, Akshay Kanwar, Lovenish Saluja

Da-Wei Chang CSIE.NCKU. Professor Hao-Ren Ke, National Chiao Tung University Professor Hsung-Pin Chang, National Chung Hsing University

Distributed File Systems Part II. Distributed File System Implementation

C 1. Recap. CSE 486/586 Distributed Systems Distributed File Systems. Traditional Distributed File Systems. Local File Systems.

Distributed file systems

Filesystems Lecture 13

Filesystems Lecture 11

CA464 Distributed Programming

Operating Systems CMPSC 473 File System Implementation April 10, Lecture 21 Instructor: Trent Jaeger

Today: Coda, xfs! Brief overview of other file systems. Distributed File System Requirements!

3/4/14. Review of Last Lecture Distributed Systems. Topic 2: File Access Consistency. Today's Lecture. Session Semantics in AFS v2

Computer Organization. Chapter 16

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Today: Protection! Protection!

Distributed Operating Systems Spring Prashant Shenoy UMass Computer Science.

Distributed Systems. Thoai Nam Faculty of Computer Science and Engineering HCMC University of Technology

Chapter 20: Database System Architectures

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1

Introduction. Chapter 8: Distributed File Systems

06-Dec-17. Credits:4. Notes by Pritee Parwekar,ANITS 06-Dec-17 1

Distributed Systems. Overview. Distributed Systems September A distributed system is a piece of software that ensures that:

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review

Cloud Computing CS

Frequently asked questions from the previous class survey

Lecture 23 Database System Architectures

Organisasi Sistem Komputer

CSCI 4717 Computer Architecture

Network File System (NFS)

Chapter 18: Database System Architectures.! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems!

Distributed Systems. Hajussüsteemid MTAT Distributed File Systems. (slides: adopted from Meelis Roos DS12 course) 1/25

416 Distributed Systems. Distributed File Systems 1: NFS Sep 18, 2018

EECS 482 Introduction to Operating Systems

Announcements. Reading: Chapter 16 Project #5 Due on Friday at 6:00 PM. CMSC 412 S10 (lect 24) copyright Jeffrey K.

CS4513 Distributed Computer Systems

Introduction. Distributed file system. File system modules. UNIX file system operations. File attribute record structure

File-System Interface

Distributed Operating Systems Fall Prashant Shenoy UMass Computer Science. CS677: Distributed OS

Distributed Systems Principles and Paradigms. Chapter 01: Introduction

CSE380 - Operating Systems

Chapter 12 Distributed File Systems. Copyright 2015 Prof. Amr El-Kadi

Distributed Systems Principles and Paradigms. Chapter 01: Introduction. Contents. Distributed System: Definition.

Transcription:

Last Class: Distributed Systems and RPCs! Servers export procedures for some set of clients to call To use the server, the client does a procedure call OS manages the communication Lecture 25, page 1 Today: Distributed File Systems! One of the most common uses of distributed systems Basic idea: Given a set of disks attached to different nodes. share disks between nodes as if all the disks were attached to every node. Examples: Edlab: One server node with all the disks, and a bunch of diskless workstations on a LAN. AppleShare: Every node is both a server with a disk and a client. Lecture 25, page 2

Distributed File Systems: Issues! Naming and Transparency Remote file access Caching Server with state or without Replication Lecture 25, page 3 Naming and Transparency! Issues How are files named? Do file names reveal their location? Do file names change if the file moves? Do file names change if the user moves? Location transparency: the name of the file does not reveal the physical storage location. Location independence: The name of the file need not change if the file's storage location changes. Most naming schemes used in practice do not have location independence, but many have location transparency. Lecture 25, page 4

Naming Strategies: Absolute Names! Absolute names: <machine name: path name> Examples: AppleShare, Win NT Advantages: Finding a fully specified file name is simple. It is easy to add and delete new names. No global state. Scales easily. Disadvantages: User must know the complete name and is aware of which files are local and which are remote. File is location dependent, and thus cannot move. Makes sharing harder. Not fault tolerant. Lecture 25, page 5 Naming Strategies: Mount Points! Mount Points (NFS - Sun's Network File System) Each host has a set of local names for remote locations. Each host has a mount table (/etc/fstab) that specifies <remote path name @ machine name> and a <local path name>. At boot time, the local name is bound to the remote name. Users then refer to the <local path name> as if it were local, and the NFS takes care of the mapping Advantages: location transparent, remote name can change across reboots Disadvantages: single unified strategy hard to maintain, same file can have different names Lecture 25, page 6

NFS: Example! Partial contents of /etc/fstab for Edlab machines: /usr1/mail@elux3.cs.umass.edu:/var/spool/mail /users/users1@elsrv1:/users/users1 /users/users2@elsrv1:/users/users2 /users/users3@elsrv2:/users/users3 /users/users4@elsrv2:/users/users4 /courses/cs300@elsrv3:/courses/cs300 /rcf/mipsel/4.2/share@elsrv1:/exp/rcf/share /rcf/common@elsrv1:/exp/rcf/common Lecture 25, page 7 NFS: Example! Lecture 25, page 8

Naming Strategies: Global Name Space! Single name space: CMU's Andrew and Berkeley's Sprite No matter which node you are on, the file names are the same. Set of workstation clients, and a set of dedicated file server machines. When a client starts up, it gets its file name structure from a server. As users access files, the server sends copies to the workstation and the workstation caches the files Lecture 25, page 9 Advantages: Global Name Space! Naming is consistent and easy to keep consistent. The global name space insures all the files are the same regardless of where you login. Since names are bound late, moving them is easier. Disadvantages: It is difficult for the OS to keep file contents consistent due to caching. Global name space may limit flexibility. Performance problems. Lecture 25, page 10

Remote File Access and Caching! Once the user specifies a remote file, the OS can do the access either 1. remotely, on the server machine and then return the results using RPC (called remote service), or 2. can transfer the file (or part of the file) to the requesting host, and perform local accesses (called caching) Caching Issues: Where and when are file blocks cached? When are modifications propagated back to the remote file? What happens if multiple clients cache the same file? Lecture 25, page 11 Remote File Access and Caching! Location Local disk Advantages: Access time reduced. Safer if node fails. Disadvantages: Difficult to keep local copy consistent with remote copy. Slower than just keeping it in local memory. Requires client to have a disk. Lecture 25, page 12

Remote File Access and Caching! Location Local memory Advantages: Quick access time. Disadvantages: Difficult to keep local copy consistent with remote copy. Does not tolerate node failure well. Limited cache size. Works with diskless workstations. Lecture 25, page 13 Cache Update Policies! When to write local changes to the server has a central role in determining distributed file system performance. Write through: yields the most reliable results since every write hits the remote disk before the process continues, but it has the poorest performance. Caching with write through is equivalent to using remote service for all writes, and exploits caching only for reads. Write back: yields the quickest response time since the write need only hit cache before the process continues. It reduces network traffic and the number of writes to the disk for repeated writes to the same disk block, since only one of the writes will go across the network. If a user machine crashes, the unwritten data is lost. Write-back when file is closed, a block is evicted from cache, or every 30sec. Lecture 25, page 14

Cache Consistency! Client-initiated consistency: Client contacts the server and asks if its copy is consistent with the server's copy. Can check every access. Can check at a given interval. Can check only upon opening a file. Server-initiated consistency: Server detects potential conflicts and invalidates caches Server needs to know: which clients have cached which parts of which files. which clients are readers and which are writers. Lecture 25, page 15 Server State and Replication! Stateful versus stateless server Web analogy Tradeoff between performance and tolerence to crash faults Replication Server data is replicated across machines Need to ensure consistency of files when a file is updated on one server Lecture 25, page 16

Case Study: Sun's Network File System! NFS is the standard for distributed UNIX file access. NFS is designed to run on LANs (but works on WANs) Nodes are both servers and clients. Servers have no state (NFS v3 only; NFS v4 is stateful) Uses a mount protocol to make a global name local 1. /etc/exports lists the local names the server is willing to export. 2. /etc/fstab lists the global names that the local nodes import. A corresponding global name must be in /etc/exports on the server. Lecture 25, page 17 NFS Implementation! NFS defines a set of RPC operations for remote file access: 1. directory search, reading directory entries 2. manipulating links and directories 3. accessing file attributes 4. reading/writing files Does not rely on node homogeneity - heterogeneous nodes must simply support the NFS mount and remote access protocols using RPC. Users may need to know different names depending upon the node to which they logon. Lecture 25, page 18

NFS Implementation! Lecture 25, page 19 NFS Implementation! NFS defines new layers in the Unix file system The virtual file system provides a standard interface, using vnodes as file handles. A vnode describes either a local file or a remote file. The ``buffer cache'' caches remote file blocks and attributes. On an open, the client asks the server whether its cached blocks are up to date. Once a file is open, different clients can write to it and get inconsistent data. Modified data is flushed back to the server every 30s. What file contents do new clients see? Effects of last flush. Writers might have made changes but not updated remote file yet. What file contents do existing clients see? For cached blocks, they see out of date info. For new blocks, same as new client Lecture 25, page 20

Naming Summary! Desire name independence, but it is difficult to attain Location dependent names are most prevalent Speed up remote file access with caching Need to write changes back to disk Lecture 25, page 21