Today: Distributed File Systems


Last Class: Distributed Systems and RPCs
- Servers export procedures for some set of clients to call.
- To use the server, the client does a procedure call.
- The OS manages the communication.
Lecture 22, page 1

Today: Distributed File Systems
- One of the most common uses of distributed systems (and RPCs).
- Basic idea: given a set of disks attached to different nodes, share the disks between nodes as if all the disks were attached to every node.
- Examples:
  - Edlab: one server node with all the disks, and a bunch of diskless workstations on a LAN.
  - AppleShare: every node is both a server with a disk and a client.

Distributed File Systems: Issues
- Naming and transparency
- Remote file access
- Caching
- Stateless or stateful servers

Naming and Transparency
- Issues:
  - How are files named?
  - Do file names reveal their location?
  - Do file names change if the file moves?
  - Do file names change if the user moves?
- Location transparency: the name of the file does not reveal the physical storage location.
- Location independence: the name of the file need not change if the file's storage location changes.
- Most naming schemes used in practice do not have location independence, but many have location transparency.

Naming Strategies: Absolute Names
- Absolute names: <machine name: path name>
- Examples: AppleShare, Win NT
- Advantages:
  - Finding a fully specified file name is simple.
  - It is easy to add and delete names.
  - No global state.
  - Scales easily.
- Disadvantages:
  - The user must know the complete name and is aware of which files are local and which are remote.
  - The file is location dependent, and thus cannot move.
  - Makes sharing harder.
  - Not fault tolerant.

Naming Strategies: Mount Points
- Mount points (NFS, Sun's Network File System)
- Each host has a set of local names for remote locations.
- Each host has a mount table (/etc/fstab) that specifies a <remote path name @ machine name> and a <local path name>.
- At boot time, the local name is bound to the remote name.
- Users then refer to the <local path name> as if it were local, and NFS takes care of the mapping.
- Advantages: location transparent; the remote name can change across reboots.
- Disadvantages: a single unified strategy is hard to maintain; the same file can have different names on different hosts.

NFS: Example
Partial contents of /etc/fstab for Edlab machines:
  /usr1/mail@elux3.cs.umass.edu:/var/spool/mail
  /users/users1@elsrv1:/users/users1
  /users/users2@elsrv1:/users/users2
  /users/users3@elsrv2:/users/users3
  /users/users4@elsrv2:/users/users4
  /courses/cs300@elsrv3:/courses/cs300
  /rcf/mipsel/4.2/share@elsrv1:/exp/rcf/share
  /rcf/common@elsrv1:/exp/rcf/common

NFS: Example (figure)
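The mount-point lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entries use the slide's `<remote path>@<machine>:<local path>` notation (not real /etc/fstab syntax), and the function names `parse_entry`, `build_mount_table`, and `resolve` are hypothetical.

```python
# Illustrative sketch only: entries follow the slide's notation
# <remote path>@<machine>:<local path>, not real /etc/fstab syntax.
ENTRIES = [
    "/usr1/mail@elux3.cs.umass.edu:/var/spool/mail",
    "/users/users1@elsrv1:/users/users1",
    "/courses/cs300@elsrv3:/courses/cs300",
]


def parse_entry(entry):
    """Split one entry into (local_path, machine, remote_path)."""
    remote, local = entry.rsplit(":", 1)
    remote_path, machine = remote.split("@", 1)
    return local, machine, remote_path


def build_mount_table(entries):
    """Map each local mount point to (machine, remote_path)."""
    table = {}
    for entry in entries:
        local, machine, remote_path = parse_entry(entry)
        table[local] = (machine, remote_path)
    return table


def resolve(table, path):
    """Return (machine, remote_path) for a path under a mount point,
    or None when the path is purely local."""
    best = None
    for local in table:
        # Longest matching mount point wins.
        if path == local or path.startswith(local + "/"):
            if best is None or len(local) > len(best):
                best = local
    if best is None:
        return None
    machine, remote_root = table[best]
    return machine, remote_root + path[len(best):]


table = build_mount_table(ENTRIES)
print(resolve(table, "/var/spool/mail/alice"))
print(resolve(table, "/tmp/scratch"))
```

The user only ever types the local path; the table decides which machine and remote path it refers to, which is why the names are location transparent but can differ from host to host.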

Naming Strategies: Global Name Space
- Single name space: CMU's Andrew and Berkeley's Sprite.
- No matter which node you are on, the file names are the same.
- Set of workstation clients, and a set of dedicated file server machines.
- When a client starts up, it gets its file name structure from a server.
- As users access files, the server sends copies to the workstation and the workstation caches the files.
- E.g., /afs/fileserver.cs.umass.edu/usr/data/file.txt

Global Name Space
- Advantages:
  - Naming is consistent and easy to keep consistent.
  - The global name space ensures all the files are the same regardless of where you log in.
  - Since names are bound late, moving files is easier.
- Disadvantages:
  - It is difficult for the OS to keep file contents consistent due to caching.
  - Global name space may limit flexibility.
  - Performance problems.

Remote File Access and Caching
- Once the user specifies a remote file, the OS can do the access either
  1. remotely, on the server machine, and then return the results using RPC (remote access model), or
  2. by transferring the file (or part of the file) to the requesting host and performing local accesses (caching model).
- Caching issues:
  - Where and when are file blocks cached?
  - When are modifications propagated back to the remote file?
  - What happens if multiple clients cache the same file?

Remote File Access and Caching
- Location: local memory
- Advantages:
  - Fastest access time.
  - Works with diskless workstations.
- Disadvantages:
  - Difficult to keep local copy consistent with remote copy.
  - Does not tolerate node failure well.
  - Limited cache size.

Remote File Access and Caching
- Location: local disk
- Advantages:
  - Safer if node fails.
- Disadvantages:
  - Difficult to keep local copy consistent with remote copy.
  - Slower than just keeping it in local memory.
  - Requires client to have a disk.

Cache Update Policies
- When to write local changes to the server has a central role in determining distributed file system performance.
- Write through: yields the most reliable results since every write hits the remote disk before the process continues, but it has the poorest performance.
  - Caching with write through is equivalent to using remote service for all writes, and exploits caching only for reads.
- Write back: yields the quickest response time since the write need only hit the cache before the process continues.
  - It reduces network traffic and the number of writes to the disk for repeated writes to the same disk block, since only one of the writes will go across the network.
  - If a user machine crashes, the unwritten data is lost.
  - Dirty blocks are written back when the file is closed, when a block is evicted from the cache, or every 30 seconds.
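The difference between the two update policies can be illustrated with a small Python sketch that counts how many writes reach the server. The classes `Server` and `ClientCache` and the function `run` are hypothetical names for this illustration; a real client would also flush on eviction and on a timer, which this sketch leaves out.

```python
class Server:
    """Stands in for the remote file server; counts writes it receives."""

    def __init__(self):
        self.blocks = {}
        self.writes_received = 0

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.writes_received += 1


class ClientCache:
    """Client-side block cache with a selectable update policy."""

    def __init__(self, server, write_through):
        self.server = server
        self.write_through = write_through
        self.cache = {}
        self.dirty = set()  # blocks modified locally but not yet sent

    def write(self, block_no, data):
        self.cache[block_no] = data
        if self.write_through:
            # Write through: every write goes to the server immediately.
            self.server.write(block_no, data)
        else:
            # Write back: only mark the block dirty for a later flush.
            self.dirty.add(block_no)

    def flush(self):
        """Send each dirty block once (e.g. when the file is closed)."""
        for block_no in sorted(self.dirty):
            self.server.write(block_no, self.cache[block_no])
        self.dirty.clear()


def run(write_through):
    server = Server()
    client = ClientCache(server, write_through)
    for i in range(5):
        client.write(7, f"version {i}")  # repeated writes to one block
    client.flush()
    return server.writes_received, server.blocks[7]


print(run(write_through=True))   # (5, 'version 4')
print(run(write_through=False))  # (1, 'version 4')
```

With write back, the five repeated writes collapse into one network write, but anything still in `dirty` would be lost if the client crashed before `flush`.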

Cache Consistency
- Client-initiated consistency: the client contacts the server and asks if its copy is consistent with the server's copy.
  - Can check every access.
  - Can check at a given interval.
  - Can check only upon opening a file.
- Server-initiated consistency: the server detects potential conflicts and invalidates caches.
  - The server needs to know:
    - which clients have cached which parts of which files.
    - which clients are readers and which are writers.

Server State and Replication
- Stateful versus stateless server
  - Web analogy
  - Tradeoff between performance and tolerance to crash faults
- Replication
  - Server data is replicated across machines.
  - Need to ensure consistency of files when a file is updated on one server.
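Server-initiated consistency requires the server to remember who caches what, which is exactly the kind of state a stateless server does not keep. The following Python sketch shows one possible shape of that bookkeeping at whole-file granularity; the class and method names are hypothetical, and it ignores failures and the reader/writer distinction.

```python
class Server:
    """Stateful server: tracks which clients hold a cached copy."""

    def __init__(self):
        self.files = {}
        self.cachers = {}  # file name -> set of clients caching it

    def read(self, client, name):
        self.cachers.setdefault(name, set()).add(client)
        return self.files.get(name)

    def write(self, writer, name, data):
        self.files[name] = data
        # Invalidate every cached copy except the writer's own.
        for client in self.cachers.get(name, set()) - {writer}:
            client.invalidate(name)
        self.cachers[name] = {writer}


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(self, name)
        return self.cache[name]

    def write(self, name, data):
        self.cache[name] = data
        self.server.write(self, name, data)

    def invalidate(self, name):
        # Called by the server when another client changes the file.
        self.cache.pop(name, None)


server = Server()
a, b = Client(server), Client(server)
a.write("notes.txt", "v1")
print(b.read("notes.txt"))       # v1, now cached at b
a.write("notes.txt", "v2")       # server invalidates b's copy
print("notes.txt" in b.cache)    # False
print(b.read("notes.txt"))       # v2, fetched again from the server
```

If the server crashed and lost `cachers`, it could no longer invalidate stale copies, which illustrates the tradeoff between performance and crash tolerance noted above.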

Case Study: Sun's Network File System (NFS)
- NFS is the standard for distributed UNIX file access.
- NFS is designed to run on LANs (but works on WANs).
- Nodes are both servers and clients.
- Servers have no state (NFS v3 and earlier; NFS v4 is stateful).
- Uses a mount protocol to make a global name local.
  - /etc/exports lists the local names the server is willing to export.
  - /etc/fstab lists the global names that the local nodes import. A corresponding global name must be in /etc/exports on the server.

NFS Implementation
- NFS defines a set of RPC operations for remote file access:
  1. directory search, reading directory entries
  2. manipulating links and directories
  3. accessing file attributes
  4. reading/writing files
- Does not rely on node homogeneity: heterogeneous nodes must simply support the NFS mount and remote access protocols using RPC.
- Users may need to know different names depending upon the node to which they log on.
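A stateless server implies that every request carries everything the server needs to carry it out. The Python sketch below illustrates that idea for reads only; it is not the NFS wire protocol, and the names `StatelessServer`, `Client`, and the handle string `"fh1"` are assumptions made for the example.

```python
class StatelessServer:
    """Keeps no per-client state: no open-file table, no file positions."""

    def __init__(self, files):
        self.files = files  # handle -> bytes; stands in for on-disk data

    def read(self, handle, offset, count):
        # Every request names the file, the position, and the length.
        data = self.files[handle]
        return data[offset:offset + count]


class Client:
    """The client, not the server, remembers the current file position."""

    def __init__(self, server, handle):
        self.server = server
        self.handle = handle
        self.offset = 0

    def read(self, count):
        chunk = self.server.read(self.handle, self.offset, count)
        self.offset += len(chunk)
        return chunk


disk = {"fh1": b"hello world"}
client = Client(StatelessServer(disk), "fh1")
print(client.read(5))  # b'hello'

# Simulate a server crash and restart: a fresh server object, same disk.
client.server = StatelessServer(disk)
print(client.read(6))  # b' world'
```

Because the offset travels with each request, the restarted server can answer the second read without any recovery step.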

NFS Implementation (figure)

NFS Implementation
- The "buffer cache" caches remote file blocks and attributes.
- On an open, the client asks the server whether its cached blocks are up to date.
- Once a file is open, different clients can write to it and get inconsistent data.
- Modified data is flushed back to the server every 30 seconds.
- What file contents do new clients see?
  - The effects of the last flush. Writers might have made changes but not updated the remote file yet.
- What file contents do existing clients see?
  - For cached blocks, they see out-of-date information.
  - For new blocks, the same as a new client.
- NFS v4: stateful protocol
  - Leased file locks are given to clients.
  - Allows more work to proceed locally without talking to the NFS server.
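The idea of a leased lock can be sketched as a lock that expires on its own unless renewed. The Python below is a minimal illustration and not the NFS v4 mechanism itself; `LeaseServer`, `acquire`, and the 30-second lease length are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class LeaseServer:
    """Grants time-limited locks; an expired lease can be taken over."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.leases = {}  # file name -> (holder id, expiry time)

    def acquire(self, client_id, name, now):
        """Grant or renew a lease; return False if another client
        still holds an unexpired lease on the file."""
        current = self.leases.get(name)
        if current is not None:
            holder, expiry = current
            if holder != client_id and expiry > now:
                return False
        self.leases[name] = (client_id, now + self.lease_seconds)
        return True


server = LeaseServer(lease_seconds=30)
print(server.acquire("A", "file.txt", now=0))   # True: A holds it until t=30
print(server.acquire("B", "file.txt", now=10))  # False: A's lease is live
print(server.acquire("A", "file.txt", now=20))  # True: renewed until t=50
print(server.acquire("B", "file.txt", now=50))  # True: A's lease has expired
```

While a client holds a live lease it can work on its cached copy without contacting the server, and if the client crashes the lock frees itself once the lease runs out.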

Summary
- Distributed file systems are one of the most common examples of distributed systems.
- Naming:
  - Location independence of names is desirable, but it is difficult to attain.
  - Location-dependent names are most prevalent.
- Speed up remote file access with caching.
  - Need to write changes back to disk and address cache consistency.