Lecture 15: Network File Systems

CSE 120: Principles of Operating Systems
Alex C. Snoeren
Lab 3 due 12/1

Network File System
- Simple idea: access disks attached to other computers
  - Share the disk with many other machines as well
- NFS is a protocol for remote access to a file system
  - Does not implement a file system per se
  - Remote access is transparent to applications
- File system, OS, and architecture independent
  - Originally developed by Sun
  - Although Unix-y in flavor, explicit goal to work beyond Unix
- Client/server architecture
  - Local file system requests are forwarded to a remote server
  - Requests are implemented as remote procedure calls (RPCs)

Remote Procedure Call
- RPC is a common means of remote communication
- It is used both by operating systems and applications
  - NFS is implemented as a set of RPCs
  - DCOM, CORBA, Java RMI, etc., are all basically just RPC
- Someday (soon?) you will most likely have to write an application that uses remote communication (or you already have)
  - You will most likely use some form of RPC for that remote communication
  - So it's good to know how all this RPC stuff works
- Let's focus on RPC before getting back to NFS
  - Once we have RPC, NFS isn't too hard to build

Clients and Servers
- The prevalent model for structuring distributed computation is the client/server paradigm
- A server is a program (or collection of programs) that provides a service (file server, name service, etc.)
  - The server may exist on one or more nodes
  - Often the node is called the server, too, which is confusing
- A client is a program that uses the service
  - A client first binds to the server (locates it and establishes a connection to it)
  - A client then sends requests, with data, to perform actions, and the server sends responses, also with data

Messages
- Initially with network programming, people hand-coded messages to send requests and responses
  - Hand-coding messages gets tiresome
  - Need to worry about message formats
  - Have to pack and unpack data from messages
  - Servers have to decode and dispatch messages to handlers
  - Messages are often asynchronous
- Messages are not a very natural programming model
  - Could encapsulate messaging into a library
  - Just invoke library routines to send a message
  - Which leads us to RPC
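To make the tedium of hand-coded messages concrete, here is a minimal sketch of one request/response exchange written by hand. The wire format (a 1-byte opcode followed by two 4-byte integers), the `OP_ADD` constant, and the function names are assumptions made up for this illustration; they do not come from any real protocol.

```python
import struct

# Hypothetical wire format (an assumption for illustration):
#   1-byte opcode, then two 4-byte signed integers, in network byte order.
OP_ADD = 1
REQUEST_FORMAT = "!Bii"

def encode_add_request(x, y):
    """Client side: pack an Add request into bytes by hand."""
    return struct.pack(REQUEST_FORMAT, OP_ADD, x, y)

def handle_request(message):
    """Server side: unpack the request by hand and dispatch on the opcode."""
    opcode, x, y = struct.unpack(REQUEST_FORMAT, message)
    if opcode == OP_ADD:
        return struct.pack("!i", x + y)
    raise ValueError(f"unknown opcode {opcode}")

# The "network" here is just a function call; a real program would use sockets.
reply = handle_request(encode_add_request(3, 4))
(result,) = struct.unpack("!i", reply)
print(result)  # prints 7
```

Every new operation would need its own format string, encoder, and dispatch branch, which is the repetitive work an RPC system automates.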

Procedure Calls
- Procedure calls are a more natural way to communicate
  - Every language supports them
  - Semantics are well-defined and understood
  - Natural for programmers to use
- Idea: have servers export a set of procedures that can be called by client programs
  - Similar to module interfaces, class definitions, etc.
- Clients just do a procedure call as if they were directly linked with the server
  - Under the covers, the procedure call is converted into a message exchange with the server

Remote Procedure Calls
- So, we would like to use procedure call as a model for distributed (remote) communication
- Lots of issues
  - How do we make this invisible to the programmer?
  - What are the semantics of parameter passing?
  - How do we bind (locate, connect to) servers?
  - How do we support heterogeneity (OS, arch, language)?
  - How do we make it perform well?

RPC Model
- A server defines the server's interface using an interface definition language (IDL)
  - The IDL specifies the names, parameters, and types for all client-callable server procedures
- A stub compiler reads the IDL and produces two stub procedures for each server procedure (client and server)
  - The server programmer implements the server procedures and links them with the server-side stubs
  - The client programmer implements the client program and links it with the client-side stubs
  - The stubs are responsible for managing all details of the remote communication between client and server

RPC Stubs
- A client-side stub is a procedure that looks to the client as if it were a callable server procedure
- A server-side stub looks to the server as if a client called it
- The client program thinks it is calling the server
  - In fact, it's calling the client stub
- The server program thinks it is called by the client
  - In fact, it's called by the server stub
- The stubs send messages to each other to make the RPC happen "transparently"

RPC Example
Server Interface:
    int Add(int x, int y);
Client Program:
    sum = server->Add(3,4);
Server Program:
    int Add(int x, int y) { return x + y; }
- If the server were just a library, then Add would just be a procedure call

RPC Example: Call
Client Program:
    sum = server->Add(3,4);
Client Stub:
    int Add(int x, int y) {
        Alloc message buffer;
        Mark as "Add" call;
        Store x, y into buffer;
        Send message;
    }
RPC Runtime (client):
    Send message to server;
Server Program:
    int Add(int x, int y) {}
Server Stub:
    Add_Stub(Message) {
        Remove x, y from buffer;
        r = Add(x, y);
    }
RPC Runtime (server):
    Receive message;
    Dispatch, call Add_Stub;

RPC Example: Return
Client Program:
    sum = server->Add(3,4);
Client Stub:
    int Add(int x, int y) {
        Create, send message;
        Remove r from reply;
        return r;
    }
RPC Runtime (client):
    Return reply to stub;
Server Program:
    int Add(int x, int y) {}
Server Stub:
    Add_Stub(Message) {
        Remove x, y from buffer;
        r = Add(x, y);
        Store r in buffer;
    }
RPC Runtime (server):
    Send reply to client;
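The call and return paths of the Add example can be put together in one runnable sketch. This is an illustration under stated assumptions, not generated stub code: the procedure number `PROC_ADD`, the `DISPATCH` table, the `AddClient` class, and the use of a plain function call as the "network" are all invented here to keep the example self-contained.

```python
import struct

# --- Server side ---------------------------------------------------------
def add(x, y):
    """The server procedure, written as an ordinary function."""
    return x + y

def add_stub(payload):
    """Server stub: remove x, y from the buffer, call add, store r in the reply."""
    x, y = struct.unpack("!ii", payload)
    r = add(x, y)
    return struct.pack("!i", r)

PROC_ADD = 1                       # hypothetical procedure number
DISPATCH = {PROC_ADD: add_stub}    # runtime's table of server stubs

def server_runtime(message):
    """Server runtime: read the procedure number, dispatch to its stub."""
    (proc,) = struct.unpack("!I", message[:4])
    return DISPATCH[proc](message[4:])

# --- Client side ---------------------------------------------------------
class AddClient:
    """Holds the client stub; `transport` stands in for the network."""
    def __init__(self, transport):
        self.transport = transport

    def add(self, x, y):
        # Mark the message as an Add call, then store x and y in it.
        message = struct.pack("!I", PROC_ADD) + struct.pack("!ii", x, y)
        reply = self.transport(message)          # send, then wait for the reply
        (r,) = struct.unpack("!i", reply)        # remove r from the reply
        return r

server = AddClient(transport=server_runtime)
total = server.add(3, 4)   # looks like a local call to the client program
print(total)               # prints 7
```

The client program only ever sees `server.add(3, 4)`; swapping `server_runtime` for a function that sends bytes over a socket would not change the calling code.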

RPC Marshalling
- Marshalling is the packing of procedure parameters into a message packet
- The RPC stubs call type-specific procedures to marshal (or unmarshal) the parameters to a call
  - The client stub marshals the parameters into a message
  - The server stub unmarshals parameters from the message and uses them to call the server procedure
- On return
  - The server stub marshals the return parameters
  - The client stub unmarshals return parameters and returns them to the client program
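As a rough sketch of what "type-specific procedures" might look like, the functions below marshal and unmarshal two types: a 32-bit integer and a length-prefixed string. The encoding (network byte order, 4-byte length prefix, UTF-8) and the call `lookup(name, flags)` are assumptions chosen for illustration, not the format of any particular RPC system.

```python
import struct

def marshal_int(value):
    """Pack a 32-bit signed integer in network byte order."""
    return struct.pack("!i", value)

def unmarshal_int(buf, offset):
    """Return (value, new_offset) for an integer stored at `offset`."""
    (value,) = struct.unpack_from("!i", buf, offset)
    return value, offset + 4

def marshal_string(text):
    """Pack a string as a 4-byte length followed by its UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack("!I", len(data)) + data

def unmarshal_string(buf, offset):
    """Return (text, new_offset) for a length-prefixed string at `offset`."""
    (length,) = struct.unpack_from("!I", buf, offset)
    start = offset + 4
    return buf[start:start + length].decode("utf-8"), start + length

# Client stub side: marshal the parameters of a hypothetical lookup(name, flags).
message = marshal_string("notes.txt") + marshal_int(42)

# Server stub side: unmarshal them in the same order.
name, pos = unmarshal_string(message, 0)
flags, pos = unmarshal_int(message, pos)
print(name, flags)  # prints notes.txt 42
```

Fixing the byte order and field widths in the marshalling code is one way to cope with clients and servers that run on different architectures.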

RPC Binding
- Binding is the process of connecting the client to the server
- The server, when it starts up, exports its interface
  - Identifies itself to a network name server
  - Tells RPC runtime it's alive and ready to accept calls
- The client, before issuing any calls, imports the server
  - RPC runtime uses the name server to find the location of a server and establish a connection
- The import and export operations are explicit in the server and client programs
  - Breakdown of transparency
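The export/import handshake can be sketched with a toy in-memory name server. Everything named here (`NameServer`, `import_service`, the service name "adder", the host and port) is hypothetical, and the "connection" is just a dictionary; a real runtime would open a network connection at that point.

```python
class NameServer:
    """Toy name server: maps service names to (host, port) addresses."""
    def __init__(self):
        self._services = {}

    def export(self, name, address):
        """Called by a server at startup to advertise its interface."""
        self._services[name] = address

    def lookup(self, name):
        """Called on behalf of a client to locate a server."""
        if name not in self._services:
            raise LookupError(f"no server exports {name!r}")
        return self._services[name]

def import_service(name_server, name):
    """Client-side import: locate the server, then record a 'connection'."""
    host, port = name_server.lookup(name)
    return {"service": name, "host": host, "port": port}

ns = NameServer()
ns.export("adder", ("fileserver.example", 5000))   # server: explicit export
binding = import_service(ns, "adder")               # client: explicit import
print(binding["host"], binding["port"])             # prints fileserver.example 5000
```

Both programs have to make an explicit call before any RPC can happen, which is exactly where the local-procedure-call illusion first breaks.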

RPC Transparency
- One goal of RPC is to be as transparent as possible
  - Make remote procedure calls look like local procedure calls
- We have seen that binding breaks transparency
- What else?
  - Failures: remote nodes/networks can fail in more ways than with local procedure calls
    - Need extra support to handle failures well
  - Performance: remote communication is inherently slower than local communication
    - If program is performance-sensitive, could be a problem
- Now we're ready to continue talking about NFS
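One way to picture the "extra support" for failures is a retry wrapper around a call that can time out, something a local procedure call never needs. This is a minimal sketch under assumptions: `FlakyTransport` simulates lost replies deterministically, and `call_with_retry` and its `max_attempts` parameter are names invented for the example.

```python
class FlakyTransport:
    """Simulated remote Add that loses the first `failures` requests."""
    def __init__(self, failures):
        self.failures = failures
        self.attempts = 0

    def __call__(self, x, y):
        self.attempts += 1
        if self.attempts <= self.failures:
            raise TimeoutError("no reply from server")
        return x + y

def call_with_retry(transport, x, y, max_attempts=3):
    """Retry on timeout; give up with an error the caller must handle."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return transport(x, y)
        except TimeoutError as err:
            last_error = err
    raise ConnectionError("server unreachable") from last_error

transport = FlakyTransport(failures=2)
print(call_with_retry(transport, 3, 4))  # prints 7, on the third attempt
```

Blind retries like this are only safe when repeating the operation is harmless, as it is for Add; the caller also has to be prepared for `ConnectionError`, a failure mode with no local equivalent.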

NFS Binding: Mount
- Before a client can access files on a server, the client must mount the file system on the server
  - The file system is mounted on an empty local directory
  - Same way that local file systems are attached
  - Can depend on OS (e.g., Unix dirs vs NT drive letters)
- Servers maintain ACLs of clients that can mount their directories
  - When mount succeeds, server returns a file handle
  - Clients use this file handle as a capability to do file operations
- Mounts can be cascaded
  - Can mount a remote file system on a remote file system

NFS Protocol
- The NFS protocol defines a set of operations that a server must support
  - Reading and writing files
  - Accessing file attributes
  - Searching for a file within a directory
  - Reading a set of directory links
  - Manipulating links and directories
- These operations are implemented as RPCs
  - Usually by daemon processes (e.g., nfsd)
  - A local operation is transformed into an RPC to a server
  - Server performs operation on its own file system and returns

Statelessness
- Note that NFS has no open or close operations
- NFS is stateless
  - An NFS server does not keep track of which clients have mounted its file systems or are accessing its files
- Each RPC has to specify all information in a request
  - Operation, FS handle, file id, offset in file, sequence #
- Robust
  - No reconciliation needs to be done on a server crash/reboot
  - Clients detect server reboot, continue to issue requests
- Writes must be synchronous to disk, though
  - Clients assume that a write is persistent on return
  - Servers cannot cache writes
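The effect of putting all information in each request can be sketched with a toy server that keeps file data but no per-client state. The class name, the handle string "fh-1", and the dictionary standing in for the disk are assumptions for illustration; real NFS file handles are opaque values issued by the server, and the sequence number is omitted here.

```python
class StatelessFileServer:
    """Toy server: no open-file table, no per-client cursor."""
    def __init__(self, disk):
        self.disk = disk   # maps file handle -> bytes; stands in for stable storage

    def read(self, handle, offset, count):
        """Each request names the file, the offset, and the length."""
        return self.disk[handle][offset:offset + count]

    def write(self, handle, offset, data):
        """Write at an explicit offset, updating the 'disk' before returning."""
        old = self.disk[handle].ljust(offset, b"\0")
        self.disk[handle] = old[:offset] + data + old[offset + len(data):]
        return len(data)

disk = {"fh-1": b"hello world"}
server = StatelessFileServer(disk)
first = server.read("fh-1", 0, 5)

server = StatelessFileServer(disk)      # simulate a crash and reboot; the disk survives
second = server.read("fh-1", 6, 5)      # the client simply keeps issuing requests
print(first, second)                    # prints b'hello' b'world'

server.write("fh-1", 6, b"there")
server.write("fh-1", 6, b"there")       # a retried write leaves the same result
print(disk["fh-1"])                     # prints b'hello there'
```

Because the client carries the offset itself, the rebooted server needs nothing from before the crash, and a request repeated after a lost reply does no harm.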

Consistency
- Since NFS is stateless, consistency is tough
  - NFS can be (mostly) consistent, but limits performance
  - NFS assumes that if you want consistency, applications will use higher-level mechanisms to guarantee it
- Writes are supposed to be atomic
  - But performed in multiple RPCs (larger than a network packet)
  - Simultaneous writes from clients can interleave RPCs (bad)
- Server caching
  - Can cache reads, but we saw that it cannot cache writes

Consistency (2)
- Client caching can lead to consistency problems
  - A write cached on client A will not be seen by other clients
  - Cached writes by clients A and B are unordered at server
  - Since sharing is rare, though, NFS clients usually do cache
- NFS statelessness is both its key to success and its Achilles' heel
  - NFS is straightforward to implement and reason about
  - But limitations on caching can severely limit performance
    - Dozens of network file system designs and implementations that perform much better than NFS
  - But note that it is still the most widely used remote file system protocol and implementation
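The client-caching problem can be reproduced in a few lines. This is a deliberately simplified sketch: `Server`, `CachingClient`, and the write-back `flush` method are invented for the example, and the cache never expires or revalidates entries, which actual NFS clients do handle in some form.

```python
class Server:
    """Toy file server holding one value per file name."""
    def __init__(self, data):
        self.data = dict(data)

    def read(self, name):
        return self.data.get(name)

    def write(self, name, value):
        self.data[name] = value

class CachingClient:
    """Client with a private cache and delayed (write-back) writes."""
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.dirty = set()     # names written locally but not yet sent

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(name)
        return self.cache[name]

    def write(self, name, value):
        self.cache[name] = value
        self.dirty.add(name)

    def flush(self):
        for name in sorted(self.dirty):
            self.server.write(name, self.cache[name])
        self.dirty.clear()

server = Server({"f": "v0"})
a, b = CachingClient(server), CachingClient(server)

a.write("f", "v1")              # cached on A only
seen_before = b.read("f")       # B reads the old value and caches it
a.flush()                       # A's write finally reaches the server
seen_after = b.read("f")        # B still answers from its stale cache
print(seen_before, server.read("f"), seen_after)  # prints v0 v1 v0
```

B misses the write twice: first because A has not flushed, then because B trusts its own cached copy.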

Summary
- NFS allows access to file systems on other machines
  - Implementation uses RPC to access file server
- RPC enables transparent distribution of applications
  - Also used on same node between applications
- RPC is language support for distributed programming
  - Examples include DCOM, CORBA, Java RMI, etc.
  - What else have we seen use language support?
- Stub compiler automatically generates client/server code from the IDL server descriptions
  - These stubs do the marshalling/unmarshalling, message sending/receiving/replying
  - Programmer just implements functions as before