Distributed file systems


Vladimir Vlassov and Johan Montelius
KTH Royal Institute of Technology

What's a file system
Functionality:
- persistent storage of files: create and delete
- manipulating a file: read and write operations
- authorization: who is allowed to do what
- a naming and directory service
The mapping of names to files is quite separate from the rest of the system.

Implementation
- directory: map from name to identifier
- file module: locate file, harder in distributed systems
- access control: interacts with authentication system
- file operations: read and write operations
- block operations
- device operations

What is a file
- a sequence of bytes
- attributes, associated meta-data
  - size and type
  - owner and permissions
  - author
  - created, last written
  - icons, fonts, presentation...

Unix file operations
- create(name, mode) returns a file-descriptor
- open(name, mode) returns a file-descriptor
- close(fd)
- unlink(name)
- link(name, name)
Can we separate the name service from the file operations?

Unix file operations
- read(fd, buffer, n) returns the number of bytes actually read
- write(fd, buffer, n) returns the number of bytes actually written
- lseek(fd, offset, set/cur/end) sets the current position in the file
- stat(name, buffer) reads the file attributes
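
The operations listed above can be tried from Python, whose os module exposes thin wrappers around the corresponding system calls. The following is a minimal sketch; the file name demo.txt and the function name are made up for illustration, and the file is created in a temporary directory that is removed afterwards.

```python
import os
import tempfile


def demo_unix_file_ops():
    """Exercise open/write/lseek/read/stat/unlink through Python's os module."""
    directory = tempfile.mkdtemp()
    name = os.path.join(directory, "demo.txt")   # hypothetical file name

    # open with O_CREAT plays the role of create(name, mode)
    fd = os.open(name, os.O_CREAT | os.O_RDWR, 0o600)
    written = os.write(fd, b"hello world")       # number of bytes actually written
    os.lseek(fd, 6, os.SEEK_SET)                 # set the current position
    data = os.read(fd, 5)                        # read at most 5 bytes from there
    size = os.stat(name).st_size                 # one of the file attributes
    os.close(fd)

    os.unlink(name)                              # remove the name again
    os.rmdir(directory)
    return written, data, size
```

Note that stat and unlink take a name while read, write and lseek take a descriptor, which is the split between naming and file operations that the slides go on to exploit.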

Programming language API
- Operating system operations are not always directly available from a high level language.
- Language libraries buffer write operations to reduce the number of system calls.
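
As an illustration of library-level buffering, the sketch below uses Python's built-in open with an explicit buffer size. The function name and the buffer size of 4096 bytes are arbitrary choices for the example. A small write stays in the user-space buffer until flush is called, so the file on disk is still empty in between.

```python
import os
import tempfile


def buffered_write_demo():
    """Show that a small write is held in the library buffer until flushed."""
    fd, name = tempfile.mkstemp()
    os.close(fd)
    with open(name, "wb", buffering=4096) as f:
        f.write(b"hello")                     # stays in the user-space buffer
        size_before = os.path.getsize(name)   # nothing has reached the file yet
        f.flush()                             # hands the buffered bytes to the OS
        size_after = os.path.getsize(name)
    os.unlink(name)
    return size_before, size_after
```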

Descriptors, table entries and i-nodes
- A process holds a set of open file descriptors; each descriptor holds a pointer to an entry in the table of open files.
- Each file table entry holds a position and a pointer to an inode (index node).
- The inode holds information about where the file blocks are allocated.

Descriptors, table entries and i-nodes
[Figure: processes (process 14 with fd:0, fd:1, fd:2), file table (an entry with offset: 45 and a pointer to an i-node), i-nodes (access, location, size)]
All read, write and lseek operations will change the current position in the table entry.

Two processes open the same file
[Figure: process 14 (fd:2) and process 21 (fd:2) each refer to their own file table entry, with offset: 45 and offset: 10 respectively; both entries point to the same i-node (access, location, size)]
Two processes that open the same file will have independent file table entries.

Nota bene
All threads in a process share the same file descriptors. A forked child process inherits copies of the parent's descriptors, which refer to the same file table entries, so parent and child also share the current position.
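
The difference between independent and shared file table entries can be observed directly. The sketch below opens the same file twice (two entries) and duplicates one descriptor with os.dup. Here dup is used as a stand-in for the fork case, since both yield a second descriptor that refers to the same file table entry. The function name is made up for the example.

```python
import os
import tempfile


def offsets_demo():
    """Return the positions seen through an independent and a shared descriptor."""
    fd0, name = tempfile.mkstemp()
    os.write(fd0, b"0123456789")
    os.close(fd0)

    a = os.open(name, os.O_RDONLY)   # first open: its own file table entry
    b = os.open(name, os.O_RDONLY)   # second open: an independent entry
    c = os.dup(a)                    # new descriptor, same entry as a

    os.read(a, 4)                    # advances the position stored in a's entry

    pos_b = os.lseek(b, 0, os.SEEK_CUR)   # unaffected by the read on a
    pos_c = os.lseek(c, 0, os.SEEK_CUR)   # follows a, since the entry is shared

    for fd in (a, b, c):
        os.close(fd)
    os.unlink(name)
    return pos_b, pos_c
```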

One-copy semantics
Most file systems give us one-copy semantics:
- we expect operations to be visible to everyone and that everyone sees the same file
- if I tell you that the file has been modified, the modification should be visible

An architecture of a distributed file system
Let's define the requirements on a distributed file system.
- Transparency: no difference between local and remote files
  - access: same set of operations
  - location: same name space
  - mobility: allowed to move files without changing the client side
  - performance: close to a non-distributed system
- Concurrency: simultaneous operations by several clients
- Heterogeneity: not locked in to a particular operating system
- Fault tolerance: independent of clients, restartable etc.
- Consistency: one-copy semantics... or?
- Security: access control, who is allowed to do what

Distributed architecture
Separate the directory service from the file service.
[Figure: on the client side, the application layer sits on top of the local FS and the client module; the client module sends name resolution requests to the directory service and file operations to the flat file service on the server side]

The directory service
What operations do we need?
- lookup a file identifier given a name and directory
- link a name to a file identifier in a directory
- remove a link
- list all names in a directory
The directory service neither creates nor manipulates files.
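
The four operations can be sketched as a small in-memory class. This is only an illustration under simplifying assumptions: the class and method names are invented here, directories are identified by arbitrary hashable values, and nothing is stored persistently.

```python
class DirectoryService:
    """Maps (directory, name) to file identifiers; never touches file data."""

    def __init__(self):
        self._dirs = {}   # directory id -> {name: file id}

    def link(self, directory, name, file_id):
        """Bind a name to a file identifier in a directory."""
        entries = self._dirs.setdefault(directory, {})
        if name in entries:
            raise FileExistsError(name)
        entries[name] = file_id

    def lookup(self, directory, name):
        """Return the file identifier bound to a name."""
        try:
            return self._dirs[directory][name]
        except KeyError:
            raise FileNotFoundError(name) from None

    def unlink(self, directory, name):
        """Remove a link; the file itself is left alone."""
        try:
            del self._dirs[directory][name]
        except KeyError:
            raise FileNotFoundError(name) from None

    def list_names(self, directory):
        """Return all names in a directory, sorted for stable output."""
        return sorted(self._dirs.get(directory, {}))
```

Because the service only stores identifiers, the same file identifier can be linked under several names without the file service being involved.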

The file service
What operations should be provided?
- create a file and allocate a file identifier
- delete a file
- read from a file identified by a file identifier
- write a number of bytes to a file identified by a file identifier
Do we need an open operation?
- What does open do in Unix?
- What do we need if we don't have an open operation?
- What would the benefit be?
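
One possible answer to the questions above is a file service without open, where every request carries the file identifier and the position explicitly. The sketch below assumes an in-memory store and invented names. It is meant to show the shape of such an interface, not how any particular system implements it.

```python
import itertools


class FlatFileService:
    """No open or close: every request names the file and the position."""

    def __init__(self):
        self._files = {}                     # file id -> bytearray
        self._next_id = itertools.count(1)

    def create(self):
        """Create an empty file and allocate a file identifier."""
        file_id = next(self._next_id)
        self._files[file_id] = bytearray()
        return file_id

    def delete(self, file_id):
        del self._files[file_id]

    def read(self, file_id, pos, count):
        """Return up to count bytes starting at pos."""
        return bytes(self._files[file_id][pos:pos + count])

    def write(self, file_id, pos, data):
        """Write data at pos, padding with zero bytes if pos is past the end."""
        content = self._files[file_id]
        if pos > len(content):
            content.extend(b"\0" * (pos - len(content)))
        content[pos:pos + len(data)] = data
        return len(data)
```

Since the position is part of the request, repeating a read or write has the same effect as issuing it once, which is convenient when a client retries after a lost reply.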

A stateless server
- What are the benefits of a stateless server?
- How can we maintain a session state while keeping the server stateless?

How do we handle security?
In Unix, permissions are checked when a file is opened; after that, the file can be accessed without further security checks.
If we do not have an open operation, how can we perform authentication and authorization control?

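Client interface - open
- the application calls open(name, r)
- the client module sends lookup(name) to the directory service, which returns a file-id
- the client module creates a virtual i-node that keeps the file-id for future operations
- the client module creates a file table entry and returns a local file descriptor fd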

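Client interface - write
- the application calls write(fd, buffer, i)
- the client module looks up the file-id and provides the user-id and position
- the client module sends write(user-id, file-id, pos, seq) to the server, which replies ok
- the call returns true to the application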

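Client interface - read
- the application calls read(fd, buffer, i)
- the client module looks up the file-id and provides the user-id and position
- the client module sends read(user-id, file-id, pos, count) to the server, which replies with seq
- the call returns true to the application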

Client interface - close
- the application calls close(fd)
- the client module removes the file table entry for fd
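
The four client interface slides can be summarised in one sketch of a client module that keeps the session state (file-id and position) locally and turns each call into a self-contained request. All names are hypothetical. StubServer merges the directory service and the file service into one in-memory object, and it accepts a user-id without actually checking it.

```python
class StubServer:
    """Hypothetical stand-in for the directory service and flat file service."""

    def __init__(self):
        self.names = {"notes.txt": 1}             # name -> file id
        self.files = {1: bytearray(b"hello")}     # file id -> content

    def lookup(self, name):
        return self.names[name]

    def read(self, user_id, file_id, pos, count):
        # A real server would authorize user_id here, on every request.
        return bytes(self.files[file_id][pos:pos + count])

    def write(self, user_id, file_id, pos, data):
        self.files[file_id][pos:pos + len(data)] = data
        return True


class ClientModule:
    """Keeps file id and position per descriptor on the client side."""

    def __init__(self, server, user_id):
        self.server = server
        self.user_id = user_id
        self.table = {}       # fd -> [file id, position]
        self.next_fd = 3      # 0-2 are conventionally the standard streams

    def open(self, name):
        file_id = self.server.lookup(name)
        fd = self.next_fd
        self.next_fd += 1
        self.table[fd] = [file_id, 0]
        return fd

    def read(self, fd, count):
        file_id, pos = self.table[fd]
        data = self.server.read(self.user_id, file_id, pos, count)
        self.table[fd][1] = pos + len(data)    # advance the local position
        return data

    def write(self, fd, data):
        file_id, pos = self.table[fd]
        self.server.write(self.user_id, file_id, pos, data)
        self.table[fd][1] = pos + len(data)
        return len(data)

    def close(self, fd):
        del self.table[fd]    # purely local; the server is not contacted
```

In this arrangement open and close never create state on the file server, which is what allows the server to stay stateless while the application still sees the familiar descriptor-based interface.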

Performance
Everything would be fine, if it was not for performance.
Keep a local copy of the file at the client side.

Caching - options
Reading from a file: how do we know it is the most current?
- check validity when reading... if you haven't done so in a while
- server should invalidate copy
Writing to a file:
- write-through: write to cache and server
- write-back: write to cache only
- write-around: write only to server
Caching could break the one-copy semantics.
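
The three write policies differ only in where the new data lands at the time of the write. The sketch below makes this concrete for a single cached file. The class name is invented, and a plain dict stands in for the server.

```python
class CachedFile:
    """One cached file on a client; `server` is a dict standing in for the server."""

    def __init__(self, server, key, policy):
        self.server = server
        self.key = key
        self.policy = policy
        self.cache = None     # cached content, or None if nothing is cached
        self.dirty = False    # True if the cache holds data the server lacks

    def write(self, data):
        if self.policy == "write-through":
            self.cache = data
            self.server[self.key] = data     # cache and server updated together
        elif self.policy == "write-back":
            self.cache = data
            self.dirty = True                # server is updated later, on flush
        elif self.policy == "write-around":
            self.server[self.key] = data
            self.cache = None                # drop any stale cached copy
        else:
            raise ValueError(self.policy)

    def flush(self):
        """Push pending write-back data to the server."""
        if self.dirty:
            self.server[self.key] = self.cache
            self.dirty = False
```

With write-back, other clients cannot see the modification until flush runs, which is exactly the window in which one-copy semantics is broken.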

NFS - Network File System
- developed by Sun, 1984
- targeting department networks
- implemented using RPC (Open Network Computing)
- public API: RFC 1094, 1813, 3530
- originally used UDP; later versions support TCP to improve performance over WAN
- mostly used with UNIX systems, but clients are available for all platforms

NFS - client side caching
Reading from a file:
- first read will copy a segment (8kB) to the client
- the copy is valid for a time t (3-30 sec)
- if more time has elapsed, the validity is checked again
Writing to a file:
- write-back: write to the cache only
- schedule written segment to be copied to the server
- segment copied on timeout or when the file is closed (sync)
The server is stateless.
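
The read side of this scheme, trusting a copy for a time t and revalidating afterwards, can be sketched as follows. This is a simplified model rather than the NFS protocol itself: whole files are cached instead of 8kB segments, the server interface (getattr returning a modification stamp, read returning data and stamp) is an assumption for the example, and the clock is passed in so that the behaviour can be tested.

```python
class FakeServer:
    """Hypothetical server holding one version of every file."""

    def __init__(self):
        self.data = b"v1"
        self.mtime = 1

    def getattr(self, file_id):
        return self.mtime

    def read(self, file_id):
        return self.data, self.mtime


class ValidatingCache:
    """Client cache that trusts an entry for `freshness` seconds."""

    def __init__(self, server, freshness, clock):
        self.server = server
        self.freshness = freshness   # the interval t
        self.clock = clock           # callable returning the current time
        self.entries = {}            # file id -> (data, mtime, last validated)

    def read(self, file_id):
        now = self.clock()
        entry = self.entries.get(file_id)
        if entry is not None:
            data, mtime, validated = entry
            if now - validated < self.freshness:
                return data          # trusted without asking the server
            if self.server.getattr(file_id) == mtime:
                self.entries[file_id] = (data, mtime, now)
                return data          # still current; restart the timer
        data, mtime = self.server.read(file_id)
        self.entries[file_id] = (data, mtime, now)
        return data
```

Within the interval t a client may return stale data, so this design trades strict one-copy semantics for fewer round trips to the server.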

AFS - Andrew File System
- developed by Carnegie Mellon University
- clients for most platforms: OpenAFS (from IBM), Arla (a KTH implementation)
- used mainly in WAN (Internet) where the overhead of NFS would be prohibitive
- caching of whole files and infrequent sharing of writable files

AFS - client side caching
Reading from a file:
- copy the whole file from the server (or 64kB)
- receive a call-back promise
- the file is valid if the promise exists and is not too old (minutes)
Writing to a file:
- write-back: write to the cache only
- file copied to the server when closed (sync)
- server will invalidate all existing promises
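
The call-back mechanism can be sketched with a server that remembers which clients hold a promise for each file and breaks those promises when a new version is stored. The class and method names are invented for this illustration, and the expiry of old promises mentioned above is left out to keep the example short.

```python
class CallbackServer:
    """Remembers which clients hold a promise for each file."""

    def __init__(self):
        self.files = {}       # name -> bytes
        self.promises = {}    # name -> set of clients holding a promise

    def fetch(self, client, name):
        """Hand out the whole file together with a call-back promise."""
        self.promises.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, client, name, data):
        """Accept a new version and break the promises of all other clients."""
        self.files[name] = data
        for other in self.promises.pop(name, set()):
            if other is not client:
                other.break_callback(name)
        self.promises[name] = {client}    # the writer's copy is current


class CallbackClient:
    """Caches whole files; a cached copy is valid while the promise holds."""

    def __init__(self, server):
        self.server = server
        self.cache = {}       # name -> bytes

    def open_read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.fetch(self, name)
        return self.cache[name]

    def write_close(self, name, data):
        self.cache[name] = data               # write-back into the local cache
        self.server.store(self, name, data)   # copied to the server on close

    def break_callback(self, name):
        self.cache.pop(name, None)            # promise broken: discard the copy
```

Unlike the timeout-based validation used for NFS above, the server here has to keep state about its clients, in exchange for readers not having to poll.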

SMB/CIFS
- Server Message Block (SMB) was originally developed by IBM and then modified by Microsoft; it is now also known as the Common Internet File System (CIFS).
- not only file sharing but also name servers, printer sharing etc.
- Samba is an open source reimplementation of SMB by Andrew Tridgell

SMB/CIFS - client side caching
SMB uses client locks to solve cache consistency.
- A client can open a file and lock it; all read and write operations are then performed in the client cache.
- A read-only lock allows multiple clients to cache and read a file.
- Locks can be revoked by the server, forcing the client to flush any changes.
In an unreliable or high-latency network, locking can be dangerous and counterproductive.

Summary
- separate directory service from file service
- maintain a view of only one file, one-copy semantics
- caching is key to performance but could make the one-copy view hard to maintain