Distributed Systems Distributed Objects and Distributed File Systems


1 Distributed Systems Distributed Objects and Distributed File Systems Learning Goals: Understand distributed object systems (briefly). Understand Amazon Simple Storage Service. Understand distributed file systems. Understand NFS, the Network File System developed at Sun (1984). Understand AFS, the Andrew File System developed at CMU (1980s). Compare the above with the Google File System (GFS) (2004). Be able to work with HDFS, the Java-based open source Hadoop Distributed File System (2008). Understand and be able to program with MapReduce. Distributed Systems Coulouris 5th Ed. 1

2 Distributed Objects Distributed Systems Coulouris 5 th Ed. 2

3 The traditional object model (OOP 101) Each object is a set of data and a set of methods. Object references are assigned to variables. Interfaces define an object's methods. Actions are initiated by invoking methods. Exceptions may be thrown for unexpected or illegal conditions. Garbage collection may be handled by the developer (C++) or by the runtime (.NET and Java). We have inheritance and polymorphism. We want similar features in the distributed case. Distributed Systems Coulouris 5th Ed. 3

4 The distributed object model Having client and server objects in different processes enforces encapsulation: you must call a method to change an object's state. Methods may be synchronized to protect against conflicting access by multiple clients. Objects are accessed remotely through RMI, or objects are copied to the local machine (if the object's class is available locally) and used locally. Remote object references are analogous to local ones in that: 1. The invoker uses the remote object reference to identify the object, and 2. The remote object reference may be passed as an argument to, or returned as a value from, a local or remote method. Distributed Systems Coulouris 5th Ed. 4

5 A remote object and its remote interface [Figure: a remote object holds its Data and the implementation of methods m1..m6; its remote interface exposes only m1, m2 and m3.] Enterprise Java Beans (EJBs) provide a remote and a local interface. EJBs are a component-based middleware technology. EJBs live in a container. The container provides a managed server-side hosting environment ensuring that non-functional properties are achieved. Middleware supporting the container pattern is called an application server. Quiz: Describe some non-functional concerns that would be handled by the application server. Java RMI presents us with plain old distributed objects. Fowler's First Law of Distributed Objects: Don't distribute your objects. Distributed Systems Coulouris 5th Ed. 5

6 Generic RMI [Figure: object A on the client invokes remote object B on the server through a proxy for B. The Request passes through the client's remote reference module and communication module, across the network to the server's communication module and remote reference module, and is dispatched via the skeleton & dispatcher for B's class to remote object B; the Reply travels back along the same path.] Distributed Systems Coulouris 5th Ed. 6

7 Registries promote space decoupling Binders allow an object to be named and registered. Java uses the rmiregistry; CORBA uses the CORBA Naming Service. [Figure: the same generic RMI picture as the previous slide, with client object A, proxy for B, remote reference and communication modules, and the server-side skeleton & dispatcher for remote object B.] Before interacting with the remote object, the RMI registry is used. A minimal Java RMI sketch follows. Distributed Systems Coulouris 5th Ed. 7
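The slides above describe proxies, skeletons and the rmiregistry in the abstract. Below is a minimal sketch of what this looks like in plain Java RMI; the Hello interface, the binding name "HelloServer" and the registry host are made up for illustration and are not from the slides.

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;

// The remote interface: only the methods declared here are remotely invocable.
interface Hello extends Remote {
    String sayHello(String name) throws RemoteException;
}

class HelloClient {
    public static void main(String[] args) throws Exception {
        // Space decoupling: the client needs only the registry location and a name,
        // not a direct reference to the server object.
        Registry registry = LocateRegistry.getRegistry("server.example.com", 1099);
        Hello proxy = (Hello) registry.lookup("HelloServer");
        // The proxy (client-side stub) marshals the call, the server-side dispatcher
        // invokes the remote object, and the reply comes back the same way.
        System.out.println(proxy.sayHello("world"));
    }
}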

8 Registries promote space decoupling [Slide: figure taken from the Sun Microsystems Java RMI documentation.] Distributed Systems Coulouris 5th Ed. 8

9 Distributed Objects: Takeaways Full OOP implies more complexity than web services. Full OOP implies more complexity than RPC. Remember Martin Fowler's First Law of Distributed Objects. Distributed Systems Coulouris 5th Ed. 9

10 Amazon Simple Storage Service (S3) Remote object storage. An object is simply a piece of data in no specific format; it is not like the objects described in RMI. Accessible via REST: PUT, GET, DELETE. Each object has data, a key, user metadata (name-value pairs of your choosing) and system metadata (e.g. time of creation). Objects are placed into buckets. Buckets do not contain sub-buckets. Objects may be versioned (same object, different version, in the same bucket). Object keys are unique within a bucket and live in a flat namespace. Each object has a storage class (high or low performance requirements). Distributed Systems Coulouris 5th Ed. 10

11 Amazon Simple Storage Service (S3) Use Cases Backup is the most popular: three replicas over several regions. Infrequently accessed data and archival storage. Data for a static web site. Source of streamed video data. Source and destination of data for big data applications running on Amazon EC2. Challenge: No locking. S3 provides no capability to serialize access to data. The user application is responsible for ensuring that multiple PUT requests for the same object do not clash with each other. Distributed Systems Coulouris 5th Ed. 11

12 Amazon Simple Storage Service (S3) Consistency Model If you PUT a new object to S3, a subsequent read will see the new object. If you overwrite an existing object with PUT, the change will eventually be reflected elsewhere; a read after a write may see the old value. If you delete an old object, it will eventually be removed; it may briefly appear to be still present after a delete. Amazon S3 is Available and tolerates Partitions between replicas but is only eventually Consistent. This is the A and P in the CAP theorem. Distributed Systems Coulouris 5th Ed. 12

13 Amazon Simple Storage Service (S3) From Amazon An HTTP PUT to a bucket with versioning turned on Distributed Systems Coulouris 5 th Ed. 13

14 Amazon Simple Storage Service (S3) from Amazon An HTTP DELETE on a bucket with versioning turned on. [Figure: the bucket before and after the call to delete.] The delete marker becomes the current version of the object. Distributed Systems Coulouris 5th Ed. 14

15 Amazon Simple Storage Service (S3) from Amazon An HTTP DELETE on a bucket with versioning turned on. Now, if we call GET, we get a 404 Not Found. Distributed Systems Coulouris 5th Ed. 15

16 Amazon Simple Storage Service (S3) From Amazon We can GET a specific version. Say we GET with ID = We can also delete permanently by including a version number in the DELETE request Distributed Systems Coulouris 5 th Ed. 16

17 Amazon Simple Storage Service (S3) From Amazon A Java client accesses the data in an object stored on S3.

AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
InputStream objectData = object.getObjectContent();
// Call a method to read the objectData stream.
display(objectData);
objectData.close();

Quiz: Where is all of the HTTP? You may share objects with others: provide them with a signed URL, and they may access the object for a specific period of time. A sketch of generating such a URL follows. Distributed Systems Coulouris 5th Ed. 17
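The slide mentions sharing via a signed URL. Here is a sketch in the same AWS SDK for Java (v1) style as the snippet above, assuming bucketName and key are defined as before; the one-hour expiry is just an example value.

AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
// The URL embeds a signature; anyone holding it can GET the object until it expires.
java.util.Date expiration = new java.util.Date(System.currentTimeMillis() + 60 * 60 * 1000); // one hour
java.net.URL signedUrl = s3Client.generatePresignedUrl(bucketName, key, expiration);
System.out.println("Share this link: " + signedUrl);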

18 Amazon Simple Storage Service (S3) From Amazon A Java client writes data to an object on S3.

AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
s3Client.putObject(new PutObjectRequest(bucketName, keyName, file));

Distributed Systems Coulouris 5th Ed. 18

19 Distributed File Systems Distributed Systems Coulouris 5 th Ed. 19

20 Figure 12.2 File system modules filedes = open("CoolData\text.txt", "r"); count = read(filedes, buffer, n)
Directory module: relates file names to file IDs
File module: relates file IDs to particular files
Access control module: checks permission for operation requested
File access module: reads or writes file data or attributes
Block module: accesses and allocates disk blocks
Device module: disk I/O and buffering
A typical non-distributed file system's layered organization. Each layer depends only on the layer below it. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

21 Figure 12.3 File attribute record structure: File length, Creation timestamp, Read timestamp, Write timestamp, Attribute timestamp, Reference count, Owner, File type, Access control list. Files contain both data and attributes. The shaded attributes are managed by the file system and not normally directly modified by user programs. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

22 Figure 12.4 UNIX file system operations
filedes = open(name, mode): Opens an existing file with the given name.
filedes = creat(name, mode): Creates a new file with the given name. Both operations deliver a file descriptor referencing the open file. The mode is read, write or both.
status = close(filedes): Closes the open file filedes.
count = read(filedes, buffer, n): Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n): Transfers n bytes to the file referenced by filedes from buffer. Both operations deliver the number of bytes actually transferred and advance the read-write pointer.
pos = lseek(filedes, offset, whence): Moves the read-write pointer to offset (relative or absolute, depending on whence).
status = unlink(name): Removes the file name from the directory structure. If the file has no other names, it is deleted.
status = link(name1, name2): Adds a new name (name2) for a file (name1).
status = stat(name, buffer): Gets the file attributes for file name into buffer.
These operations are implemented in the Unix kernel; they are the operations available in the non-distributed case. Programs cannot observe any discrepancies between cached copies and stored data after an update. This is called strict one-copy semantics. Suppose we want the files to be located on another machine. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

23 Figure 12.5 File service architecture Generic Distributed File System Client computer: application programs and the client module. Server computer: directory service and flat file service. The client module provides a single interface used by applications; it emulates a traditional file system. The flat file service and the directory service both provide an RPC interface used by clients. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

24 Figure 12.6 Flat file service operations
Read(FileId, i, n) -> Data, throws BadPosition: If 1 <= i <= Length(File), reads a sequence of up to n items from a file starting at item i and returns it in Data.
Write(FileId, i, Data), throws BadPosition: If 1 <= i <= Length(File)+1, writes a sequence of Data to a file, starting at item i, extending the file if necessary.
Create() -> FileId: Creates a new file of length 0 and delivers a UFID for it.
Delete(FileId): Removes the file from the file store.
GetAttributes(FileId) -> Attr: Returns the file attributes for the file.
SetAttributes(FileId, Attr): Sets the file attributes (only those attributes that are not shaded in Figure 12.3).
The client module will make calls on these operations, and so will the directory service, acting as a client of the flat file service. Unique File Identifiers (UFIDs) are passed in on all operations except Create(). Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

25 Figure 12.5 File service architecture Generic Distributed File System [Figure: the client module on the client computer invokes the flat file service operations on the server: create() -> fileid, read(fileid, ...), write(fileid, ...), delete(fileid), getattributes(fileid), setattributes(fileid).] Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

26 Figure 12.7 Directory service operations
Lookup(Dir, Name) -> FileId, throws NotFound: Locates the text name in the directory and returns the relevant UFID. If Name is not in the directory, throws an exception.
AddName(Dir, Name, FileId), throws NameDuplicate: If Name is not in the directory, adds (Name, File) to the directory and updates the file's attribute record. If Name is already in the directory, throws an exception.
UnName(Dir, Name), throws NotFound: If Name is in the directory, the entry containing Name is removed from the directory. If Name is not in the directory, throws an exception.
GetNames(Dir, Pattern) -> NameSeq: Returns all the text names in the directory that match the regular expression Pattern.
Primary purpose: translate text names to UFIDs. Each directory is stored as a conventional file, and so this service is itself a client of the flat file service. Once a flat file service and a directory service are in place, it is a simple matter to build client modules that look like UNIX. A Java rendering of these two interfaces follows. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012
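One way to make Figures 12.6 and 12.7 concrete is to write them down as Java interfaces. This is only a sketch: the Java types (UFID, Attr, the exception classes) and the byte[]/List renderings are our own reading of the figures, not part of the Coulouris model.

import java.util.List;

final class UFID { }    // unique file identifier (opaque)
final class Attr { }    // file attribute record, as in Figure 12.3
class BadPositionException extends Exception { }
class NotFoundException extends Exception { }
class NameDuplicateException extends Exception { }

// Figure 12.6: the flat file service, addressed purely by UFID.
interface FlatFileService {
    byte[] read(UFID file, long i, int n) throws BadPositionException;
    void write(UFID file, long i, byte[] data) throws BadPositionException;
    UFID create();                       // the only operation that takes no UFID
    void delete(UFID file);
    Attr getAttributes(UFID file);
    void setAttributes(UFID file, Attr attr);
}

// Figure 12.7: the directory service, which maps text names to UFIDs and is itself
// a client of the flat file service (each directory is stored as a flat file).
interface DirectoryService {
    UFID lookup(UFID dir, String name) throws NotFoundException;
    void addName(UFID dir, String name, UFID file) throws NameDuplicateException;
    void unName(UFID dir, String name) throws NotFoundException;
    List<String> getNames(UFID dir, String pattern);
}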

27 Figure 12.5 File service architecture Generic Distributed File System [Figure: application programs on the client computer call the directory service on the server: lookup(dir, name) -> fileid, addname(dir, name, fileid), unname(dir, name), getnames(dir, pattern). The directory service in turn uses the flat file service.] We have seen this pattern before. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

28 Figure 12.5 File service architecture Generic Distributed File System [Figure: the client module sends a name to the directory service and receives a fileid; it then sends an operation with that fileid to the flat file service and receives data or status.] Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

29 Two models for distributed file system The remote access model The upload/download model Figure 11-1 Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

30 File sharing semantics UNIX semantics, or strict one-copy: a read after a write gets the value just written. (Figure a) Session semantics: changes are initially visible only to the process that modifies the file; changes become visible to others when the file is closed. (Figure b) For session semantics, you might adopt last writer wins or transactions. Transactions make concurrent access appear serial. Tanenbaum and Steen Distributed Systems

31 NFS Goal: Be unsurprising and look like a UNIX FS. Goal: Implement the full POSIX API. The Portable Operating System Interface (POSIX) is an IEEE family of standards that describe how Unix-like operating systems should behave. Goal: Your files are available from any machine. Goal: Distribute the files and we will not have to implement new protocols. NFS has been a major success. NFS was originally based on UDP and was stateless; TCP was added later. NFS defines a virtual file system. The NFS client pretends to be a real file system but is making RPC calls instead. 31

32 To deal with concurrent access, NFS v4 supports clients that require locks on files. A client informs the server of its intent to lock. The server may not grant the lock if it is already held. If granted, the client gets a lease (say, 3 minutes). If a client dies while holding a lock, its lease will expire. The client may renew a lease before the old lease expires. A sketch of this lease cycle follows. 32
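A schematic sketch of the lease discipline just described. The LockServer interface here is hypothetical (real NFSv4 clients handle leases inside the kernel); it only illustrates the grant / renew / expire cycle on the slide.

import java.time.Duration;

interface LockServer {
    boolean grantLock(String fileHandle, Duration lease);   // may refuse if the lock is held
    boolean renewLease(String fileHandle, Duration lease);  // extend before expiry
}

class LeasedLockClient {
    static final Duration LEASE = Duration.ofMinutes(3);    // "say 3 minutes"

    static void withLock(LockServer server, String fh, Runnable work) {
        if (!server.grantLock(fh, LEASE)) {
            throw new IllegalStateException("lock already held by another client");
        }
        Thread renewer = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    // Renew well before the current lease runs out. If this client
                    // crashes, renewals stop and the server simply lets the lease expire.
                    Thread.sleep(LEASE.toMillis() / 2);
                    server.renewLease(fh, LEASE);
                }
            } catch (InterruptedException stopped) { /* finished with the lock */ }
        });
        renewer.start();
        try {
            work.run();              // access the locked file
        } finally {
            renewer.interrupt();     // stop renewing; the lease lapses on the server
        }
    }
}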

33 Figure 12.8 NFS architecture [Figure: on the client computer, application programs make UNIX system calls into the UNIX kernel; a virtual file system module routes local requests to the UNIX file system and remote requests to the NFS client, which talks to the NFS server on the server computer via the NFS protocol; the server's virtual file system hands requests to its UNIX file system.] NFS uses RPC over TCP or UDP. External requests are translated into RPC calls on the server. The virtual file system module provides access transparency. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

34 Figure 12.9 NFS server operations (simplified) 1
lookup(dirfh, name) -> fh, attr: Returns file handle and attributes for the file name in the directory dirfh.
create(dirfh, name, attr) -> newfh, attr: Creates a new file name in directory dirfh with attributes attr and returns the new file handle and attributes.
remove(dirfh, name) -> status: Removes file name from directory dirfh.
getattr(fh) -> attr: Returns file attributes of file fh. (Similar to the UNIX stat system call.)
setattr(fh, attr) -> attr: Sets the attributes (mode, user id, group id, size, access time and modify time) of a file. Setting the size to 0 truncates the file.
read(fh, offset, count) -> attr, data: Returns up to count bytes of data from a file starting at offset. Also returns the latest attributes of the file.
write(fh, offset, count, data) -> attr: Writes count bytes of data to a file starting at offset. Returns the attributes of the file after the write has taken place.
rename(dirfh, name, todirfh, toname) -> status: Changes the name of file name in directory dirfh to toname in directory todirfh.
link(newdirfh, newname, dirfh, name) -> status: Creates an entry newname in the directory newdirfh which refers to file name in the directory dirfh.
Continues on next slide... Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

35 Figure 12.9 NFS server operations (simplified) 2
symlink(newdirfh, newname, string) -> status: Creates an entry newname in the directory newdirfh of type symbolic link with the value string. The server does not interpret the string but makes a symbolic link file to hold it.
readlink(fh) -> string: Returns the string that is associated with the symbolic link file identified by fh.
mkdir(dirfh, name, attr) -> newfh, attr: Creates a new directory name with attributes attr and returns the new file handle and attributes.
rmdir(dirfh, name) -> status: Removes the empty directory name from the parent directory dirfh. Fails if the directory is not empty.
readdir(dirfh, cookie, count) -> entries: Returns up to count bytes of directory entries from the directory dirfh. Each entry contains a file name, a file handle, and an opaque pointer to the next directory entry, called a cookie. The cookie is used in subsequent readdir calls to start reading from the following entry. If the value of cookie is 0, reads from the first entry in the directory.
statfs(fh) -> fsstats: Returns file system information (such as block size, number of free blocks and so on) for the file system containing a file fh.
The directory and file operations are integrated into a single service. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

36 Figure 12.10 Local and remote file systems accessible on an NFS client [Figure: the client's directory tree includes /usr/students and /usr/staff as remote mounts; Server 1 exports /export/people (containing big, jon, bob, ...) and Server 2 exports /nfs/users (containing jim, ann, jane, joe).] Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2. A mount point is a particular point in the hierarchy. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

37 Andrew File System Unlike NFS, the most important design goal is scalability. One enterprise AFS deployment at Morgan Stanley exceeds 25,000 clients. To achieve scalability, whole files are cached in client nodes. Why does this help with scalability? We reduce client-server interactions. A client cache would typically hold several hundred of the files most recently used on that computer. The cache is permanent, surviving reboots. When the client opens a file, the cache is examined and used if the file is available there. AFS provides no support for large shared databases or the updating of records within files that are shared between client systems. 37

38 Andrew File System - Typical Scenario Modified from Coulouris If the client code tries to open a file, the client cache is tried first. If the file is not there, a server is located and asked for the file. The copy is stored on the client side and is opened. Subsequent reads and writes hit the copy on the client. When the client closes the file, if the file has changed it is sent back to the server. The client-side copy is retained for possible further use. Consider UNIX commands and libraries copied to the client. Consider files only used by a single user. These last two cases only require weak consistency, and they represent the vast majority of cases. Gain: Your files are available from any workstation. Principle: Make the common case fast. See Amdahl's Law. Measurements show only 0.4% of changed files were updated by more than one user during one week. A sketch of this open/close path follows. 38
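A schematic sketch of that open/close path. Every name here (VenusSketch, ViceServer, CachedFile) is a stand-in invented for the sketch; the real Venus is a user-level process that intercepts UNIX file system calls.

import java.util.HashMap;
import java.util.Map;

class CachedFile {
    java.io.File localCopy;          // whole-file copy on the local disk
    boolean callbackPromiseValid;    // valid or cancelled
    boolean modified;
}

interface ViceServer {
    CachedFile fetch(String fileName);              // returns the file plus a callback promise
    void store(String fileName, CachedFile copy);   // replace the server's copy on close
}

class VenusSketch {
    private final Map<String, CachedFile> cache = new HashMap<>();
    private final ViceServer vice;

    VenusSketch(ViceServer vice) { this.vice = vice; }

    java.io.File open(String fileName) {
        CachedFile entry = cache.get(fileName);
        if (entry == null || !entry.callbackPromiseValid) {
            // Not cached, or the callback promise was cancelled: fetch the whole file.
            entry = vice.fetch(fileName);
            cache.put(fileName, entry);
        }
        return entry.localCopy;      // subsequent reads and writes hit this local copy
    }

    void close(String fileName) {
        CachedFile entry = cache.get(fileName);
        if (entry != null && entry.modified) {
            vice.store(fileName, entry);   // send the changed file back to the server
        }
        // The cached copy is retained for possible further use.
    }
}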

39 Figure 12.11 Distribution of processes in the Andrew File System [Figure: each workstation runs user programs, the Venus process and the UNIX kernel; the servers run Vice processes on top of the UNIX kernel; workstations and servers communicate over the network.] Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

40 Figure 12.12 File name space seen by clients of AFS [Figure: the client's name space has a Local part (the / root with tmp, bin, ..., vmunix) and a Shared part mounted at /cmu; symbolic links connect local names to directories such as /cmu/bin in the shared space.] Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

41 Figure 12.13 System call interception in AFS [Figure: in a workstation, UNIX file system calls from a user program enter the UNIX kernel; non-local file operations are passed to Venus, while local ones go to the UNIX file system on the local disk.] Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

42 Figure 12.14 Implementation of file system calls in AFS
open(FileName, mode): UNIX kernel: If FileName refers to a file in shared file space, pass the request to Venus. Venus: Check the list of files in the local cache. If the file is not present or there is no valid callback promise, send a request for the file to the Vice server that is custodian of the volume containing the file. Vice: Transfer a copy of the file and a callback promise to the workstation; log the callback promise. Venus: Place the copy of the file in the local file system, enter its local name in the local cache list and return the local name to UNIX. UNIX kernel: Open the local file and return the file descriptor to the application.
read(FileDescriptor, Buffer, length): Perform a normal UNIX read operation on the local copy.
write(FileDescriptor, Buffer, length): Perform a normal UNIX write operation on the local copy.
close(FileDescriptor): UNIX kernel: Close the local copy and notify Venus that the file has been closed. Venus: If the local copy has been changed, send a copy to the Vice server that is the custodian of the file. Vice: Replace the file contents and send a callback to all other clients holding callback promises on the file.
If a client closes a file and the file has changed, Vice makes RPC calls on all other clients to cancel the callback promise. On restart of a failed client (missed callbacks), Venus sends a cache validation request to Vice. A callback promise is a token stored with the cached copy, either valid or cancelled. In the event that two clients both write and then close, the last writer wins. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012 42

43 Figure 12.15 The main components of the Vice service interface
Fetch(fid) -> attr, data: Returns the attributes (status) and, optionally, the contents of the file identified by the fid and records a callback promise on it.
Store(fid, attr, data): Updates the attributes and (optionally) the contents of a specified file.
Create() -> fid: Creates a new file and records a callback promise on it.
Remove(fid): Deletes the specified file.
SetLock(fid, mode): Sets a lock on the specified file or directory. The mode of the lock may be shared or exclusive. Locks that are not removed expire after 30 minutes.
ReleaseLock(fid): Unlocks the specified file or directory.
RemoveCallback(fid): Informs the server that a Venus process has flushed a file from its cache.
BreakCallback(fid): This call is made by a Vice server to a Venus process. It cancels the callback promise on the relevant file.
Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

44 CMU's Coda is an enhanced descendant of AFS. Very briefly, two important features are: Disconnected operation for mobile computing. Continued operation during partial network failures in the server network. During normal operation, a user reads and writes to the file system normally, while the client fetches, or "hoards", all of the data the user has listed as important in the event of network disconnection. If the network connection is lost, the Coda client serves data from its local cache and logs all updates. Upon network reconnection, the client moves to a reintegration state; it sends logged updates to the servers. From Wikipedia. Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

45 Google File System (GFS) Hadoop (HDFS) What is Hadoop? Sort of the opposite of virtual machines where one machine may act like many. Instead, with Hadoop, many machines act as one. Hadoop is an open source implementation of GFS. Microsoft has Dryad with similar goals. At its core, an operating system (like Hadoop) is all about: (a) storing files (b) running applications on top of files From Introducing Apache Hadoop: The Modern Data Operating System, Amr Awadallah 45

46 Figure 21.3 Organization of the Google physical infrastructure (To avoid clutter the Ethernet connections are shown from only one of the clusters to the external links) Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012 Commodity PCs, which are assumed to fail, are grouped into racks; racks are organized into clusters of more than 30 racks. Each PC has more than 2 terabytes of disk, so 30 racks is about 4.8 petabytes. All of Google holds more than 1 exabyte (10^18 bytes). 46

47 Requirements of Google File System (GFS) Run reliably with component failures. Solve problems that Google needs solved: not a massive number of files, but massively large files are common. Write once, append, read many times. Access is dominated by long sequential streaming reads and sequential appends. No need for caching on the client. Throughput is more important than latency. Think of very large files, each holding a very large number of HTML documents scanned from the web. These need to be read and analyzed. This is not your everyday use of a distributed file system (NFS and AFS). Not POSIX. 47

48 GFS Each file is mapped to a set of fixed-size chunks. Each chunk is 64 MB in size. Each cluster has a single master and multiple (usually hundreds of) chunk servers. Each chunk is replicated on three different chunk servers. The master knows the locations of chunk replicas. The chunk servers know what replicas they have and are polled by the master on startup. 48

49 Figure 21.9 Overall architecture of GFS Each GFS cluster has a single master. Manage metadata Hundreds of chunkservers Data is replicated on three independent chunkservers. Locations known by master. With log files, the master is restorable after failure. Instructor s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education

50 GFS Reading a file sequentially Suppose a client wants to perform a sequential read, processing a very large file from a particular byte offset. 1) The client can compute the chunk index from the byte offset. 2) Client calls master with file name and chunk index. 3) Master returns chunk identifier and the locations of replicas. 4) Client makes call on a chunk server for the chunk and it is processed sequentially with no caching. It may ask for and receive several chunks. 50
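A minimal sketch of step 1, the only arithmetic the client does locally: mapping a byte offset to a chunk index using the 64 MB chunk size from the slides. The class and method names are ours; the client/master RPCs in steps 2 to 4 are not shown.

class GfsClientMath {
    static final long CHUNK_SIZE = 64L * 1024 * 1024;    // 64 MB per chunk

    static long chunkIndex(long byteOffset) {
        return byteOffset / CHUNK_SIZE;                   // which chunk holds this offset
    }

    static long offsetWithinChunk(long byteOffset) {
        return byteOffset % CHUNK_SIZE;                   // where to start inside that chunk
    }

    public static void main(String[] args) {
        long offset = 200L * 1024 * 1024;                 // e.g. start reading at 200 MB
        System.out.println("chunk index = " + chunkIndex(offset));                // prints 3
        System.out.println("offset within chunk = " + offsetWithinChunk(offset)); // 8 MB in bytes
    }
}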

51 GFS Mutation operations Suppose a client wants to perform sequential writes to the end of a file. 1) The client can compute the chunk index from the byte offset. This is the chunk holding End Of File. 2) Client calls master with file name and chunk index. 3) Master returns chunk identifier and the locations of replicas. One is designated as the primary. 4) The client sends all data to all replicas. The primary coordinates with replicas to update files consistently across replicas. 51

52 MapReduce 52

53 MapReduce Runs on Hadoop. Provides a clean abstraction on top of parallelization and fault tolerance. Easy to program: the parallelization and fault tolerance are automatic. Google uses over 12,000 different MapReduce programs over a wide variety of problems, and these are often pipelined. The programmer implements two interfaces: one for mappers and one for reducers. Map takes records from the source in the form of key-value pairs. The key might be a document name and the value a document; the key might be a file offset and the value a line of the file. Map produces one or more intermediate values along with an output key. When Map is complete, all of the intermediate values for a given output key are combined into a list. The combiners run on the mapper machines. 53

54 MapReduce Reduce combines the intermediate values into one or more final values for the same output key (usually one final value per key) The master tries to place the mapper on the same machines as the data or nearby. A mapper object is initialized for each map task. In configuring a job, the programmer provides only a hint on the number of mappers to run. The final decision depends on the physical layout of the data. A reducer object is initialized for each reduce task. The reduce method is called once per intermediate key. The programmer can specify precisely the number of reduce tasks. 54

55 MapReduce From the Google Paper Map: (k1, v1) -> list(k2, v2) Reduce: (k2, list(v2)) -> list(k3, v3) All values associated with one key are brought together in the reducer. Final output is written to the distributed file system, one file per reducer. The output may be passed to another MapReduce program. 55

56 56

57 More detail 57

58 Figure Some examples of the use of MapReduce Instructor s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education

59 Figure The overall execution of a MapReduce program Instructor s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education

60 Overall Execution of MapReduce Mappers run on the input data scattered over n machines:
Data on Disk 1 => (key, value) => map1
Data on Disk 2 => (key, value) => map2
...
Data on Disk n => (key, value) => mapn
The map tasks produce (key, value) pairs:
map1 => (key1, value) (key2, value)
map2 => (key1, value) (key2, value) (key3, value) (key1, value)
The output of each map task is collected and sorted on the key. These (key, value list) pairs are passed to the reducers:
(key1, value list) => reducer1 => list(value)
(key2, value list) => reducer2 => list(value)
(key3, value list) => reducer3 => list(value)
Maps run in parallel. Reducers run in parallel. The map phase must be completely finished before the reduce phase can begin. The combiner phase is run on mapper nodes after the map phase; this is a mini-reduce on local map output. For complex activities, it is best to pipe the output of a reducer to another mapper. 60

61 MapReduce to Count Word Occurrences in Many Documents
Disk 1 => (Document name, Document) => map1 (on a machine near disk 1)
Disk 2 => (Document name, Document) => map2 (on a machine near disk 2)
Disk n => (Document name, Document) => mapn
map1 => (ball, 1) (game, 1)
map2 => (ball, 1) (team, 1) (ball, 1)
Gather the map output and sort by key. Send these pairs to the reducers:
(ball, 1,1,1) => reducer => (ball, 3)
(game, 1) => reducer => (game, 1)
(team, 1) => reducer => (team, 1)
61

62 Some MapReduce Examples 1) Count the number of occurrences of each word in a large collection of documents. 2) Distributed GREP: Count the number of lines with a particular pattern. 3) From a web server log, determine URL access frequency. 4) Reverse a web link graph: for a given URL, find the URLs of pages pointing to it. 5) For each word, create a list of documents containing it. (Same as 4.) 6) Distributed sort of a lot of records with keys. 62

63 MapReduce Example (1) Count the number of occurrences of each word in a large collection of documents.
// (K1, V1) -> List(K2, V2)
map(String key, String value)
  // key: document name
  // value: document contents
  for each word w in value
    EmitIntermediate(w, "1")
====================================================
// (K2, List(V2)) -> List(V2)
reduce(String key, Iterator values)
  // key: a word
  // values: a list of counts
  result = 0
  for each v in values
    result += v
  Emit(key, result)
Example: Doc1 = "car bell", Doc2 = "car". The mappers emit (car,1), (bell,1), (car,1); the shuffle produces (bell,[1]), (car,[1,1]); the reducers emit (bell,1), (car,2). 63

64 MapReduce Example (2) Distributed GREP: Count the number of lines with a particular pattern. Suppose searchString is "th".
// (K1, V1) -> List(K2, V2)
map(fileOffset, lineFromFile)
  if searchString in lineFromFile
    EmitIntermediate(lineFromFile, 1)
// (K2, List(V2)) -> List(V2)
reduce(K2, iterator values)
  s = sum up values
  Emit(s, K2)
Input: (0, the line) (8, a line) (14, the store) (22, the line)
Intermediate: (the line, 1), (the store, 1), (the line, 1), grouped as (the line, [1,1]), (the store, [1])
Output: (2, the line), (1, the store) 64

65 MapReduce Example (3) From a web server log, determine URL access frequency. Web page request log: URL1 was visited, URL1 was visited, URL2 was visited, URL1 was visited.
Input: (0, URL1), (45, URL1), (90, URL2), (135, URL1)
// (K1, V1) -> List(K2, V2)
map(offset, url)
  EmitIntermediate(url, 1)
// (K2, List(V2)) -> List(V2)
reduce(url, values)
  sum values into total
  Emit(url, total)
Intermediate: (URL1, 1), (URL1, 1), (URL2, 1), (URL1, 1), grouped as (URL1, [1,1,1]), (URL2, [1])
Output: (URL1, 3), (URL2, 1) 65

66 MapReduce Example (4) 4) Reverse a web link graph: for a given URL, find the URLs of pages pointing to it.
// (K1, V1) -> List(K2, V2)
map(String sourceDocURL, sourceDoc)
  for each target in sourceDoc
    EmitIntermediate(target, sourceDocURL)
Input: (URL1, {P1, P2, P3}) (URL2, {P1, P3})
Intermediate: (P1, URL1), (P2, URL1), (P3, URL1), (P1, URL2), (P3, URL2)
// (K2, List(V2)) -> List(V2)
reduce(target, listOfSourceURLs)
  Emit(target, listOfSourceURLs)
Output: (P1, (URL1, URL2)), (P2, (URL1)), (P3, (URL1, URL2))
5) Same as 4. 66

67 MapReduce Example (6) 6) Distributed sort of a lot of records with keys.
Input: (0, k2, data), (20, k1, data), (30, k3, data)
// (K1, V1) -> List(K2, V2)
map(offset, record)
  sk = find sort key in record
  EmitIntermediate(sk, record)
// (K2, List(V2)) -> List(V2)
reduce emits records unchanged
Intermediate: (k2, data), (k1, data), (k3, data)
Output, sorted on key: (k1, data), (k2, data), (k3, data) 67

68 Recall Example 1 Word Count Count the number of occurrences of each word in a large collection of documents.
// (K1, V1) -> List(K2, V2)
map(String key, String value)
  // key: document name
  // value: document contents
  for each word w in value
    EmitIntermediate(w, "1")
====================================================
// (K2, List(V2)) -> List(V2)
reduce(String key, Iterator values)
  // key: a word
  // values: a list of counts
  result = 0
  for each v in values
    result += v
  Emit(key, result)
Example: Doc1 = "car bell", Doc2 = "car". The mappers emit (car,1), (bell,1), (car,1); the shuffle produces (bell,[1]), (car,[1,1]); the reducers emit (bell,1), (car,2). 68

69 Word Counting in Java - Mapper Using the offset into the file, not the document name.

public static class MapClass extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output,
                  Reporter reporter) throws IOException {
    String line = value.toString();
    StringTokenizer itr = new StringTokenizer(line);
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      output.collect(word, one);
    }
  }
}

Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

70 Word Counting in Java - Reducer

public static class Reduce extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {

  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output,
                     Reporter reporter) throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
  }
}

Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012
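The slides show only the mapper and reducer. A driver along the following lines (using the same old org.apache.hadoop.mapred API as the classes above) wires them into a job; the class name WordCount and the argument paths are illustrative, and MapClass and Reduce are assumed to be the nested classes from the two previous slides.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCount {
    // MapClass and Reduce from the two previous slides are nested here.

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        // Types of the (key, value) pairs emitted by the mapper and the reducer.
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(MapClass.class);
        conf.setCombinerClass(Reduce.class);     // mini-reduce run on each mapper node
        conf.setReducerClass(Reduce.class);
        // conf.setNumReduceTasks(4);            // the reducer count can be set exactly

        // Input files in HDFS and an output directory: one part file per reducer.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}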

71 Computing π Can you think of an embarrassingly parallel approach to approximating the value of π? Randomly throw one thousand darts at 100 square 1 x 1 boards, all with inscribed circles. Count the number of darts landing inside the circles and those landing outside. Compute the area A = (landing inside) / (landing inside + landing outside). We know that A = πr^2 = π(1/2)^2 = π/4. So π = 4A. 71
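A sequential sketch of the dart-throwing estimate, with the dart count raised to one million for a steadier answer. Each dart is a random point in a 1 x 1 board; it is inside the inscribed circle of radius 1/2 when it lies within 0.5 of the centre. In a MapReduce version, each mapper would throw its own batch and emit its inside/outside counts for a single reducer to sum.

import java.util.Random;

class PiEstimate {
    public static void main(String[] args) {
        long darts = 1_000_000;
        long inside = 0;
        Random rng = new Random();
        for (long i = 0; i < darts; i++) {
            double x = rng.nextDouble() - 0.5;     // point in the unit square, centred at (0, 0)
            double y = rng.nextDouble() - 0.5;
            if (x * x + y * y <= 0.25) inside++;   // within radius 1/2: inside the circle
        }
        double a = (double) inside / darts;        // fraction inside is roughly pi/4
        System.out.println("pi is approximately " + 4 * a);
    }
}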


GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures GFS Overview Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures Interface: non-posix New op: record appends (atomicity matters,

More information

Hadoop File System S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y 11/15/2017

Hadoop File System S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y 11/15/2017 Hadoop File System 1 S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y Moving Computation is Cheaper than Moving Data Motivation: Big Data! What is BigData? - Google

More information

Distributed File Systems. Case Studies: Sprite Coda

Distributed File Systems. Case Studies: Sprite Coda Distributed File Systems Case Studies: Sprite Coda 1 Sprite (SFS) Provides identical file hierarchy to all users Location transparency Pathname lookup using a prefix table Lookup simpler and more efficient

More information

Distributed File Systems (Chapter 14, M. Satyanarayanan) CS 249 Kamal Singh

Distributed File Systems (Chapter 14, M. Satyanarayanan) CS 249 Kamal Singh Distributed File Systems (Chapter 14, M. Satyanarayanan) CS 249 Kamal Singh Topics Introduction to Distributed File Systems Coda File System overview Communication, Processes, Naming, Synchronization,

More information

PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS

PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS By HAI JIN, SHADI IBRAHIM, LI QI, HAIJUN CAO, SONG WU and XUANHUA SHI Prepared by: Dr. Faramarz Safi Islamic Azad

More information

Outline. INF3190:Distributed Systems - Examples. Last week: Definitions Transparencies Challenges&pitfalls Architecturalstyles

Outline. INF3190:Distributed Systems - Examples. Last week: Definitions Transparencies Challenges&pitfalls Architecturalstyles INF3190:Distributed Systems - Examples Thomas Plagemann & Roman Vitenberg Outline Last week: Definitions Transparencies Challenges&pitfalls Architecturalstyles Today: Examples Googel File System (Thomas)

More information

Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google fall DIP Heerak lim, Donghun Koo

Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google fall DIP Heerak lim, Donghun Koo Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google 2017 fall DIP Heerak lim, Donghun Koo 1 Agenda Introduction Design overview Systems interactions Master operation Fault tolerance

More information

DISTRIBUTED OBJECTS AND REMOTE INVOCATION

DISTRIBUTED OBJECTS AND REMOTE INVOCATION DISTRIBUTED OBJECTS AND REMOTE INVOCATION Introduction This chapter is concerned with programming models for distributed applications... Familiar programming models have been extended to apply to distributed

More information

Clustering Lecture 8: MapReduce

Clustering Lecture 8: MapReduce Clustering Lecture 8: MapReduce Jing Gao SUNY Buffalo 1 Divide and Conquer Work Partition w 1 w 2 w 3 worker worker worker r 1 r 2 r 3 Result Combine 4 Distributed Grep Very big data Split data Split data

More information

Cloud Computing CS

Cloud Computing CS Cloud Computing CS 15-319 Distributed File Systems and Cloud Storage Part I Lecture 12, Feb 22, 2012 Majd F. Sakr, Mohammad Hammoud and Suhail Rehman 1 Today Last two sessions Pregel, Dryad and GraphLab

More information

Distributed Information Processing

Distributed Information Processing Distributed Information Processing 5 th Lecture Eom, Hyeonsang ( 엄현상 ) Department of Computer Science & Engineering Seoul National University Copyrights 2017 Eom, Hyeonsang All Rights Reserved Outline

More information

The Google File System. Alexandru Costan

The Google File System. Alexandru Costan 1 The Google File System Alexandru Costan Actions on Big Data 2 Storage Analysis Acquisition Handling the data stream Data structured unstructured semi-structured Results Transactions Outline File systems

More information

416 Distributed Systems. Distributed File Systems 1: NFS Sep 18, 2018

416 Distributed Systems. Distributed File Systems 1: NFS Sep 18, 2018 416 Distributed Systems Distributed File Systems 1: NFS Sep 18, 2018 1 Outline Why Distributed File Systems? Basic mechanisms for building DFSs Using NFS and AFS as examples NFS: network file system AFS:

More information

4/9/2018 Week 13-A Sangmi Lee Pallickara. CS435 Introduction to Big Data Spring 2018 Colorado State University. FAQs. Architecture of GFS

4/9/2018 Week 13-A Sangmi Lee Pallickara. CS435 Introduction to Big Data Spring 2018 Colorado State University. FAQs. Architecture of GFS W13.A.0.0 CS435 Introduction to Big Data W13.A.1 FAQs Programming Assignment 3 has been posted PART 2. LARGE SCALE DATA STORAGE SYSTEMS DISTRIBUTED FILE SYSTEMS Recitations Apache Spark tutorial 1 and

More information

Filesystems Lecture 11

Filesystems Lecture 11 Filesystems Lecture 11 Credit: Uses some slides by Jehan-Francois Paris, Mark Claypool and Jeff Chase DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM R. Sandberg, D. Goldberg S. Kleinman, D. Walsh,

More information

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP TITLE: Implement sort algorithm and run it using HADOOP PRE-REQUISITE Preliminary knowledge of clusters and overview of Hadoop and its basic functionality. THEORY 1. Introduction to Hadoop The Apache Hadoop

More information

! Design constraints. " Component failures are the norm. " Files are huge by traditional standards. ! POSIX-like

! Design constraints.  Component failures are the norm.  Files are huge by traditional standards. ! POSIX-like Cloud background Google File System! Warehouse scale systems " 10K-100K nodes " 50MW (1 MW = 1,000 houses) " Power efficient! Located near cheap power! Passive cooling! Power Usage Effectiveness = Total

More information

CS November 2018

CS November 2018 Bigtable Highly available distributed storage Distributed Systems 19. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

Distributed file systems

Distributed file systems Distributed file systems Vladimir Vlassov and Johan Montelius KTH ROYAL INSTITUTE OF TECHNOLOGY What s a file system Functionality: persistent storage of files: create and delete manipulating a file: read

More information

Today s Objec2ves. AWS/MR Review Final Projects Distributed File Systems. Nov 3, 2017 Sprenkle - CSCI325

Today s Objec2ves. AWS/MR Review Final Projects Distributed File Systems. Nov 3, 2017 Sprenkle - CSCI325 Today s Objec2ves AWS/MR Review Final Projects Distributed File Systems Nov 3, 2017 Sprenkle - CSCI325 1 Inverted Index final input files have been posted Another email out to AWS Google cloud Nov 3, 2017

More information

CS 470 Spring Parallel Algorithm Development. (Foster's Methodology) Mike Lam, Professor

CS 470 Spring Parallel Algorithm Development. (Foster's Methodology) Mike Lam, Professor CS 470 Spring 2018 Mike Lam, Professor Parallel Algorithm Development (Foster's Methodology) Graphics and content taken from IPP section 2.7 and the following: http://www.mcs.anl.gov/~itf/dbpp/text/book.html

More information

Google File System (GFS) and Hadoop Distributed File System (HDFS)

Google File System (GFS) and Hadoop Distributed File System (HDFS) Google File System (GFS) and Hadoop Distributed File System (HDFS) 1 Hadoop: Architectural Design Principles Linear scalability More nodes can do more work within the same time Linear on data size, linear

More information

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University CS 555: DISTRIBUTED SYSTEMS [RPC & DISTRIBUTED OBJECTS] Shrideep Pallickara Computer Science Colorado State University Frequently asked questions from the previous class survey XDR Standard serialization

More information

Map Reduce. Yerevan.

Map Reduce. Yerevan. Map Reduce Erasmus+ @ Yerevan dacosta@irit.fr Divide and conquer at PaaS 100 % // Typical problem Iterate over a large number of records Extract something of interest from each Shuffle and sort intermediate

More information

NPTEL Course Jan K. Gopinath Indian Institute of Science

NPTEL Course Jan K. Gopinath Indian Institute of Science Storage Systems NPTEL Course Jan 2012 (Lecture 39) K. Gopinath Indian Institute of Science Google File System Non-Posix scalable distr file system for large distr dataintensive applications performance,

More information

CSE Lecture 11: Map/Reduce 7 October Nate Nystrom UTA

CSE Lecture 11: Map/Reduce 7 October Nate Nystrom UTA CSE 3302 Lecture 11: Map/Reduce 7 October 2010 Nate Nystrom UTA 378,000 results in 0.17 seconds including images and video communicates with 1000s of machines web server index servers document servers

More information

Introduction to HDFS and MapReduce

Introduction to HDFS and MapReduce Introduction to HDFS and MapReduce Who Am I - Ryan Tabora - Data Developer at Think Big Analytics - Big Data Consulting - Experience working with Hadoop, HBase, Hive, Solr, Cassandra, etc. 2 Who Am I -

More information

Introduction to MapReduce

Introduction to MapReduce Basics of Cloud Computing Lecture 4 Introduction to MapReduce Satish Srirama Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed

More information

The MapReduce Abstraction

The MapReduce Abstraction The MapReduce Abstraction Parallel Computing at Google Leverages multiple technologies to simplify large-scale parallel computations Proprietary computing clusters Map/Reduce software library Lots of other

More information

HDFS: Hadoop Distributed File System. Sector: Distributed Storage System

HDFS: Hadoop Distributed File System. Sector: Distributed Storage System GFS: Google File System Google C/C++ HDFS: Hadoop Distributed File System Yahoo Java, Open Source Sector: Distributed Storage System University of Illinois at Chicago C++, Open Source 2 System that permanently

More information

Remote Procedure Call. Tom Anderson

Remote Procedure Call. Tom Anderson Remote Procedure Call Tom Anderson Why Are Distributed Systems Hard? Asynchrony Different nodes run at different speeds Messages can be unpredictably, arbitrarily delayed Failures (partial and ambiguous)

More information

Google Cluster Computing Faculty Training Workshop

Google Cluster Computing Faculty Training Workshop Google Cluster Computing Faculty Training Workshop Module VI: Distributed Filesystems This presentation includes course content University of Washington Some slides designed by Alex Moschuk, University

More information

Clustering Documents. Document Retrieval. Case Study 2: Document Retrieval

Clustering Documents. Document Retrieval. Case Study 2: Document Retrieval Case Study 2: Document Retrieval Clustering Documents Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade April, 2017 Sham Kakade 2017 1 Document Retrieval n Goal: Retrieve

More information

Chapter 18 Distributed Systems and Web Services

Chapter 18 Distributed Systems and Web Services Chapter 18 Distributed Systems and Web Services Outline 18.1 Introduction 18.2 Distributed File Systems 18.2.1 Distributed File System Concepts 18.2.2 Network File System (NFS) 18.2.3 Andrew File System

More information

MapReduce Spark. Some slides are adapted from those of Jeff Dean and Matei Zaharia

MapReduce Spark. Some slides are adapted from those of Jeff Dean and Matei Zaharia MapReduce Spark Some slides are adapted from those of Jeff Dean and Matei Zaharia What have we learnt so far? Distributed storage systems consistency semantics protocols for fault tolerance Paxos, Raft,

More information