CS 537: Introduction to Operating Systems, Fall 2015: Midterm Exam #4, Tuesday, December 15th, 11:00-12:15. Advanced Topics: Distributed File Systems
SOLUTIONS

This exam is closed book, closed notes. All cell phones must be turned off. No calculators may be used. You have 1 hour and 15 minutes to complete this exam. The exam has a total of 64 questions.

Write all of your answers on the accu-scan form with a #2 pencil. These exam questions must be returned at the end of the exam, but we will not grade anything in this booklet. Please fill in the accu-scan form with your Last Name, First Name and Student Identification Number; remember to fill in the corresponding bubbles as well.

This exam has multiple versions. To make sure you are graded with the correct answer key, you must identify this exam version with a Special Code in Column A on your accu-tron sheet. Be sure to fill in the corresponding bubble as well. Your special code is shown on the next page. Good luck!
Your special code is: 1.

Part 1: Which Communication Protocol is this? [3 points each]

Questions on exam #2. For each of the following questions, designate if the statement is True for (a) only the UDP protocol (b) only the TCP protocol (c) both UDP and TCP (d) neither UDP nor TCP.

1. If using this protocol, a message may be lost between the sending process and the receiving process and as a result the receiving process may never receive that message.
Answer: a. The UDP protocol does not do anything to resend lost messages, but TCP will resend until an ACK is received.

2. If using this protocol, a message may be corrupted between the sending process and the receiving process and as a result the receiving process may receive a corrupted message.
Answer: d. Neither protocol will allow a corrupted message to be delivered to a receiving process; both protocols use checksums to ensure that corrupted messages are not delivered to the receiving process.

3. With this protocol, if a sender does not receive an acknowledgement for a message, it means the receiver did not receive the intended message.
Answer: d. UDP does not use acknowledgements; although TCP does use acknowledgements, not receiving one could mean that the ACK was dropped and the receiver did still receive the intended message.

4. With this protocol, if a sender does not receive an acknowledgement for a message, the sender will resend that message.
Answer: b. TCP resends messages if an ACK is not received.

5. With this protocol, the receiver is able to discard duplicate messages by tracking the checksum for each previously received message.
Question discarded.
The question meant to ask whether this is what either protocol does (answer: neither does), but the words "is able" in the question made it sound as if the question was asking what could be possible instead.

6. RPC can be implemented on top of this protocol.
Answer: c. RPC can be built on top of either protocol. If it is built on UDP, the RPC protocol must do the extra work to ensure it discards duplicates and resends lost messages. If it is built on TCP,
there is some inefficiency because TCP sends ACKs back even though RPC is going to send back a reply very soon. Both approaches have their pros and cons.

Part 2: Which Distributed File System is this? [3 points each]

Questions on exam #2. For each of the following questions, designate if the statement is True for (a) only NFS (b) only AFS (c) both NFS and AFS (d) neither NFS nor AFS. Assume NFS refers to NFS version 2 (or NFSv2), which is the NFS protocol we discussed in lecture.

7. With this file system, clients can access files either in their local file system or on a remote server.
Answer: c. Both file systems work with the mount protocol so that clients can transparently access files in their directory namespace that might be located in different file systems, whether a local file system (e.g., ext3) or a distributed file system using a remote server (e.g., NFS or AFS).

8. Files and directories may be easily moved from one server to another.
Answer: b. AFS allows volumes (subtrees of the directory space) to be moved to a new server (e.g., for better load balancing across servers).

9. Servers in this file system are stateless.
Answer: a. NFS servers do not remember any state about the particular clients that are interacting with them or about on-going operations. AFS does track state: callbacks for clients that currently have each file open.

10. Requires that clients fetch the entire file when they open a file (unless they already have that file cached locally).
Answer: b. NFS does not mandate any reading or prefetching when a file is opened; AFS does.

11. Requires that clients push their write operations immediately to the server.
Answer: d. Neither file system requires clients to push their write operations immediately; both require writes to be pushed out when the file is closed.

12. Writes to one file from one client can be intermingled with writes to that same file from another client.
Answer: a.
If two clients A and B both write to two blocks 0 and 1 from the same file, with NFS it is possible for client A's block 0 and client B's block 1 to end up being the final version of that file; AFS guarantees that the server will store either A's two blocks or B's two blocks.

13. Clients are guaranteed to read and write to the same version of a file for the entire time between when that client opens and closes that file.
Answer: b. NFS will re-read blocks of a file after that file has been opened if the file is changed on the server (and the attribute cache has expired); AFS does not re-read blocks of a file after that file is opened.

14. Increasing the time between getattr (or STAT) calls to the server increases the scalability of the system.
Answer: a. NFS uses getattr calls; if the time between getattr calls is too short, the server is flooded with these getattr requests and becomes a bottleneck for the entire distributed system (limiting the number of clients that can use one server); by increasing the time between requests, the server can handle more clients.

15. If the server does not contain sufficient memory to track the callbacks for a particular file, the server can simply discard (i.e., forget) those callbacks.
Answer: d. Although AFS does use callbacks, the server cannot just throw callbacks away; if it needs to discard a callback for any reason, the server needs to break that callback and notify the client so the client will not assume it can trust its cached contents later.
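The scenario in the answer to question 12 can be sketched as a toy model. Everything below is a hypothetical simplification for illustration (the function names, the two-block file, and the arrival orders are assumptions), not the actual NFS or AFS protocol: the NFS-style server applies block writes one at a time as they arrive, while the AFS-style server replaces the whole file each time a client closes it.

```python
# Toy model of question 12: two clients each write blocks 0 and 1 of one file.
# All names here are hypothetical; this is not the real NFS or AFS protocol.

def nfs_style(flush_order):
    """Server applies block writes one at a time, in the order they arrive."""
    server = {0: "old", 1: "old"}
    for client, block in flush_order:
        server[block] = client           # each block write lands independently
    return server

def afs_style(close_order):
    """Server replaces the whole file each time a client closes it."""
    server = {0: "old", 1: "old"}
    for client in close_order:
        server = {0: client, 1: client}  # whole-file store: last closer wins
    return server

# NFS: B's block 0 arrives, then A's block 0, then A's block 1, then B's block 1.
mixed = nfs_style([("B", 0), ("A", 0), ("A", 1), ("B", 1)])
print(mixed)   # {0: 'A', 1: 'B'}

# AFS: whichever client closes last supplies both blocks.
whole = afs_style(["A", "B"])
print(whole)   # {0: 'B', 1: 'B'}
```

With the block-granularity model, the final file mixes A's block 0 with B's block 1; with the whole-file model, the file always belongs entirely to one client.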
Part 3: True/False for MapReduce and GFS [3 points each]

Questions 1-23 on exam #2. Designate if the statement is True (a) or False (b). For the GFS file system, assume a replication level of 3 is always used.

16. An idempotent operation has the same result whether it is executed zero, one, or more times.
Answer: false. Idempotent operations give the same result when executed one or more times.

17. With the MapReduce programming framework, each of the M mappers is responsible for reading sequentially through 1/M-th of the amount of input data.
Answer: true. Each mapper is assigned a sequential contiguous portion of the input data (hopefully a chunk in GFS).

18. With MapReduce, each of the M mappers uses RPC to directly send partitioned data to each of the R reducers.
Answer: false; each of the R reducers READs its portion of the data from all of the mappers that generated the data (the intermediate generated data may be stored on the mapper's local disk).

19. With MapReduce, each of the R reducers is responsible for writing sequentially to 1/R-th of the amount of output data.
Answer: false; different reducers may write much more or much less than 1/R-th of the output data; for example, one reducer may be responsible for records matching WI and another for records matching MN, and there may be many more WI than MN records.

20. One worker machine may be responsible for running multiple mappers within a single MapReduce job.
Answer: true. In most cases, there will be many more mappers than worker machines; when a worker finishes one mapper, it moves on to executing a different mapper; this improves load balancing (since some mappers may take longer to run than others).

21. GFS is used to store input, intermediate, and output files for MapReduce jobs.
Answer: false. GFS is only used for input and output files; intermediate files are just stored on the local disk.

22. Each mapper function must perform network I/O in order to read its input data.
Answer: false.
Ideally, a mapper is run on a machine storing one of the replicas of its input data (stored in GFS).

23. Each reduce function must perform network I/O in order to read its input data.
Answer: true; each reducer must contact every machine that ran a mapper to fetch the appropriate data for this reducer.
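The data flow described in questions 18, 19, and 23 can be sketched in a few lines. This is a minimal single-process illustration under stated assumptions: the function names, the value of R, the byte-sum partition function (a stand-in for a hash of the key modulo R), and the WI/MN records are all hypothetical. Each mapper buckets its output into R partitions that stay local, and each reducer pulls its own partition from every mapper.

```python
from collections import defaultdict

R = 2  # number of reducers (hypothetical value for this sketch)

def partition(key):
    """Pick the reducer for a key; a stand-in for hash(key) mod R.
    A byte sum is used so the result is the same on every run."""
    return sum(key.encode()) % R

def run_mapper(records):
    """Map phase: emit (state, 1) pairs, bucketed into R local partitions."""
    buckets = [[] for _ in range(R)]
    for state in records:
        buckets[partition(state)].append((state, 1))
    return buckets

def run_reducer(r, mapper_outputs):
    """Reduce phase: reducer r pulls partition r from every mapper, then sums."""
    counts = defaultdict(int)
    for buckets in mapper_outputs:        # one fetch per mapper
        for key, value in buckets[r]:
            counts[key] += value
    return dict(counts)

splits = [["WI", "WI", "MN"], ["WI", "WI", "WI"]]   # M = 2 input splits
mapper_outputs = [run_mapper(s) for s in splits]
results = [run_reducer(r, mapper_outputs) for r in range(R)]
print(results)   # [{'WI': 5}, {'MN': 1}]
```

The reducer that owns WI handles five records while the reducer that owns MN handles one, which is the imbalance question 19 refers to.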
24. Each reduce function must perform network I/O in order to write its output data.
Answer: true; with 3 replicas in GFS, at least two of the replicas will be on other machines.

25. Increasing the value of M increases the amount of parallelism in the MapReduce job.
Answer: true; as M increases, there are more mappers, which can each be run in parallel.

26. If a reducer fails, that reducer is run again, as are the mappers responsible for generating the input for that reducer.
Answer: false; if a reducer fails, the input to the reducer still exists (on the machines that ran the mappers) and the new reducer can simply refetch that input.

27. If a mapper runs slowly, the master can start a duplicate mapper and still obtain correct results.
Answer: true. The mapping function is idempotent; re-run it and it will generate the same results.

28. If a reducer runs slowly, the master can start a duplicate reducer and still obtain correct results.
Answer: true. The reduce function is idempotent; re-run it and it will generate the same results.

29. MapReduce and other GFS-specific workloads often overwrite previously written data.
Answer: false; the workloads append a lot and very rarely overwrite older data.

30. MapReduce workloads require high I/O throughput.
Answer: true; the workloads don't require low latency, but they do need to read and write a lot of data in each unit of time.

31. In GFS, the scalability of the system is likely to increase as the size of the chunk is increased.
Answer: true; as the chunk size is increased, the master has fewer entries to track and thus it can handle more files.

32. In GFS, clients must communicate with the master every time they read or write to a file.
Answer: false; clients communicate with the master only when they open a file; at that point, the client has obtained all the meta-data for the file (i.e., the chunk ids and the chunk servers for those chunks) and the client can thereafter communicate directly with the chunk servers.

33. In GFS, clients may only read from the primary replica.
Answer: false; clients can read from any of the replicas.

34. In GFS, the master is responsible for mapping file names to logical chunk numbers and their chunk servers.
Question discarded; although the master does track all of this information, the word "responsible" in the question could imply that the master has the most accurate knowledge of all of this; however, only the chunk servers themselves know for certain which chunks they hold.

35. In GFS, the master persists updates to the file name space to the disk on other backup masters.
Answer: true; the mapping between file names and chunk ids is updated persistently so that another master can take over if this one dies.

36. In GFS, the master persists updates to the chunk map to the disk on other backup masters.
Answer: false; this information is kept only in memory; if the master dies, this information is recreated by asking each chunk server for the valid chunk ids that it hosts.

37. In GFS, the master is responsible for determining the order in which different clients update replicas.
Answer: false; the primary replica determines this order (so that the master is not a bottleneck).

38. In GFS, it is possible that only a subset of a chunk's replicas successfully performs an append operation.
Question discarded, because we didn't cover this in lecture thoroughly enough.
Answer: true; the primary tells the secondary replicas the order in which appends from different clients should be performed, but it is possible that some of those replicas will run into problems and the primary is not responsible for ensuring each replica successfully performed the append; the primary simply notifies the client of these errors and the client is responsible for retrying the append.
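The split described in questions 35 and 36 (name space persisted, chunk locations rebuilt from chunk server reports) can be illustrated with a small sketch. The class names, method names, file name, and chunk ids below are hypothetical and chosen only for this example; this is not the GFS implementation. A replacement master is assumed to start from the persisted file-name-to-chunk-id mapping and then ask each chunk server which chunks it holds.

```python
class ChunkServer:
    """Hypothetical chunk server: the authority on which chunks it stores."""
    def __init__(self, name, chunk_ids):
        self.name = name
        self.chunk_ids = set(chunk_ids)

    def report_chunks(self):
        return set(self.chunk_ids)

class Master:
    """Hypothetical master: persistent name space, in-memory location map."""
    def __init__(self, namespace_log):
        # file name -> list of chunk ids; assumed replayed from a persistent log
        self.namespace = dict(namespace_log)
        # chunk id -> set of chunk server names; kept in memory only
        self.locations = {}

    def rebuild_locations(self, chunkservers):
        """Recreate the chunk map by polling every chunk server."""
        self.locations = {}
        for cs in chunkservers:
            for cid in cs.report_chunks():
                self.locations.setdefault(cid, set()).add(cs.name)

    def lookup(self, filename):
        """Return (chunk id, sorted server names) for each chunk of a file."""
        return [(cid, sorted(self.locations.get(cid, ())))
                for cid in self.namespace[filename]]

# A backup master takes over: the name space survives, locations do not.
log = {"/logs/a": ["c1", "c2"]}
servers = [ChunkServer("s1", ["c1", "c2"]), ChunkServer("s2", ["c1"]),
           ChunkServer("s3", ["c1", "c2"]), ChunkServer("s4", ["c2"])]
new_master = Master(log)
new_master.rebuild_locations(servers)
print(new_master.lookup("/logs/a"))
# [('c1', ['s1', 's2', 's3']), ('c2', ['s1', 's3', 's4'])]
```

Each chunk ends up with three known replicas, matching the replication level of 3 assumed in this part of the exam.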
Part 4: Distributed File System Consistency [2 points each]

The next questions explore the cache consistency behavior of AFS and NFS. The two traces on the next two pages are identical; you should use one trace for answering how AFS behaves and one trace for answering how NFS behaves. Each trace contains three clients that each generate file opens, reads, writes, and closes on a single file a. The leftmost column shows the server; the next 3 columns show the actions being taken on each of the three clients, c0, c1, and c2. The content of the file is always just a single number. Time increases downwards, with at least 5 seconds between each operation.

Opening a file returns a file descriptor, which is the first argument to each call of read, write, and close (i.e., read:fd, write:fd, and close:fd). The write call also designates the new value to be written (i.e., write:fd -> newvalue).

There are two types of questions for you to answer. Questions of the format "read:fd -> value?" on c0, c1, and c2 ask you to determine the value that will be read on that client. Questions of the format "file:a contains:?" on the server ask you to determine the value on the server at that point in time. You may find it useful to record the contents of the file on the server at every interesting point in time. For each protocol, assume that the server and clients have sufficient memory such that no operations are performed unless required by the protocol.

For questions Q39-Q44, your choices are: a) 0 b) 1 c) 2 d) 3
For questions Q45-Q51, your choices are: a) 2 b) 3 c) 4 d) 5
For questions Q52-Q57, your choices are: a) 0 b) 1 c) 2 d) 3
For questions Q58-Q64, your choices are: a) 2 b) 3 c) 4 d) 5
Questions 39-51: AFS Protocol

Remember, for questions Q39-Q44, your choices are: a) 0 b) 1 c) 2 d) 3
For questions Q45-Q51, your choices are: a) 2 b) 3 c) 4 d) 5

Determine the results if this workload is run on top of the AFS distributed file system.

Columns: Server, c0, c1, c2

file:a contains:0
open:a [fd:0] 0 until close
open:a [fd:0] 0 until
open:a [fd:0] 0 until close
read:0 -> (Q39) Answer: 0 (a)
write:0 -> 1
read:0 -> (Q40) Answer: 0 (a)
write:0 -> 2
read:0 -> (Q41) Answer: 0 (a)
write:0 -> 3
file:a contains:? (Q42) Answer: 0 (a)
file:a contains:? (Q43) Answer: 1 (b)
close:0 flushes 2
close:0 flushes 1
open:a [fd:1] 2 until
read:1 -> (Q44) Answer: 2 (c)
write:1 -> 4
close:0 flushes 3
open:a [fd:1] 3 until close
open:a [fd:1] 3 until close
close:1 flushes 4
file:a contains:? (Q45) (Note: Switch to different multiple choice options) Answer: 4 (c)
read:1 -> value? (Q46) Answer: 3 (b)
open:a [fd:2] 4
close:1
read:1 -> value? (Q47) Answer: 3 (b)
write:1 -> 5
open:a [fd:2] 4 until close
read:2 -> value? (Q49) Answer: 4 (c)
read:2 -> (Q48) Answer: 4 (c)
close:2
close:1 flushes 5
read:2 -> value? (Q50) Answer: 4 (c)
close:2
file:a contains:? (Q51) Answer: 5 (d)
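The AFS behavior used in these answers (a client keeps the version it fetched at open until it closes, dirty data is flushed on close, and the server then breaks other clients' callbacks) can be sketched as a small simulation. The classes below are a hypothetical simplification written for this illustration, with one file holding a single number as in the trace; the short two-client sequence at the end is an invented example, not the exam trace itself.

```python
class AFSServer:
    """Hypothetical server for one file whose content is a single number."""
    def __init__(self, value):
        self.value = value
        self.callbacks = set()         # clients promised a notice on change

    def fetch(self, client):
        self.callbacks.add(client)
        return self.value

    def store(self, client, value):
        self.value = value
        for other in self.callbacks - {client}:
            other.cache_valid = False  # break the callback
        self.callbacks = {client}

class AFSClient:
    """Hypothetical client: whole-file cache, snapshot per open, flush on close."""
    def __init__(self, server):
        self.server = server
        self.cached = None
        self.cache_valid = False
        self.open_files = {}           # fd -> [value, dirty]
        self.next_fd = 0

    def open(self):
        if not self.cache_valid:       # fetch only if the callback was broken
            self.cached = self.server.fetch(self)
            self.cache_valid = True
        fd = self.next_fd
        self.next_fd += 1
        self.open_files[fd] = [self.cached, False]
        return fd

    def read(self, fd):
        return self.open_files[fd][0]

    def write(self, fd, value):
        self.open_files[fd] = [value, True]

    def close(self, fd):
        value, dirty = self.open_files.pop(fd)
        if dirty:                      # flush the whole file on close
            self.cached = value
            self.server.store(self, value)
            self.cache_valid = True

server = AFSServer(0)
c0, c1 = AFSClient(server), AFSClient(server)
fd_a = c0.open()                       # c0 sees 0 until close
fd_b = c1.open()                       # c1 sees 0 until close
c0.write(fd_a, 1)                      # stays local until close
before_close = (c1.read(fd_b), server.value)
c0.close(fd_a)                         # flushes 1; c1's callback is broken
after_close = (c1.read(fd_b), server.value)
c1.close(fd_b)
fd_c = c1.open()                       # refetch, since the callback was broken
reopened = c1.read(fd_c)
print(before_close, after_close, reopened)   # (0, 0) (0, 1) 1
```

The second client keeps reading 0 even after the server holds 1, and only sees 1 after it closes and reopens the file, which is the pattern behind answers such as Q46 and Q47.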
Questions 52-64: NFS Protocol

For questions Q52-Q57, your choices are: a) 0 b) 1 c) 2 d) 3
For questions Q58-Q64, your choices are: a) 2 b) 3 c) 4 d) 5

Determine the results if this workload is run on top of the NFS distributed file system.

Columns: Server, c0, c1, c2

file:a contains:0
open:a [fd:0]
open:a [fd:0]
open:a [fd:0]
read:0 -> (Q52) Answer: 0 (a)
write:0 -> 1
read:0 -> (Q53) Answer: 0 (a)
write:0 -> 2
read:0 -> (Q54) Answer: 0 (a)
write:0 -> 3
file:a contains:? (Q55) Answer: 0 (a)
close:0 flushes 1
file:a contains:? (Q56) Answer: 1 (b)
close:0 flushes 2
open:a [fd:1]
read:1 -> (Q57) Answer: 2 (c)
write:1 -> 4
close:0 flushes 3
open:a [fd:1]
open:a [fd:1]
close:1 flushes 4
file:a contains:? (Q58) (Note: Switch to different multiple choice) Answer: 4 (c)
read:1 -> value? (Q59) Answer: 4 (c)
open:a [fd:2]
close:1
read:1 -> (Q60) Answer: 4 (c)
write:1 -> 5
open:a [fd:2]
read:2 -> (Q62) Answer: 4 (c)
read:2 -> (Q61) Answer: 4 (c)
close:2
close:1 flushes 5
read:2 -> value? (Q63) Answer: 5 (d)
close:2
file:a contains:? (Q64) Answer: 5 (d)

Congratulations on learning a lot about Operating Systems this semester! Please double check that you specified your Special Code in Column A on your accu-tron sheet.
OPERATING SYSTEM Chapter 12: File System Implementation Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management
Computer Science 3CN3 and Software Engineering 4C03 Final Exam Answer Key
Computer Science 3CN3 and Software Engineering 4C03 Final Exam Answer Key DAY CLASS Dr. William M. Farmer DURATION OF EXAMINATION: 2 Hours MCMASTER UNIVERSITY FINAL EXAMINATION April 2008 THIS EXAMINATION
Chapter 11: Implementing File Systems
Chapter 11: Implementing File Systems Operating System Concepts 99h Edition DM510-14 Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory Implementation Allocation
Distributing Software in a Massively Parallel Environment
Distributing Software in a Massively Parallel Environment LISA 2014 Dinah McNutt Release Engineer, Google, Inc. November 12, 2014 Problem: Reliably and consistently distributing software in a Laaaaaaaaaaaarge
Internet Technology 3/2/2016
Question 1 Defend or contradict this statement: for maximum efficiency, at the expense of reliability, an application should bypass TCP or UDP and use IP directly for communication. Internet Technology
Announcements. Reading: Chapter 16 Project #5 Due on Friday at 6:00 PM. CMSC 412 S10 (lect 24) copyright Jeffrey K.
Announcements Reading: Chapter 16 Project #5 Due on Friday at 6:00 PM 1 Distributed Systems Provide: access to remote resources security location independence load balancing Basic Services: remote login
Efficiency. Efficiency: Indexing. Indexing. Efficiency Techniques. Inverted Index. Inverted Index (COSC 488)
Efficiency Efficiency: Indexing (COSC 488) Nazli Goharian nazli@cs.georgetown.edu Difficult to analyze sequential IR algorithms: data and query dependency (query selectivity). O(q(cf max )) -- high estimate-
COMP/ELEC 429/556 Introduction to Computer Networks
COMP/ELEC 429/556 Introduction to Computer Networks The TCP Protocol Some slides used with permissions from Edward W. Knightly, T. S. Eugene Ng, Ion Stoica, Hui Zhang T. S. Eugene Ng eugeneng at cs.rice.edu
Introduction to Cloud Computing
Introduction to Cloud Computing Distributed File Systems 15 319, spring 2010 12 th Lecture, Feb 18 th Majd F. Sakr Lecture Motivation Quick Refresher on Files and File Systems Understand the importance
Unit 2.
Unit 2 Unit 2 Topics Covered: 1. PROCESS-TO-PROCESS DELIVERY 1. Client-Server 2. Addressing 2. IANA Ranges 3. Socket Addresses 4. Multiplexing and Demultiplexing 5. Connectionless Versus Connection-Oriented
Distributed Systems. 05r. Case study: Google Cluster Architecture. Paul Krzyzanowski. Rutgers University. Fall 2016
Distributed Systems 05r. Case study: Google Cluster Architecture Paul Krzyzanowski Rutgers University Fall 2016 1 A note about relevancy This describes the Google search cluster architecture in the mid
Operating Systems CMPSC 473 File System Implementation April 10, Lecture 21 Instructor: Trent Jaeger
Operating Systems CMPSC 473 File System Implementation April 10, 2008 - Lecture 21 Instructor: Trent Jaeger Last class: File System Implementation Basics Today: File System Implementation Optimizations
Managing Firewall Services
CHAPTER 11 Firewall Services manages firewall-related policies in Security Manager that apply to the Adaptive Security Appliance (ASA), PIX Firewall (PIX), Catalyst Firewall Services Module (FWSM), and
Applications of Paxos Algorithm
Applications of Paxos Algorithm Gurkan Solmaz COP 6938 - Cloud Computing - Fall 2012 Department of Electrical Engineering and Computer Science University of Central Florida - Orlando, FL Oct 15, 2012 1
DISTRIBUTED COMPUTER SYSTEMS
DISTRIBUTED COMPUTER SYSTEMS Communication Fundamental REMOTE PROCEDURE CALL Dr. Jack Lange Computer Science Department University of Pittsburgh Fall 2015 Outline Communication Architecture Fundamentals
Distributed Systems 8L for Part IB. Additional Material (Case Studies) Dr. Steven Hand
Distributed Systems 8L for Part IB Additional Material (Case Studies) Dr. Steven Hand 1 Introduction The Distributed Systems course covers a wide range of topics in a variety of areas This handout includes
CS5412: OTHER DATA CENTER SERVICES
1 CS5412: OTHER DATA CENTER SERVICES Lecture V Ken Birman Tier two and Inner Tiers 2 If tier one faces the user and constructs responses, what lives in tier two? Caching services are very common (many
DFS Case Studies, Part 1
DFS Case Studies, Part 1 An abstract "ideal" model and Sun's NFS An Abstract Model File Service Architecture an abstract architectural model that is designed to enable a stateless implementation of the
Chapter 1: Distributed Systems: What is a distributed system? Fall 2013
Chapter 1: Distributed Systems: What is a distributed system? Fall 2013 Course Goals and Content n Distributed systems and their: n Basic concepts n Main issues, problems, and solutions n Structured and
Improving I/O Bandwidth With Cray DVS Client-Side Caching
Improving I/O Bandwidth With Cray DVS Client-Side Caching Bryce Hicks Cray Inc. Bloomington, MN USA bryceh@cray.com Abstract Cray s Data Virtualization Service, DVS, is an I/O forwarder providing access
File system internals Tanenbaum, Chapter 4. COMP3231 Operating Systems
File system internals Tanenbaum, Chapter 4 COMP3231 Operating Systems Architecture of the OS storage stack Application File system: Hides physical location of data on the disk Exposes: directory hierarchy,
CPS 512 midterm exam #1, 10/7/2016
CPS 512 midterm exam #1, 10/7/2016 Your name please: NetID: Answer all questions. Please attempt to confine your answers to the boxes provided. If you don t know the answer to a question, then just say
Protocol Overview. TCP/IP Performance. Connection Types in TCP/IP. Resource Management. Router Queues. Control Mechanisms ITL
Protocol Overview TCP/IP Performance E-Mail HTTP (WWW) Remote Login File Transfer TCP UDP ITL IP ICMP ARP RARP (Auxiliary Services) ATM Ethernet, X.25, HDLC etc. 2/13/06 Hans Kruse & Shawn Ostermann, Ohio
COMPARATIVE STUDY OF TWO MODERN FILE SYSTEMS: NTFS AND HFS+
COMPARATIVE STUDY OF TWO MODERN FILE SYSTEMS: NTFS AND HFS+ Viral H. Panchal 1, Brijal Panchal 2, Heta K. Desai 3 Asst. professor, Computer Engg., S.N.P.I.T&RC, Umrakh, Gujarat, India 1 Student, Science
ECS High Availability Design
ECS High Availability Design March 2018 A Dell EMC white paper Revisions Date Mar 2018 Aug 2017 July 2017 Description Version 1.2 - Updated to include ECS version 3.2 content Version 1.1 - Updated to include
TECHNISCHE UNIVERSITEIT EINDHOVEN Faculteit Wiskunde en Informatica
TECHNISCHE UNIVERSITEIT EINDHOVEN Faculteit Wiskunde en Informatica Examination Architecture of Distributed Systems (2IMN10), on Monday November 7, 2016, from 13.30 to 16.30 hours. Before you start, read
Page 1. CS194-3/CS16x Introduction to Systems. Lecture 21. Networking II. Review: Networking Definitions
Review: Networking Definitions CS194-3/CS16x Introduction to Systems Lecture 21 Networking II November 7, 2007 Prof. Anthony D. Joseph http://www.cs.berkeley.edu/~adj/cs16x Network: physical connection
Computer Networks - Midterm
Computer Networks - Midterm October 28, 2016 Duration: 2h15m This is a closed-book exam Please write your answers on these sheets in a readable way, in English or in French You can use extra sheets if
Page 1. Key Value Storage" System Examples" Key Values: Examples "
CS162 Operating Systems and Systems Programming Lecture 15 Key-Value Storage, Network Protocols" October 22, 2012! Ion Stoica! http://inst.eecs.berkeley.edu/~cs162! Key Value Storage" Interface! put(key,
Chapter 12: File System Implementation
Chapter 12: File System Implementation Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency
NWEN 243. Networked Applications. Layer 4 TCP and UDP
NWEN 243 Networked Applications Layer 4 TCP and UDP 1 About the second lecturer Aaron Chen Office: AM405 Phone: 463 5114 Email: aaron.chen@ecs.vuw.ac.nz Transport layer and application layer protocols
File-System Structure
Chapter 12: File System Implementation File System Structure File System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Log-Structured
Final Exam Solutions
Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.824 Spring 2004 Final Exam Solutions The average score was 84, the standard deviation was 6. 1 I Short
CS 470 Spring Distributed Web and File Systems. Mike Lam, Professor. Content taken from the following:
CS 470 Spring 2017 Mike Lam, Professor Distributed Web and File Systems Content taken from the following: "Distributed Systems: Principles and Paradigms" by Andrew S. Tanenbaum and Maarten Van Steen (Chapters
Chapter 18 Distributed Systems and Web Services
Chapter 18 Distributed Systems and Web Services Outline 18.1 Introduction 18.2 Distributed File Systems 18.2.1 Distributed File System Concepts 18.2.2 Network File System (NFS) 18.2.3 Andrew File System
CS 345A Data Mining. MapReduce
CS 345A Data Mining MapReduce Single-node architecture CPU Machine Learning, Statistics Memory Classical Data Mining Dis Commodity Clusters Web data sets can be ery large Tens to hundreds of terabytes