Cloud Computing. Lectures 11, 12 and 13 Cloud Storage


1 Cloud Computing Lectures 11, 12 and 13 Cloud Storage

2 Up until now Introduction Definition of Cloud Computing Grid Computing Content Distribution Networks Cycle-Sharing Distributed Scheduling Map Reduce 2

3 Outline Components of Cloud Platforms Storage Types Storage Products Cloud File Systems Cloud Object Storage 3

4 Components of Cloud Computing Platforms Programming Model How to program an application? How is the platform viewed? Monitoring Execution Model Data Storage Which abstraction is accessible: VM? API? Framework? Which operations can I perform? How are my data stored and accessed? Monitoring: How can I evaluate the state of executions/nodes/data...? 4

5 Major Cloud Platforms Apache Hadoop Amazon Web Services Google App Engine Microsoft Azure OpenStack 5

6 Storage Types A range of search, streaming and indexing variants. File System: Hierarchical organization, files, permission, streaming data,... Object Storage: Direct Program <-> Storage interaction Object ID indexing Tables (no-sql DB): records and tables Search No relational model Relational Databases: Full relational model Conventional services We will see that the categories are becoming blurred... 6

7 Storage Products (i) File System Hadoop File System / Google File System Object/Byte Storage Amazon S3 MS Azure Blobs Table Hadoop HBase / Google Big Table (AppEngine Datastore) Amazon Simple DB MS Azure Tables Hadoop Hive Yahoo PNUTS Relational Databases Amazon RDS SQL Azure 7

8 Cloud File System: HDFS/GFS Distributed File System Reimplementation of the Google File System (GFS). Runs on clusters of generic machines. HDFS is tuned for: Very large files. Streaming access. Generic hardware. Scalability key: data operations don't go through the central server. 8

9 Blocks Blocks simplify space management: allocation, replication, and letting a file grow almost indefinitely. Evolution: Disk blocks: 512 bytes File system blocks: 2, 4, 8 KB HDFS blocks: 64MB To minimize seeks: each block is a contiguous 64MB. A file smaller than one block does not occupy a full block, only the space it needs. 9

10 Namenode Manages the file system namespace: folder hierarchy, name uniqueness, ... Maintains the folder tree and the metadata in 2 files: the namespace image and the edit log. HDFS cannot operate without the namenode. Files can be written, read, renamed and deleted. It is not possible to: Write in the middle of a file. Write concurrently to the same file. Fault tolerance mechanism: atomic replication to another machine. 10

11 Datanode Manages a set of blocks. Processes clients' or the namenode's read/write requests. Periodically notifies the namenode of the blocks it holds. If a block's replication factor drops below a configured value, a new replica is created. 11

12 Permissions Permissions in HDFS are similar to UNIX: user, group and other; read, write and execute. As the user is very often remote, any username from a remote node is trusted. Therefore, protection is weak. Permissions are more geared towards managing a group of users in the cluster. 12

13 Consistency Model Formalization of the visibility of read and write operations: after an operation finishes, who sees what, and when? HDFS model: there are no guarantees that the last block has been written unless sync() is called. 13

14 Error Checking Block integrity is checked using a hashing function (CRC32 checksum). At file creation: the client calculates a checksum for each 512-byte chunk. The datanode stores the checksums. At file access: the client reads the data and the checksums from the datanode. If the check fails, it tries other replicas. Periodically, the datanode checks its blocks' checksums. 14
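The per-chunk checksum scheme above can be sketched as follows. This is a minimal illustration of the idea, not HDFS internals; the class and method names are made up for the example, and java.util.zip.CRC32 stands in for the actual checksum machinery.

```java
import java.util.zip.CRC32;

// Sketch of HDFS-style per-chunk checksumming: the writer computes one
// CRC32 per 512-byte chunk; the reader recomputes and compares.
public class ChunkChecksum {
    static final int CHUNK = 512;

    // Compute one CRC32 per 512-byte chunk of the data.
    public static long[] checksums(byte[] data) {
        int n = (data.length + CHUNK - 1) / CHUNK;
        long[] sums = new long[n];
        for (int i = 0; i < n; i++) {
            CRC32 crc = new CRC32();
            int len = Math.min(CHUNK, data.length - i * CHUNK);
            crc.update(data, i * CHUNK, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // On read: recompute and compare. A mismatch means this replica is
    // corrupt and the client should try another one.
    public static boolean verify(byte[] data, long[] stored) {
        long[] fresh = checksums(data);
        if (fresh.length != stored.length) return false;
        for (int i = 0; i < fresh.length; i++)
            if (fresh[i] != stored[i]) return false;
        return true;
    }
}
```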

15 Reading The client contacts the namenode to get the list of the datanodes with the file's blocks (stored in memory). It receives a FSDataInputStream that transparently chooses the best datanode, opens and closes connections to the datanodes, requests block locations from the namenode, repeats operations if necessary, and logs failed datanodes. 15

16 Reading 16

17 Choosing Nodes: Distance Nodes choose the closest sources of data. Assumes a tree-structured organization. Distance equals the number of hops between the tree nodes. distance(/d1/r1/n1, /d1/r1/n1) = 0 (processes on the same node) distance(/d1/r1/n1, /d1/r1/n2) = 2 (processes on the same rack) distance(/d1/r1/n1, /d1/r2/n3) = 4 (processes on different racks) distance(/d1/r1/n1, /d2/r3/n4) = 6 (processes in different datacentres) 17
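The hop-counting rule above can be implemented in a few lines: walk both /datacentre/rack/node paths down to their closest common ancestor, then add the remaining depths. A minimal sketch (the class name is illustrative, not Hadoop's actual API):

```java
// Distance = number of hops from each node up to the closest common
// ancestor in the /datacentre/rack/node tree.
public class NetworkDistance {
    public static int distance(String a, String b) {
        String[] pa = a.split("/");
        String[] pb = b.split("/");
        int common = 0;
        // Length of the shared path prefix (the common ancestor's depth).
        while (common < pa.length && common < pb.length
                && pa[common].equals(pb[common]))
            common++;
        // Hops up from a plus hops down to b.
        return (pa.length - common) + (pb.length - common);
    }
}
```

Running it on the slide's four cases reproduces the distances 0, 2, 4 and 6.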

18 Distance Between Nodes 18

19 Writing (+ creating) The client asks the namenode to create a new file; the namenode checks permissions and name uniqueness. If it succeeds, the client receives a FSOutputStream. The namenode provides a set of datanodes for replication. Block write requests are kept in a data queue. Unconfirmed block write requests are kept in an ack queue. 19

20 Writing 20

21 Writing In case the datanode fails, the client changes the block id so that the corrupted replica is deleted later. By default, if one of the replicas is successfully written, the writing is considered done. The other replicas are written asynchronously. 21

22 Command Line Tool hadoop fs: -ls, -mkdir, -rm, -rmr, -put, -copyToLocal, -copyFromLocal 22

23 Cloud Object Store: Amazon Simple Storage System (S3) 23

24 S3 Amazon's persistent object storage system. Implementation based on the Dynamo system (SOSP, 2007). Accessible using HTTP through 3 different protocols, e.g. REST and SOAP. 24

25 Dynamo: Intuition CAP Theorem: Consistency, Availability and Partition tolerance - pick two! At Amazon: availability = clients' trust, so it cannot be sacrificed. In large data centres there are going to be frequent faults: the possibility of a partition has to be accounted for. Most data services tolerate small inconsistencies: relaxed consistency ==> eventual consistency. 25

26 Consistency Models Strong Consistency: once a write operation is finished for the requester, any subsequent read will return the value that was written. Weak Consistency: the system does not guarantee that subsequent accesses return the written value. Some condition must be verified for the written value to be returned (a time interval, an access to a synchronization variable, ...). The period between the write finishing and the value becoming visible is called the inconsistency window. Eventual Consistency: the system guarantees that, if there are no more writes, the updates will become visible to all clients (e.g. DNS): a DNS name update is propagated between zones until all clients see the new value. 26

27 Variants of Eventual Consistency Causal Consistency: two causally related writes (A happens before B) cannot lead to B being visible before A. There are no guarantees regarding write operations that are not causally related. Read-your-writes Consistency: every time a process writes a value, all its subsequent reads must reflect that write (a particular case of causal consistency). Session Consistency: a practical implementation of the previous model. All operations are done in the context of a session. During the session, the system guarantees read-your-writes. In the case of certain faults, the session is ended and the read-your-writes guarantee restarts. Monotonic Reads Consistency: if a process has seen a value, subsequent reads will never return a previous value. Monotonic Writes Consistency: the system serializes writes by the same process. Systems that do not guarantee this are very rare. 27

28 Dynamo Assumptions Interaction Model: whole-object reads and writes, keyed by unique IDs. Binary objects of up to 5GB. No operations spanning multiple objects. ACID properties (Atomicity, Consistency, Isolation, Durability): Atomicity/Isolation: whole-object writes. Durability: replicated writes. Only the consistency isn't strong. Efficiency: optimize for the 99.9th percentile. 28

29 Design Decisions Incremental scalability: adding nodes has to be simple. Load balancing and support for heterogeneity: the system must distribute the requests and support nodes with different characteristics. Solution: nodes in a Chord-like DHT. 29

30 Design Decisions Symmetry: All nodes are equally responsible peers. Decentralization: Avoid single points of failure. 30

31 Dynamo: Design Decisions (Problem / Technique / Advantage) Partitioning / consistent hashing / incremental scalability. Write availability / vector clocks with conflict resolution during reads / version size does not depend on the update rate. Temporary faults / sloppy quorum and hinted handoff / high availability and durability. Permanent faults / anti-entropy with Merkle trees / synchronizes replicas asynchronously. Membership and fault detection / gossip-based membership protocol / maintains symmetry and avoids a centralized directory. 31

32 Dynamo: API Two operations: put(key, context, object) key: object ID. context: vector clocks and the object's history. object: data to be written. get(key) 32

33 Partitioning and Replication Uses consistent hashing. Similar to Chord: each node has an id in the key space. Nodes are arranged in a ring. Data are stored in the node with the lowest key that is larger than the object's key. Replication: all objects are replicated in the N nodes that follow the node associated with the object. 33
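The ring placement rule above can be sketched with a sorted map: store each node at its hash, and for a key walk clockwise from the key's hash to collect the coordinator plus its N-1 distinct successors. This is a minimal illustration, not Dynamo's implementation; hash() here is a toy stand-in for the real MD5-based hash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Minimal consistent-hashing ring: an object lives on the first node at
// or after its key on the ring, replicated on the following N-1 nodes.
public class Ring {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    static int hash(String s) { return s.hashCode() & 0x7fffffff; }

    public void addNode(String node) { ring.put(hash(node), node); }

    // The coordinator plus its successors: N replicas for the key.
    public List<String> preferenceList(String key, int n) {
        List<String> nodes = new ArrayList<>();
        // Walk clockwise starting at the key's position...
        for (String node : ring.tailMap(hash(key)).values()) {
            if (nodes.size() == n) break;
            nodes.add(node);
        }
        // ...and wrap around the ring if we ran off the end.
        for (String node : ring.values()) {
            if (nodes.size() == n) break;
            if (!nodes.contains(node)) nodes.add(node);
        }
        return nodes;
    }
}
```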

34 The Chord Ring with Replication 34

35 Virtual Nodes Problem: few nodes or heterogeneous nodes lead to bad load balancing. Dynamo solution: use virtual nodes. Each physical node has several virtual node tickets. More powerful machines can have more tickets. Virtual node tickets are distributed randomly over the ring. 35
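One way to sketch the ticket idea: hash each ticket separately, so a physical node owns as many points on the ring as it has tickets, and more powerful machines cover more of the keyspace. Again illustrative only; the class name and the toy hash are assumptions, not Dynamo code.

```java
import java.util.Map;
import java.util.TreeMap;

// Virtual nodes: every ticket is a separate point on the ring that maps
// back to its physical node, so capacity is proportional to tickets.
public class VirtualRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    static int hash(String s) { return s.hashCode() & 0x7fffffff; }

    public void addNode(String node, int tickets) {
        for (int i = 0; i < tickets; i++)
            ring.put(hash(node + "#" + i), node); // one vnode per ticket
    }

    // Physical node responsible for the key: first vnode at or after it.
    public String nodeFor(String key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue(); // wrap
    }
}
```

Removing a physical node means removing only its tickets, so the keys it held are redistributed among all remaining nodes rather than dumped on one successor.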

36 Data Versions Nodes for writing and reading are selected based on load. So, we have eventual consistency: there may be different versions written on different replicas. Conflict resolution is performed when reading, not when writing. Syntactic reconciliation: some changes can be merged automatically, for formats with clearly identifiable parts and operations (e.g. a mail file). Semantic reconciliation: the user must decide. Divergence is uncommon; over all read operations, 99.94% saw 1 version, and the remaining cases (tiny fractions of a percent) saw 2, 3 or 4 versions. Timeout: after a number of generations without writing, old versions are discarded. 36

37 Vector Clocks (i) Represent time in a distributed system without clock synchronization, replacing physical time with causality. A vector clock is a list of (node, counter) pairs. If every entry of event A's vector clock is less than or equal to the corresponding entry of B's, and at least one is strictly smaller, then A happened before B: there is a causal chain of events from A to B. 37
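The comparison rule above is short enough to code directly: A happened before B iff no counter in A exceeds the corresponding counter in B and at least one is strictly smaller; clocks ordered in neither direction are concurrent, i.e. divergent replicas needing reconciliation. A minimal sketch with illustrative names:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal vector clock: a map node -> counter. Missing entries count as 0.
public class VectorClock {
    final Map<String, Integer> counters = new HashMap<>();

    // A node increments its own entry on every local event.
    public void tick(String node) {
        counters.merge(node, 1, Integer::sum);
    }

    // this happened-before other: all entries <=, at least one strictly <.
    public boolean happenedBefore(VectorClock other) {
        for (Map.Entry<String, Integer> e : counters.entrySet())
            if (e.getValue() > other.counters.getOrDefault(e.getKey(), 0))
                return false;
        boolean strictly = false;
        for (Map.Entry<String, Integer> e : other.counters.entrySet())
            if (counters.getOrDefault(e.getKey(), 0) < e.getValue())
                strictly = true;
        return strictly;
    }

    // Neither ordering holds: the versions are divergent.
    public boolean concurrentWith(VectorClock other) {
        return !happenedBefore(other) && !other.happenedBefore(this);
    }
}
```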

38 Vector Clocks (ii) [figure: vector clock example plotted against real time] 38

39 Object Versions If we assign a vector clock timestamp to all object versions we can detect divergent replicas. Example: X, Y and Z are servers with replicas of object D. D5 is a semantic reconciliation performed by the user. 39

40 Executing get() and put() For good performance, two possibilities: route requests through a load balancer that chooses the node based on load (creates a bottleneck), or use a client-side library to choose the node where to send the request, which becomes the coordinator (requires recompiling the client; probably irrelevant in AWS). The coordinator then executes the quorum reads or writes. 40

41 Read/Write Operations Dynamo supports writing and reading using a quorum model. This allows not waiting for all replicas on every operation. Consider R and W the numbers of read and write replicas that must synchronously take part in an operation. If R + W > N we have a quorum-based system: the set of replicas used for writing always overlaps with the set of read replicas, so it is impossible to read an object without seeing the latest written version. Latency is determined by the slowest node in the R (or W) set; therefore, to improve performance, one lowers R or W. 41
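The overlap claim above is just the pigeonhole principle: if R + W > N, an R-subset and a W-subset of N replicas cannot be disjoint. The sketch below checks the condition and, for small N, verifies it by brute force over all subsets (encoded as bitmasks); names are illustrative.

```java
public class Quorum {
    // R + W > N guarantees every read set intersects every write set.
    public static boolean overlapGuaranteed(int n, int r, int w) {
        return r + w > n;
    }

    // Brute-force check for small N: do all R-subsets and all W-subsets
    // of {0..n-1} intersect? Subsets are encoded as bitmasks.
    public static boolean allSetsIntersect(int n, int r, int w) {
        for (int a = 0; a < (1 << n); a++) {
            if (Integer.bitCount(a) != r) continue;
            for (int b = 0; b < (1 << n); b++) {
                if (Integer.bitCount(b) != w) continue;
                if ((a & b) == 0) return false; // disjoint read/write sets
            }
        }
        return true;
    }
}
```

For example, with N = 3, R = 2, W = 2 every pair of sets overlaps, while R = 1, W = 2 admits disjoint sets, matching the R + W > N test.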

42 Sloppy Quorum To ensure availability, Dynamo uses a sloppy quorum. Each data item is associated with a preference list of nodes spanning multiple machines and data centres. Operations are performed not necessarily on the N primary replicas but on the first N healthy nodes of the preference list. 42

43 Tolerating Temporary Faults: Hinted Handoff Assuming N = 3. If A is unavailable or fails when we write, send a replica to D. D marks the replica as temporary and returns the data to A as soon as it recovers. Replicas are chosen from a preference list of nodes. Preference lists always span multiple datacenters for fault tolerance. 43

44 Membership and Fault Detection Ring membership: at startup, use an external entry point to avoid partitioned rings. Gossip asynchronously to update the DHT: exchange membership lists with a random node every 2 seconds. Fault detection: faults are detected by neighbours using periodic messages with a timeout on the reply. 44

45 Permanent Faults When a hinted replica (one holding writes belonging to another replica) is considered failed: data is synchronized with the new replica using Merkle trees. 45

46 Merkle Trees Accelerate synchronization between nodes by comparing trees of hashes. Each tree node holds a hash of its children. This makes it very easy to identify what needs to be exchanged. The update can be asynchronous: an out-of-date tree is not serious. 46

47 Merkle Trees: Dynamo Each node has a set of keys. All objects are leaves of the Merkle tree. Replicas periodically exchange the top of the Merkle tree. If it is different, they recursively exchange the hashes of lower nodes. 47
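The tree construction can be sketched as follows: hash the data blocks at the leaves, then hash pairs of child hashes upwards until a single root remains. Equal roots mean the replicas agree; differing roots tell them to descend. This is a toy illustration, assuming a power-of-two number of blocks and a stand-in hash instead of a cryptographic one.

```java
import java.util.Arrays;

// Merkle-tree sketch: leaves hash the data blocks, each inner node hashes
// its two children, and tree comparison starts at the root.
public class Merkle {
    // Toy combiner, standing in for a cryptographic hash.
    static int hash(int a, int b) { return 31 * a + b; }

    static int leafHash(byte[] block) { return Arrays.hashCode(block); }

    // Build the tree bottom-up and return the root hash.
    // Assumes blocks.length is a power of two.
    public static int root(byte[][] blocks) {
        int[] level = new int[blocks.length];
        for (int i = 0; i < blocks.length; i++)
            level[i] = leafHash(blocks[i]);
        while (level.length > 1) {
            int[] up = new int[level.length / 2];
            for (int i = 0; i < up.length; i++)
                up[i] = hash(level[2 * i], level[2 * i + 1]);
            level = up;
        }
        return level[0];
    }
}
```

Comparing roots costs one exchange regardless of how many blocks the replicas hold; only differing subtrees are walked further.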

48 Back to S3 Additional issues when compared to Dynamo: access to S3 is controlled by an ACL based on the client's AWS identity, checked with their secret key. Occasionally, some S3 calls fail and must be repeated; programs accessing S3 should take this into account. S3 replication is performed between data centres; this large-scale replication has some lag. 48
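The "repeat failed calls" advice above is usually implemented as a retry wrapper with backoff. A generic sketch, not AWS SDK code; the class name and the backoff constants are arbitrary choices for the example, and it assumes at least one attempt.

```java
import java.util.concurrent.Callable;

// Retry wrapper for flaky remote calls such as S3 requests: try the call,
// and on failure wait with exponential backoff before trying again.
public class Retry {
    public static <T> T withRetries(Callable<T> call, int maxAttempts)
            throws Exception {
        long backoffMs = 100;            // arbitrary starting backoff
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;                // assume the failure is transient
                Thread.sleep(backoffMs);
                backoffMs *= 2;          // exponential backoff
            }
        }
        throw last;                      // all attempts failed: give up
    }
}
```

Any S3 operation (a GET, a PUT) would then be passed in as the Callable instead of being invoked directly.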

49 Service Level Agreements Hosting contracts and cloud platforms, like S3, include SLAs. Very often these are described as averages, medians and/or variances of response times: extreme cases are always problematic. Amazon optimizes for 99.9% of the requests. Example: 300ms response time for 99.9% of the requests, below a peak request rate of 500 requests per second. 49

50 Buckets and Objects S3 data are stored as Dynamo objects. Operations on objects are: PUT, GET, DELETE, HEAD (get metadata) Objects can be grouped in buckets. Buckets are used for delimiting namespaces:

51 S3: REST GET
Sample Request:
GET /my-image.jpg HTTP/1.1
Host: bucket.s3.amazonaws.com
Date: Wed, 28 Oct :32:00 GMT
Authorization: AWS 02236Q3V0WHVSRW0EXG2:0RQf4/cRonhpaBX5sCYVf1bNRuU=
(See developer-guide/restauthentication.html)
Sample Response:
HTTP/1.1 200 OK
x-amz-id-2: eftixk72ad6ap51tnqcof8efidjg9z/2mkidfu8yu9as1ed4opiszj7udnehgran
x-amz-request-id: 318BC8BC148832E5
Date: Wed, 28 Oct :32:00 GMT
Last-Modified: Wed, 12 Oct :50:00 GMT
ETag: "fba9dede5f27731c a "
Content-Length:
Content-Type: text/plain
Connection: close
Server: AmazonS3
[ bytes of object data] 51

52 S3: REST PUT
Sample Request:
PUT /my-image.jpg HTTP/1.1
Host: mybucket.s3.amazonaws.com
Date: Wed, 12 Oct :50:00 GMT
Authorization: AWS 15B4D3461F A:xQE0diMbLRepdf3YB+FIEXAMPLE=
Content-Type: text/plain
Content-Length: 11434
Expect: 100-continue
[11434 bytes of object data]
Sample Response:
HTTP/1.1 100 Continue
HTTP/1.1 200 OK
x-amz-id-2: LriYPLdmOdAiIfgSm/F1YsViT1LW94/xUQxMsF7xiEb1a0wiIOIxl+zbwZ163pt7
x-amz-request-id: 0A49CE EAC
x-amz-version-id: 43jfkodU8493jnFJD9fjj3HHNVfdsQUIFDNsidf038jfdsjGFDSIRp
Date: Wed, 12 Oct :50:00 GMT
ETag: "fbacf535f27731c a "
Content-Length: 0
Connection: close
Server: AmazonS3 52

53 S3: REST in Java
public void createBucket() throws Exception {
    // S3 timestamp pattern.
    String fmt = "EEE, dd MMM yyyy HH:mm:ss ";
    SimpleDateFormat df = new SimpleDateFormat(fmt, Locale.US);
    df.setTimeZone(TimeZone.getTimeZone("GMT"));
    // Data needed for signature
    String method = "PUT";
    String contentMD5 = "";
    String contentType = "";
    String date = df.format(new Date()) + "GMT";
    String bucket = "/onjava";
    // Generate signature
    StringBuffer buf = new StringBuffer();
    buf.append(method).append("\n");
    buf.append(contentMD5).append("\n");
    buf.append(contentType).append("\n");
    buf.append(date).append("\n");
    buf.append(bucket);
    String signature = sign(buf.toString());
    // Connection to s3.amazonaws.com
    HttpURLConnection httpConn = null;
    URL url = new URL("http", "s3.amazonaws.com", 80, bucket);
    httpConn = (HttpURLConnection) url.openConnection();
    httpConn.setDoInput(true);
    httpConn.setDoOutput(true);
    httpConn.setUseCaches(false);
    httpConn.setDefaultUseCaches(false);
    httpConn.setAllowUserInteraction(true);
    httpConn.setRequestMethod(method);
    httpConn.setRequestProperty("Date", date);
    httpConn.setRequestProperty("Content-Length", "0");
    String AWSAuth = "AWS " + keyId + ":" + signature;
    httpConn.setRequestProperty("Authorization", AWSAuth);
    // Send the HTTP PUT request.
    int statusCode = httpConn.getResponseCode();
    if ((statusCode / 100) != 2) {
        // Deal with S3 error stream.
        InputStream in = httpConn.getErrorStream();
        String errorStr = getS3ErrorCode(in);
    }
} 53

54 S3: REST in JetS3t
String awsAccessKey = "YOUR_AWS_ACCESS_KEY";
String awsSecretKey = "YOUR_AWS_SECRET_KEY";
AWSCredentials awsCredentials = new AWSCredentials(awsAccessKey, awsSecretKey);
S3Service s3Service = new RestS3Service(awsCredentials);
S3Bucket euBucket = s3Service.createBucket("eu-bucket", S3Bucket.LOCATION_EUROPE); 54

55 Windows Azure 55

56 Azure Storage (i) Volatile storage: Instance disk Memory cache Persistent Storage: Windows Azure Storage: Blobs (objects) Tables Queues SQL Azure: Relational DB 56

57 Azure Storage (ii) The service is accessible via Web Services or libraries on top of these (C#, VB, Java). Blobs, Tables and Queues are stored in partitions. Partitions are the replication and load balancing unit. Blobs and queues are not sharded; tables may be. All partitions have 3 replicas. Partitions are represented in a DFS as one or more extents (contiguous files) of up to 1GB. 57

58 Blobs A blob is a <name, object> pair. Allows storage of objects from a few bytes up to 50GB. Blobs are stored in containers. There is no hierarchy in blob storage, but one can be simulated because names may contain '/'. URL schema: http://<account>.blob.core.windows.net/<container>/<blobname> 58

59 Operations on Blobs Put: creating Get: reading Set: updating Delete: eliminating Lease: 1 minute locking. 59

60 Next Time... Storage in Cloud Platforms 60


More information

Haridimos Kondylakis Computer Science Department, University of Crete

Haridimos Kondylakis Computer Science Department, University of Crete CS-562 Advanced Topics in Databases Haridimos Kondylakis Computer Science Department, University of Crete QSX (LN2) 2 NoSQL NoSQL: Not Only SQL. User case of NoSQL? Massive write performance. Fast key

More information

SCALABLE CONSISTENCY AND TRANSACTION MODELS

SCALABLE CONSISTENCY AND TRANSACTION MODELS Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:

More information

CS60021: Scalable Data Mining. Sourangshu Bhattacharya

CS60021: Scalable Data Mining. Sourangshu Bhattacharya CS60021: Scalable Data Mining Sourangshu Bhattacharya In this Lecture: Outline: HDFS Motivation HDFS User commands HDFS System architecture HDFS Implementation details Sourangshu Bhattacharya Computer

More information

! Design constraints. " Component failures are the norm. " Files are huge by traditional standards. ! POSIX-like

! Design constraints.  Component failures are the norm.  Files are huge by traditional standards. ! POSIX-like Cloud background Google File System! Warehouse scale systems " 10K-100K nodes " 50MW (1 MW = 1,000 houses) " Power efficient! Located near cheap power! Passive cooling! Power Usage Effectiveness = Total

More information

Dynamo: Amazon s Highly Available Key-value Store

Dynamo: Amazon s Highly Available Key-value Store Dynamo: Amazon s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and

More information

ZooKeeper & Curator. CS 475, Spring 2018 Concurrent & Distributed Systems

ZooKeeper & Curator. CS 475, Spring 2018 Concurrent & Distributed Systems ZooKeeper & Curator CS 475, Spring 2018 Concurrent & Distributed Systems Review: Agreement In distributed systems, we have multiple nodes that need to all agree that some object has some state Examples:

More information

11/5/2018 Week 12-A Sangmi Lee Pallickara. CS435 Introduction to Big Data FALL 2018 Colorado State University

11/5/2018 Week 12-A Sangmi Lee Pallickara. CS435 Introduction to Big Data FALL 2018 Colorado State University 11/5/2018 CS435 Introduction to Big Data - FALL 2018 W12.A.0.0 CS435 Introduction to Big Data 11/5/2018 CS435 Introduction to Big Data - FALL 2018 W12.A.1 Consider a Graduate Degree in Computer Science

More information

The Google File System

The Google File System October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single

More information

Introduction to Distributed Data Systems

Introduction to Distributed Data Systems Introduction to Distributed Data Systems Serge Abiteboul Ioana Manolescu Philippe Rigaux Marie-Christine Rousset Pierre Senellart Web Data Management and Distribution http://webdam.inria.fr/textbook January

More information

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University CPSC 426/526 Cloud Computing Ennan Zhai Computer Science Department Yale University Recall: Lec-7 In the lec-7, I talked about: - P2P vs Enterprise control - Firewall - NATs - Software defined network

More information

Distributed Systems. 16. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2017

Distributed Systems. 16. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2017 Distributed Systems 16. Distributed Lookup Paul Krzyzanowski Rutgers University Fall 2017 1 Distributed Lookup Look up (key, value) Cooperating set of nodes Ideally: No central coordinator Some nodes can

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2015 Lecture 14 NoSQL References Scalable SQL and NoSQL Data Stores, Rick Cattell, SIGMOD Record, December 2010 (Vol. 39, No.

More information

6.830 Lecture Spark 11/15/2017

6.830 Lecture Spark 11/15/2017 6.830 Lecture 19 -- Spark 11/15/2017 Recap / finish dynamo Sloppy Quorum (healthy N) Dynamo authors don't think quorums are sufficient, for 2 reasons: - Decreased durability (want to write all data at

More information

Distributed Systems. 15. Distributed File Systems. Paul Krzyzanowski. Rutgers University. Fall 2017

Distributed Systems. 15. Distributed File Systems. Paul Krzyzanowski. Rutgers University. Fall 2017 Distributed Systems 15. Distributed File Systems Paul Krzyzanowski Rutgers University Fall 2017 1 Google Chubby ( Apache Zookeeper) 2 Chubby Distributed lock service + simple fault-tolerant file system

More information

CS /30/17. Paul Krzyzanowski 1. Google Chubby ( Apache Zookeeper) Distributed Systems. Chubby. Chubby Deployment.

CS /30/17. Paul Krzyzanowski 1. Google Chubby ( Apache Zookeeper) Distributed Systems. Chubby. Chubby Deployment. Distributed Systems 15. Distributed File Systems Google ( Apache Zookeeper) Paul Krzyzanowski Rutgers University Fall 2017 1 2 Distributed lock service + simple fault-tolerant file system Deployment Client

More information

Google File System. By Dinesh Amatya

Google File System. By Dinesh Amatya Google File System By Dinesh Amatya Google File System (GFS) Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung designed and implemented to meet rapidly growing demand of Google's data processing need a scalable

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google* 정학수, 최주영 1 Outline Introduction Design Overview System Interactions Master Operation Fault Tolerance and Diagnosis Conclusions

More information

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 1: Distributed File Systems GFS (The Google File System) 1 Filesystems

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Distributed File Systems 15 319, spring 2010 12 th Lecture, Feb 18 th Majd F. Sakr Lecture Motivation Quick Refresher on Files and File Systems Understand the importance

More information

CISC 7610 Lecture 2b The beginnings of NoSQL

CISC 7610 Lecture 2b The beginnings of NoSQL CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone

More information

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5. Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message

More information

Reprise: Stability under churn (Tapestry) A Simple lookup Test. Churn (Optional Bamboo paper last time)

Reprise: Stability under churn (Tapestry) A Simple lookup Test. Churn (Optional Bamboo paper last time) EECS 262a Advanced Topics in Computer Systems Lecture 22 Reprise: Stability under churn (Tapestry) P2P Storage: Dynamo November 20 th, 2013 John Kubiatowicz and Anthony D. Joseph Electrical Engineering

More information

Google File System 2

Google File System 2 Google File System 2 goals monitoring, fault tolerance, auto-recovery (thousands of low-cost machines) focus on multi-gb files handle appends efficiently (no random writes & sequential reads) co-design

More information

Applications of Paxos Algorithm

Applications of Paxos Algorithm Applications of Paxos Algorithm Gurkan Solmaz COP 6938 - Cloud Computing - Fall 2012 Department of Electrical Engineering and Computer Science University of Central Florida - Orlando, FL Oct 15, 2012 1

More information

Distributed Systems. 15. Distributed File Systems. Paul Krzyzanowski. Rutgers University. Fall 2016

Distributed Systems. 15. Distributed File Systems. Paul Krzyzanowski. Rutgers University. Fall 2016 Distributed Systems 15. Distributed File Systems Paul Krzyzanowski Rutgers University Fall 2016 1 Google Chubby 2 Chubby Distributed lock service + simple fault-tolerant file system Interfaces File access

More information

7680: Distributed Systems

7680: Distributed Systems Cristina Nita-Rotaru 7680: Distributed Systems GFS. HDFS Required Reading } Google File System. S, Ghemawat, H. Gobioff and S.-T. Leung. SOSP 2003. } http://hadoop.apache.org } A Novel Approach to Improving

More information

Map-Reduce. Marco Mura 2010 March, 31th

Map-Reduce. Marco Mura 2010 March, 31th Map-Reduce Marco Mura (mura@di.unipi.it) 2010 March, 31th This paper is a note from the 2009-2010 course Strumenti di programmazione per sistemi paralleli e distribuiti and it s based by the lessons of

More information

Distributed System. Gang Wu. Spring,2018

Distributed System. Gang Wu. Spring,2018 Distributed System Gang Wu Spring,2018 Lecture7:DFS What is DFS? A method of storing and accessing files base in a client/server architecture. A distributed file system is a client/server-based application

More information

Google File System (GFS) and Hadoop Distributed File System (HDFS)

Google File System (GFS) and Hadoop Distributed File System (HDFS) Google File System (GFS) and Hadoop Distributed File System (HDFS) 1 Hadoop: Architectural Design Principles Linear scalability More nodes can do more work within the same time Linear on data size, linear

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung December 2003 ACM symposium on Operating systems principles Publisher: ACM Nov. 26, 2008 OUTLINE INTRODUCTION DESIGN OVERVIEW

More information

FLAT DATACENTER STORAGE CHANDNI MODI (FN8692)

FLAT DATACENTER STORAGE CHANDNI MODI (FN8692) FLAT DATACENTER STORAGE CHANDNI MODI (FN8692) OUTLINE Flat datacenter storage Deterministic data placement in fds Metadata properties of fds Per-blob metadata in fds Dynamic Work Allocation in fds Replication

More information

CA485 Ray Walshe Google File System

CA485 Ray Walshe Google File System Google File System Overview Google File System is scalable, distributed file system on inexpensive commodity hardware that provides: Fault Tolerance File system runs on hundreds or thousands of storage

More information

Dynamo Tom Anderson and Doug Woos

Dynamo Tom Anderson and Doug Woos Dynamo motivation Dynamo Tom Anderson and Doug Woos Fast, available writes - Shopping cart: always enable purchases FLP: consistency and progress at odds - Paxos: must communicate with a quorum Performance:

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2017 Lecture 26 File Systems Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 FAQ Cylinders: all the platters?

More information

The Google File System (GFS)

The Google File System (GFS) 1 The Google File System (GFS) CS60002: Distributed Systems Antonio Bruto da Costa Ph.D. Student, Formal Methods Lab, Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur 2 Design constraints

More information

HDFS Architecture. Gregory Kesden, CSE-291 (Storage Systems) Fall 2017

HDFS Architecture. Gregory Kesden, CSE-291 (Storage Systems) Fall 2017 HDFS Architecture Gregory Kesden, CSE-291 (Storage Systems) Fall 2017 Based Upon: http://hadoop.apache.org/docs/r3.0.0-alpha1/hadoopproject-dist/hadoop-hdfs/hdfsdesign.html Assumptions At scale, hardware

More information

CA485 Ray Walshe NoSQL

CA485 Ray Walshe NoSQL NoSQL BASE vs ACID Summary Traditional relational database management systems (RDBMS) do not scale because they adhere to ACID. A strong movement within cloud computing is to utilize non-traditional data

More information

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014 Cassandra @ Spotify Scaling storage to million of users world wide! Jimmy Mårdell October 14, 2014 2 About me Jimmy Mårdell Tech Product Owner in the Cassandra team 4 years at Spotify

More information

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store Oracle NoSQL Database A Distributed Key-Value Store Charles Lamb The following is intended to outline our general product direction. It is intended for information purposes only,

More information

NoSQL Concepts, Techniques & Systems Part 1. Valentina Ivanova IDA, Linköping University

NoSQL Concepts, Techniques & Systems Part 1. Valentina Ivanova IDA, Linköping University NoSQL Concepts, Techniques & Systems Part 1 Valentina Ivanova IDA, Linköping University 2017-03-20 2 Outline Today Part 1 RDBMS NoSQL NewSQL DBMS OLAP vs OLTP NoSQL Concepts and Techniques Horizontal scalability

More information

Distributed Systems. GFS / HDFS / Spanner

Distributed Systems. GFS / HDFS / Spanner 15-440 Distributed Systems GFS / HDFS / Spanner Agenda Google File System (GFS) Hadoop Distributed File System (HDFS) Distributed File Systems Replication Spanner Distributed Database System Paxos Replication

More information

Performance and Forgiveness. June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences

Performance and Forgiveness. June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences Performance and Forgiveness June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences Margo Seltzer Architect Outline A consistency primer Techniques and costs of consistency

More information

Goal of the presentation is to give an introduction of NoSQL databases, why they are there.

Goal of the presentation is to give an introduction of NoSQL databases, why they are there. 1 Goal of the presentation is to give an introduction of NoSQL databases, why they are there. We want to present "Why?" first to explain the need of something like "NoSQL" and then in "What?" we go in

More information

CS November 2017

CS November 2017 Bigtable Highly available distributed storage Distributed Systems 18. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

Chapter 4: Distributed Systems: Replication and Consistency. Fall 2013 Jussi Kangasharju

Chapter 4: Distributed Systems: Replication and Consistency. Fall 2013 Jussi Kangasharju Chapter 4: Distributed Systems: Replication and Consistency Fall 2013 Jussi Kangasharju Chapter Outline n Replication n Consistency models n Distribution protocols n Consistency protocols 2 Data Replication

More information

Consistency in Distributed Storage Systems. Mihir Nanavati March 4 th, 2016

Consistency in Distributed Storage Systems. Mihir Nanavati March 4 th, 2016 Consistency in Distributed Storage Systems Mihir Nanavati March 4 th, 2016 Today Overview of distributed storage systems CAP Theorem About Me Virtualization/Containers, CPU microarchitectures/caches, Network

More information

HDFS Architecture Guide

HDFS Architecture Guide by Dhruba Borthakur Table of contents 1 Introduction...3 2 Assumptions and Goals...3 2.1 Hardware Failure... 3 2.2 Streaming Data Access...3 2.3 Large Data Sets...3 2.4 Simple Coherency Model... 4 2.5

More information

GFS: The Google File System. Dr. Yingwu Zhu

GFS: The Google File System. Dr. Yingwu Zhu GFS: The Google File System Dr. Yingwu Zhu Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one big CPU More storage, CPU required than one PC can

More information

Replication in Distributed Systems

Replication in Distributed Systems Replication in Distributed Systems Replication Basics Multiple copies of data kept in different nodes A set of replicas holding copies of a data Nodes can be physically very close or distributed all over

More information

Large-Scale Data Stores and Probabilistic Protocols

Large-Scale Data Stores and Probabilistic Protocols Distributed Systems 600.437 Large-Scale Data Stores & Probabilistic Protocols Department of Computer Science The Johns Hopkins University 1 Large-Scale Data Stores and Probabilistic Protocols Lecture 11

More information

Data Management in the Cloud. Tim Kraska

Data Management in the Cloud. Tim Kraska Data Management in the Cloud Tim Kraska Montag, 22. Februar 2010 Systems Group/ETH Zurich MILK? [Anology from IM 2/09 / Daniel Abadi] 22.02.2010 Systems Group/ETH Zurich 2 Do you want milk? Buy a cow High

More information

Google File System. Arun Sundaram Operating Systems

Google File System. Arun Sundaram Operating Systems Arun Sundaram Operating Systems 1 Assumptions GFS built with commodity hardware GFS stores a modest number of large files A few million files, each typically 100MB or larger (Multi-GB files are common)

More information

Google File System, Replication. Amin Vahdat CSE 123b May 23, 2006

Google File System, Replication. Amin Vahdat CSE 123b May 23, 2006 Google File System, Replication Amin Vahdat CSE 123b May 23, 2006 Annoucements Third assignment available today Due date June 9, 5 pm Final exam, June 14, 11:30-2:30 Google File System (thanks to Mahesh

More information

Extreme Computing. NoSQL.

Extreme Computing. NoSQL. Extreme Computing NoSQL PREVIOUSLY: BATCH Query most/all data Results Eventually NOW: ON DEMAND Single Data Points Latency Matters One problem, three ideas We want to keep track of mutable state in a scalable

More information

Staggeringly Large File Systems. Presented by Haoyan Geng

Staggeringly Large File Systems. Presented by Haoyan Geng Staggeringly Large File Systems Presented by Haoyan Geng Large-scale File Systems How Large? Google s file system in 2009 (Jeff Dean, LADIS 09) - 200+ clusters - Thousands of machines per cluster - Pools

More information

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University CS 555: DISTRIBUTED SYSTEMS [REPLICATION & CONSISTENCY] Frequently asked questions from the previous class survey Shrideep Pallickara Computer Science Colorado State University L25.1 L25.2 Topics covered

More information

GFS: The Google File System

GFS: The Google File System GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one

More information