CS535 Big Data, Fall 2017, Colorado State University. 11/7/2017. Sangmi Lee Pallickara. Week 12-A.


FAQs
- The PA deadline has been extended.

PART 1. SCALABLE FRAMEWORKS FOR REAL-TIME BIG DATA ANALYTICS
SERVING LAYER: CASE STUDY - CASSANDRA

Today's topics
- Cassandra partitioning

Serving layer: Case study - Cassandra
- Partitioning
- Consistent hashing

Non-consistent hashing vs. consistent hashing
- When a hash table is resized:
  - A non-consistent hashing algorithm requires re-hashing the complete table.
  - A consistent hashing algorithm requires only a partial re-hash of the table.

Consistent hashing
- A consistent hash function assigns each node and each key an m-bit identifier using a hashing function.
- m-bit identifiers give 2^m identifiers in the identifier space; m has to be big enough to make the probability of two nodes or keys hashing to the same identifier negligible.
- A node's identifier is the hashing value of its IP address; a key's identifier is the hash of the key.
- (Figure: identifier circle with nodes A, B, and C placed at the hash values of their IP addresses.)
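To make the contrast concrete, here is a minimal consistent-hashing sketch in Python. It is not the lecture's code: the hash function (SHA-1), the ring size, and names such as Ring are illustrative assumptions. It places node and key identifiers on an m-bit circle and shows that adding a node remaps only the keys that now fall under the new node.

  import hashlib
  from bisect import bisect_left

  M = 32                       # identifier bits; 2**M positions on the circle
  SPACE = 2 ** M

  def ident(name):
      """Hash a node name or key to an m-bit identifier on the circle."""
      return int(hashlib.sha1(name.encode()).hexdigest(), 16) % SPACE

  class Ring:
      def __init__(self, nodes):
          # Sorted (identifier, node-name) pairs representing the circle.
          self.points = sorted((ident(n), n) for n in nodes)

      def successor(self, key):
          """First node whose identifier is equal to or follows the key's
          identifier, wrapping around the circle: successor(k)."""
          ids = [p[0] for p in self.points]
          i = bisect_left(ids, ident(key))
          return self.points[i % len(self.points)][1]

  keys = [f"key{i}" for i in range(1000)]
  ring3 = Ring(["A", "B", "C"])
  ring4 = Ring(["A", "B", "C", "N"])       # node N joins
  moved = sum(ring3.successor(k) != ring4.successor(k) for k in keys)
  print(f"{moved} of {len(keys)} keys moved after adding node N")

With a non-consistent scheme such as node = hash(key) % num_nodes, going from 3 to 4 nodes would remap roughly three quarters of the keys; here only the keys whose successor becomes N move.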

Consistent hashing (continued)
- Consistent hashing assigns keys to nodes:
  - Key k is assigned to the first node whose identifier is equal to or follows k in the identifier space.
  - That node is called the successor node of key k: successor(k).
- (Figure: identifier circle with machines A, B, and C; machine B is the successor of a key that hashes between A and B.)

Consistent hashing (continued)
- (Figure: keys placed on the circle are stored at their successor nodes; for example, a key whose successor is machine C is stored at machine C.)
- If machine C leaves the circle, the keys it held are re-assigned: their successor becomes the next node on the circle (here, A).
- If a new machine N joins the circle, it becomes the successor for the keys that now fall in its range.

Scalable key location
- In consistent hashing, each node need only be aware of its successor node on the circle.
- Queries can be passed around the circle via these successor pointers until they reach the node responsible for the resource.
- What is the disadvantage of this scheme?
  - It may require traversing all N nodes to find the appropriate mapping.

Serving layer: Case study - Cassandra
- Partitioning
- Consistent hashing: Chord

This material is based on: Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communications (SIGCOMM '01). ACM, New York, NY, USA.

Examples of use
- Apache Cassandra's partitioning scheme
- Couchbase
- OpenStack's object storage service, Swift
- Akamai content delivery network
- Data partitioning in Voldemort
- The partitioning component of Amazon's storage system Dynamo
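As a concrete (and deliberately naive) illustration of the successor-pointer scheme described above, here is a small Python sketch of my own; the function and variable names are not from the lecture. Each node knows only its successor, so a lookup may have to walk around the circle node by node.

  def in_interval(x, a, b):
      """True if identifier x lies in the circular interval (a, b]."""
      return (a < x <= b) if a < b else (x > a or x <= b)

  def lookup_via_successors(node_ids, start, key_id):
      """Walk successor pointers from node 'start' until the node responsible
      for key_id is found; returns (successor_node, hops)."""
      ring = sorted(node_ids)
      succ = {ring[i]: ring[(i + 1) % len(ring)] for i in range(len(ring))}
      n, hops = start, 0
      while not in_interval(key_id, n, succ[n]):
          n = succ[n]          # hand the query to the next node on the circle
          hops += 1
      return succ[n], hops

  # 3-bit circle with nodes 0, 1, and 3: a query arriving at node 3 for key 2
  # walks 3 -> 0 -> 1 before finding successor(2) = 3.
  print(lookup_via_successors([0, 1, 3], 3, 2))   # (3, 2)

In the worst case the walk visits all N nodes, which is exactly the drawback that the finger table introduced next is meant to fix.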

Scalable key location in Chord
- Let m be the number of bits in the key/node identifiers.
- Each node n maintains a routing table with (at most) m entries, called the finger table.
- The i-th entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1) on the identifier circle,
  i.e., s = successor(n + 2^(i-1)), where 1 <= i <= m (and all arithmetic is modulo 2^m).

Definition of variables for node n, using m-bit identifiers
- finger[i].start    = (n + 2^(i-1)) mod 2^m, for 1 <= i <= m
- finger[i].interval = [finger[i].start, finger[i+1].start)
- finger[i].node     = the first node >= n.finger[i].start
- successor          = the next node on the identifier circle (finger[1].node)
- predecessor        = the previous node on the identifier circle

The i-th finger entry of node n stores both the Chord identifier and the IP address of the relevant node.
The first finger of n is its immediate successor on the circle (clockwise!).
(Figure: example finger table with start, interval, and node columns for a node on the identifier circle.)

Lookup process
- Each node stores information about only a small number of other nodes.
- A node's finger table generally does not contain enough information to determine the successor of an arbitrary key k.
- What happens when a node n does not know the successor of a key k?
  - If n finds a node whose ID is closer than its own to k, that node will know more about the identifier circle in the region of k than n does.

Lookup process (continued)
- First, check whether the data is stored at n; if it is, return the data.
- Otherwise, n searches its finger table for the node j whose ID most immediately precedes k.
- Ask j for the node it knows whose ID is closest to k.
- Do not overshoot!
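As a companion to the definitions above, here is an illustrative Python sketch (my own; not the lecture's or the paper's code) that builds a node's finger table from the identifiers of the live nodes, using finger[i].start = (n + 2^(i-1)) mod 2^m.

  def build_finger_table(n, node_ids, m):
      """Return the finger table of node n on an m-bit identifier circle."""
      ring = sorted(node_ids)

      def successor(ident):
          # First node whose identifier is >= ident, wrapping around the circle.
          return next((x for x in ring if x >= ident), ring[0])

      table = []
      for i in range(1, m + 1):
          start = (n + 2 ** (i - 1)) % (2 ** m)
          nxt_start = (n + 2 ** i) % (2 ** m)     # = finger[i+1].start
          table.append({"start": start,
                        "interval": (start, nxt_start),
                        "node": successor(start)})
      return table

  # 3-bit example from the Chord paper (nodes 0, 1, and 3): node 1's fingers.
  for entry in build_finger_table(1, [0, 1, 3], 3):
      print(entry)
  # start=2 interval [2,3) node=3; start=3 interval [3,5) node=3; start=5 interval [5,1) node=0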

Lookup process: examples
(Figures: three worked examples on the identifier circle, each showing the nodes' finger tables as start/interval/successor columns.)
1. A request comes into a node to find the successor of an identifier.
2. The node checks which finger interval the identifier belongs to and looks up the corresponding successor entry.
3. The node asks that finger node, which repeats the process until the successor of the identifier is found.
- One example also shows the case where the queried identifier is itself a node's identifier; that node is then the successor.

Theorem
- With high probability (or under standard hardness assumptions), the number of nodes that must be contacted to find a successor in an N-node network is O(log N).

Proof (sketch)
- Suppose that node n tries to resolve a query for the successor of k. Let p be the node that immediately precedes k. We analyze the number of steps to reach p.
- If n != p, then n forwards its query to the closest predecessor of k in its finger table. Suppose that p is in the i-th finger interval of node n.
- Since this interval is not empty, node n will finger some node f in this interval. The distance between n and f is at least 2^(i-1).
- f and p are both in n's i-th finger interval, and the distance between them is at most 2^(i-1).
- This means f is closer to p than to n, or equivalently, the distance from f to p is at most half of the distance from n to p.
- If the distance between the node handling the query and the predecessor p halves in each step, and is at most 2^m initially, then within m steps the distance will be one (you have arrived at p).
- After log N forwardings, the distance between the current query node and the key k will be reduced to at most 2^m / N, so the number of forwardings necessary is O(log N).
- The average lookup path length is (1/2) log N.

Requirements for node joins
- In a dynamic network, nodes can join (and leave) at any time. Chord must ensure that:
  1. Each node's successor is correctly maintained.
  2. For every key k, node successor(k) is responsible for k.
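To see the O(log N) bound from the theorem above in action, here is an illustrative Python simulation (my own sketch, not the course's or the paper's code) that routes lookups with the finger rule and reports the average hop count; with a few hundred nodes it comes out near (1/2) log2 N.

  import random

  def in_interval(x, a, b, inclusive_right=False):
      """Circular interval test: x in (a, b), or (a, b] when inclusive_right."""
      if a < b:
          return a < x < b or (inclusive_right and x == b)
      return x > a or x < b or (inclusive_right and x == b)

  def simulate_lookup(node_ids, m, start, key):
      """Route a lookup for 'key' starting at node 'start' using finger tables;
      returns (successor_of_key, number_of_forwardings)."""
      ring = sorted(node_ids)
      succ = {ring[i]: ring[(i + 1) % len(ring)] for i in range(len(ring))}

      def finger(n, i):
          # i-th finger of node n: successor(n + 2^(i-1))
          s = (n + 2 ** (i - 1)) % (2 ** m)
          return next((x for x in ring if x >= s), ring[0])

      n, hops = start, 0
      while not in_interval(key, n, succ[n], inclusive_right=True):
          nxt = succ[n]                       # fall back to the plain successor
          for i in range(m, 0, -1):           # closest preceding finger of key
              f = finger(n, i)
              if in_interval(f, n, key):
                  nxt = f
                  break
          n, hops = nxt, hops + 1
      return succ[n], hops

  m = 16
  nodes = random.sample(range(2 ** m), 200)
  hop_counts = [simulate_lookup(nodes, m, random.choice(nodes),
                                random.randrange(2 ** m))[1] for _ in range(1000)]
  print("average hops:", sum(hop_counts) / len(hop_counts))  # roughly 0.5 * log2(200)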

Tasks to perform for a node join
1. Initialize the predecessor and fingers of node n.
2. Update the fingers and predecessors of existing nodes to reflect the addition of n.
3. Notify the higher-layer software so that it can transfer state (e.g., values) associated with keys that node n is now responsible for.

Join pseudocode (from the Chord paper):

  #define successor finger[1].node

  // node n joins the network;
  // n' is an arbitrary node already in the network
  n.join(n')
    if (n')
      init_finger_table(n');
      update_others();
      // move keys in (predecessor, n] from successor
    else  // n is going to be the only node in the network
      for i = 1 to m
        finger[i].node = n;
      predecessor = n;

  // ask node n to find id's successor
  n.find_successor(id)
    n' = find_predecessor(id);
    return n'.successor;

  // ask node n to find id's predecessor
  n.find_predecessor(id)
    n' = n;
    while (id is NOT in (n', n'.successor])
      n' = n'.closest_preceding_finger(id);
    return n';

  // return the closest finger of n that precedes id
  n.closest_preceding_finger(id)
    for i = m downto 1
      if (finger[i].node is in (n, id))
        return finger[i].node;
    return n;

Step 1: Initializing fingers and predecessor
- The new node n learns its predecessor and fingers by asking an arbitrary node n' already in the network to look them up:

  n.init_finger_table(n')
    finger[1].node = n'.find_successor(finger[1].start);
    predecessor = successor.predecessor;
    successor.predecessor = n;
    for i = 1 to m-1
      if (finger[i+1].start is in [n, finger[i].node))
        finger[i+1].node = finger[i].node;
      else
        finger[i+1].node = n'.find_successor(finger[i+1].start);

(Figures: example identifier circle and finger tables after init_finger_table(n') and after update_others().)

Step 1: Initializing fingers and predecessor (continued)
- A naive run of find_successor takes O(log N); doing this for all m finger entries takes O(m log N).
- How can we optimize this?
  - Check whether the i-th finger is also correct for the (i+1)-th finger.
  - Ask an immediate neighbor for a copy of its complete finger table and its predecessor.
  - The new node n can use these tables as hints to help it find the correct values.

Step 2: Updating fingers of existing nodes
- Node n will be entered into the finger tables of some existing nodes.
- Node n will become the i-th finger of node p if and only if
  - p precedes n by at least 2^(i-1), and
  - the i-th finger of node p succeeds n.
- The first node p that can meet these two conditions is the immediate predecessor of n - 2^(i-1).
- For a given n, the algorithm starts with the i-th finger of node n and continues to walk counter-clockwise on the identifier circle.
- The number of nodes that need to be updated is O(log N) on average.

  n.update_others()
    for i = 1 to m
      // find the last node p whose i-th finger might be n
      p = find_predecessor(n - 2^(i-1));
      p.update_finger_table(n, i);

  // if s is the i-th finger of n, update n's finger table with s
  n.update_finger_table(s, i)
    if (s is in [n, finger[i].node))
      finger[i].node = s;
      p = predecessor;  // get the first node preceding n
      p.update_finger_table(s, i);

Step 3: Transferring keys
- Move responsibility for all the keys for which node n is now the successor.
- This involves moving the data associated with each key to the new node.
- Node n can become the successor only for keys that were previously the responsibility of the node immediately following n.
- So n only needs to contact that one node to transfer responsibility for all relevant keys.

Example
- Suppose you have the following data, with Name as the partition key:

    Name   Age  Car     Gender
    Jim         Camaro  M
    Carol       BMW     F
    Jonny               M
    Suzy    9           F

- Cassandra assigns a Murmur hash value (a token) to each partition key: Jim, Carol, Jonny, and Suzy each map to a token.

Cassandra cluster with 4 nodes
(Figure: Data Center ABC with four nodes A, B, C, and D; each node owns a contiguous range of the token space, and each row is stored on the node whose range contains its partition key's token.)
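The token-range idea can be sketched as follows. This is an illustrative toy, not Cassandra's implementation: Cassandra's default partitioner uses Murmur3, while this sketch derives a stand-in signed 64-bit token from MD5, and the node token values are made up.

  import hashlib

  def token(partition_key):
      """Stand-in 64-bit token for a partition key (Cassandra uses Murmur3)."""
      digest = hashlib.md5(partition_key.encode()).digest()
      return int.from_bytes(digest[:8], "big", signed=True)

  # Hypothetical cluster: each node owns the token range up to its own token.
  node_tokens = {"A": -(2 ** 63) // 2, "B": 0, "C": (2 ** 63) // 2, "D": 2 ** 63 - 1}

  def owner(partition_key):
      t = token(partition_key)
      ring = sorted(node_tokens.items(), key=lambda kv: kv[1])
      for name, upper in ring:
          if t <= upper:
              return name
      return ring[0][0]   # wrap around to the node with the smallest token

  for name in ["Jim", "Carol", "Jonny", "Suzy"]:
      print(name, token(name), "-> node", owner(name))

Real clusters refine this with virtual nodes and replication, but the core mapping from partition key to token to token-range owner is the same idea.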

Partitioning
- A partitioner is a function that derives a token representing a row from its partition key, typically by hashing.
- Each row of data is then distributed across the cluster by the value of its token.

Apache Cassandra partitioners
- Read and write requests to the cluster are also evenly distributed.
- Each part of the hash range receives an equal number of rows on average.
- Cassandra offers three partitioners:
  - Murmur3Partitioner (default): uniformly distributes data across the cluster based on MurmurHash3 hash values.
  - RandomPartitioner: uniformly distributes data across the cluster based on MD5 hash values.
  - ByteOrderedPartitioner: keeps an ordered distribution of data, lexically by key bytes.

Murmur3Partitioner
- Murmur hash is a non-cryptographic hash function created by Austin Appleby in 2008.
- The name comes from the basic operations in its inner loop: MUltiply (MU) and Rotate (R).
- The current version, MurmurHash3, yields 32-bit or 128-bit hash values.
- MurmurHash3 shows very low bias under avalanche analysis.
- (Figure: "Testing with ... million keys" - token distribution results.)

Measuring the quality of a hash function
- Hash function quality can be measured as

    quality = [ sum over j of b_j (b_j + 1) / 2 ] / [ (n / (2m)) * (n + 2m - 1) ]

  where b_j is the number of items in the j-th slot, n is the total number of items, and m is the number of slots. A value close to 1 indicates behavior close to uniform random hashing.
- A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools, Pearson Education, Inc.
- Comparison between hash functions: http://www.strchr.com/hash_functions
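A direct translation of that ratio into Python (my own sketch, assuming the Aho et al. formula as reconstructed above; the example bucket counts come from Python's built-in hash purely for illustration):

  def hash_quality(bucket_counts):
      """Quality ratio: ~1.0 means the distribution looks like uniform random hashing."""
      m = len(bucket_counts)                 # number of slots
      n = sum(bucket_counts)                 # total number of items
      numerator = sum(b * (b + 1) / 2 for b in bucket_counts)
      expected = (n / (2 * m)) * (n + 2 * m - 1)
      return numerator / expected

  # Example: hash 10,000 keys into 1,024 buckets with Python's built-in hash.
  buckets = [0] * 1024
  for i in range(10_000):
      buckets[hash(f"key-{i}") % 1024] += 1
  print(hash_quality(buckets))               # should print a value close to 1.0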

Avalanche analysis for hash functions
- Indicates how well the hash function mixes the bits of the key to produce the bits of the hash.
- Tests whether a small change in the input causes a significant change in the output, i.e., whether or not the function achieves avalanche:
    P(output bit i changes | input bit j changes) = 0.5, for all i, j
- If we keep all of the input bits the same and flip exactly one bit, each of the hash function's output bits should change with probability 1/2.
- The hash is biased if the probability of an input bit affecting an output bit is greater than or less than 50%.
- Large amounts of bias indicate that keys differing only in the biased bits may tend to produce more hash collisions than expected.
- http://research.neustar.biz////choosing-a-good-hash-function-part-/

RandomPartitioner
- RandomPartitioner was the default partitioner prior to Cassandra 1.2.
- It uses MD5 hash values, with tokens ranging from 0 to 2^127 - 1.

ByteOrderedPartitioner
- This partitioner orders rows lexically by key bytes.
- The ordered partitioner allows ordered scans by primary key: if your application uses user names as the partition key, you can scan rows for users whose names fall between Jake and Joe.
- Disadvantages of this partitioner:
  - Difficult load balancing.
  - Sequential writes can cause hot spots.
  - Uneven load balancing for multiple tables.
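The avalanche criterion above can be checked empirically. Here is an illustrative Python sketch (my own; it uses SHA-256 as a stand-in hash since Python has no built-in MurmurHash) that flips single input bits and measures how far each output-bit flip probability strays from 0.5.

  import hashlib
  import random

  IN_BITS, OUT_BITS, TRIALS = 64, 32, 2000

  def h(x):
      """32-bit hash of a 64-bit integer (stand-in for MurmurHash etc.)."""
      d = hashlib.sha256(x.to_bytes(8, "little")).digest()
      return int.from_bytes(d[:4], "little")

  flips = [[0] * OUT_BITS for _ in range(IN_BITS)]
  for _ in range(TRIALS):
      x = random.getrandbits(IN_BITS)
      base = h(x)
      for j in range(IN_BITS):
          diff = base ^ h(x ^ (1 << j))      # which output bits changed?
          for i in range(OUT_BITS):
              flips[j][i] += (diff >> i) & 1

  bias = max(abs(flips[j][i] / TRIALS - 0.5)
             for j in range(IN_BITS) for i in range(OUT_BITS))
  print(f"worst-case avalanche bias: {bias:.3f}")   # near 0 means good mixing

A strongly biased hash would show some probabilities far from 0.5, which is exactly the condition described above that leads to more collisions than expected for keys differing only in the biased bits.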