CSE 5306 Distributed Systems. Naming

Similar documents
CSE 5306 Distributed Systems

Lecture 4 Naming. Prof. Wilson Rivera. University of Puerto Rico at Mayaguez Electrical and Computer Engineering Department

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 5 Naming

Naming. Distributed Systems IT332

Naming in Distributed Systems

Naming. Naming. Naming versus Locating Entities. Flat Name-to-Address in a LAN

殷亚凤. Naming. Distributed Systems [5]

Chapter 5 Naming. Names, Identifiers, and Addresses

Naming. To do. q What s in a name q Flat naming q Structured naming q Attribute-based naming q Next: Content distribution networks

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Naming WHAT IS NAMING? Name: Entity: Slide 3. Slide 1. Address: Identifier:

Chapter 5 Naming (2)

Chapter 3: Naming Page 38. Clients in most cases find the Jini lookup services in their scope by IP

Parallelism. Master 1 International. Andrea G. B. Tettamanzi. Université de Nice Sophia Antipolis Département Informatique

Systèmes Distribués. Master MIAGE 1. Andrea G. B. Tettamanzi. Université de Nice Sophia Antipolis Département Informatique

Naming. Chapter 4. Naming (1) Name resolution allows a process to access a named entity. A naming system is necessary.

Distributed Naming. EECS 591 Farnam Jahanian University of Michigan. Reading List

Naming. Naming entities

Chapter 5 Naming (2)

Naming in Distributed Systems

Distributed Meta-data Servers: Architecture and Design. Sarah Sharafkandi David H.C. Du DISC

June Gerd Liefländer System Architecture Group Universität Karlsruhe (TH), System Architecture Group

Lecture 11: February 29

TECHNISCHE UNIVERSITEIT EINDHOVEN Faculteit Wiskunde en Informatica

March 10, Distributed Hash-based Lookup. for Peer-to-Peer Systems. Sandeep Shelke Shrirang Shirodkar MTech I CSE

New Topic: Naming. Differences in naming in distributed and non-distributed systems. How to name mobile entities?

Naming in Distributed Systems

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Computing Parable. New Topic: Naming

Today: Naming. Example: File Names

TECHNISCHE UNIVERSITEIT EINDHOVEN Faculteit Wiskunde en Informatica

Architectures for Distributed Systems

New Topic: Naming. Approaches

CSE 124 Finding objects in distributed systems: Distributed hash tables and consistent hashing. March 8, 2016 Prof. George Porter

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing

Searching for Shared Resources: DHT in General

Searching for Shared Resources: DHT in General

CRESCENDO GEORGE S. NOMIKOS. Advisor: Dr. George Xylomenos

DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES

The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla

DHT Overview. P2P: Advanced Topics Filesystems over DHTs and P2P research. How to build applications over DHTS. What we would like to have..

Content Overlays. Nick Feamster CS 7260 March 12, 2007

A Scalable Content- Addressable Network

9.1 Introduction 9.2 Name services and the DNS 9.3 Discovery services 9.6 Summary

Unit 8 Peer-to-Peer Networking

Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04

Distributed System: Definition

Peer-to-Peer Systems. Network Science: Introduction. P2P History: P2P History: 1999 today

CSE 124: QUANTIFYING PERFORMANCE AT SCALE AND COURSE REVIEW. George Porter December 6, 2017

Assignment 5. Georgia Koloniari

Introduction to Peer-to-Peer Systems

Distributed Hash Table

CS 640 Introduction to Computer Networks. Today s lecture. What is P2P? Lecture30. Peer to peer applications

TECHNISCHE UNIVERSITEIT EINDHOVEN Faculteit Wiskunde en Informatica

ICT 6544 Distributed Systems Lecture 7

LECT-05, S-1 FP2P, Javed I.

Internet Technology. 06. Exam 1 Review Paul Krzyzanowski. Rutgers University. Spring 2016

CS519: Computer Networks. Lecture 6: Apr 5, 2004 Naming and DNS

Internet Technology 3/2/2016

L3S Research Center, University of Hannover

Page 1. How Did it Start?" Model" Main Challenge" CS162 Operating Systems and Systems Programming Lecture 24. Peer-to-Peer Networks"

P2P Network Structured Networks: Distributed Hash Tables. Pedro García López Universitat Rovira I Virgili

Goals. EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Solution. Overlay Networks: Motivations.

EE 122: Peer-to-Peer (P2P) Networks. Ion Stoica November 27, 2002

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

A Survey of Peer-to-Peer Content Distribution Technologies

Overlay networks. To do. Overlay networks. P2P evolution DHTs in general, Chord and Kademlia. Turtles all the way down. q q q

Distributed Systems. 17. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2016

CS514: Intermediate Course in Computer Systems

Bayeux: An Architecture for Scalable and Fault Tolerant Wide area Data Dissemination

Overlay and P2P Networks. Structured Networks and DHTs. Prof. Sasu Tarkoma

Cloud Computing CS

CompSci 356: Computer Network Architectures Lecture 21: Overlay Networks Chap 9.4. Xiaowei Yang

Peer-to-Peer (P2P) Distributed Storage. Dennis Kafura CS5204 Operating Systems

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P?

Naming. Brighten Godfrey cs598pbg Sept slides 2010 by Brighten Godfrey unless otherwise noted

Distributed Information Processing

CS514: Intermediate Course in Computer Systems

Windows Server 2003 Network Administration Goals

DIVING IN: INSIDE THE DATA CENTER

Application Layer Protocols

08 Distributed Hash Tables

EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Overlay Networks: Motivations

THE DATACENTER AS A COMPUTER AND COURSE REVIEW

Introduction to P2P Computing

Peer-to-Peer Systems. Chapter General Characteristics

L3S Research Center, University of Hannover

INF5070 media storage and distribution systems. to-peer Systems 10/

MCTS Guide to Microsoft Windows Server 2008 Network Infrastructure Configuration. Chapter 5 Introduction to DNS in Windows Server 2008

Switching and Forwarding

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 1. Introduction

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs

Overlay Networks: Motivations. EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Motivations (cont d) Goals.

Early Measurements of a Cluster-based Architecture for P2P Systems

Naming. CS 475, Spring 2018 Concurrent & Distributed Systems. Slides by Luís Pina

A Decentralized Content-based Aggregation Service for Pervasive Environments

Flooded Queries (Gnutella) Centralized Lookup (Napster) Routed Queries (Freenet, Chord, etc.) Overview N 2 N 1 N 3 N 4 N 8 N 9 N N 7 N 6 N 9

Computer Network laboratory (2015) Pattern TE Computer 1 (5)

Telematics Chapter 9: Peer-to-Peer Networks

Transcription:

CSE 5306 Distributed Systems Naming 1

Naming Names play a critical role in all computer systems To access resources, uniquely identify entities, or refer to locations To access an entity, you have resolve the name and find the entity Name resolution provided by the naming system In a distributed system, the naming system itself is implemented across multiple machines Efficiency and scalability are the keys 2

Addresses To access an entity, we need the access point, which is special entity The name of an access point is an address An entity may have multiple access points, and its access point may change Thus, the address of an access point should not be used to name the entity E.g., each person has multiple phone numbers to reach him, and these number may be re-assigned to another person Therefore, what we need is a name for an entity that is independent from its addresses i.e., a location-independent name 3

True Identifiers Are the names that are used to uniquely identify an entity in a distributed system True identifiers has the following property Each identifier refers to at most one entity Each entity referred to by at most one identifier An identifier always refers to the same entity (prohibits reusing an identifier) A simple comparison of two identifiers is sufficient to test if they refer to the same entity 4

The Central Theme of Naming How to resolve names and identifiers to addresses In principle, a naming system maintains a name-toaddress binding in the form of mapping table However, a centralized table in a large network is not going to work The name resolution as well as the table is is often distributed across multiple machines E.g., www.cse.uta.edu is divided into multiple parts, and several DNS servers are used during the resolution Will discuss the resolution of the following names Flat names, structured names, attribute-based names 5

Flat Names (Identifiers) An identifier is often a string of random bits does not contain any information on how to locate the access point of its associated entity Two simple solutions to locate the entity given an identifier Broadcasting and multicasting (e.g., ARP) Broadcast is expensive, multicast is not well supported Forwarding pointers When an entity moves, it leaves a pointer to where it went A popular approach to locate mobile entities 6

Forwarding Pointers Advantage: Dereferencing can be made transparent to client - follow the pointer chain Geographical scalability problems: Chain can be very long for highly mobile entities Long chains not fault tolerant High latency when dereferencing Need chain reduction mechanisms Update client's reference when the most recent location is found 7

Forwarding via Client-Server Stubs The principle of forwarding pointers using (client stub, server stub) pairs. 8

Chain Reduction via Shortcuts 9

Home-Based Approaches The principle of Mobile IP. 10

Issues with Home-Based Approaches Home address has to be supported as long as entity lives Home address is fixed - unnecessary burden if entity permanently moves Poor geographical scalability (the entity may be next to the client) 11

Distributed Hash Table In DHT-based system Chord, Each node has an m-bit random identifier Each entity has an m-bit random key An entity with key k is located on a node with the smallest identifier that satisfies id>=k, denoted as succ(k) The major task is key lookup, i.e., I.e., to resolve an m-bit key to the address of succ(k) Two approaches: linear approach and finger table The simplest form of chord does not consider network proximity, leading to performance problem 12

Key Lookup in Chord Resolving key 26 from node 1 and key 12 from node 28 in a Chord system. 13

Hierarchical Approaches (1) Hierarchical organization of a location service into domains, each having an associated directory node. 14

Hierarchical Approaches (2) An example of storing information of an entity having two addresses in different leaf domains. 15

Hierarchical Approaches (3) Looking up a location in a hierarchically organized location service. 16

Structured Naming Flat names are not convenient for humans to use As a result, naming systems often support structured names that are composed from simple, human-readable names E.g., file names, Internet domain names Structured names are often organized into what is called a name space A labeled, directed graph with two types of nodes, leaf node and directory node 17

Name Space A general naming graph with a single root node. 18

Name Resolution The process of looking up a name in a name space Name resolution can take place only if we know where and how to start A closure mechanism E.g., start from a well know root directory, or start from home directory Linking Aliases are commonly used in a name space An alias can be a hard link or a symbolic link 19

Symbolic Link The concept of a symbolic link explained in a naming graph. 20

Mounting (1) The process of merging different name spaces A common approach is to Let a directory node (mount point) store the identifier of a directory node (mounting point) from the foreign name space Information required to mount a foreign name space in a distributed system The name of an access protocol The name of the server The name of the mounting point in the foreign name space 21

Mounting (2) Mounting remote name spaces through a specific access protocol. 22

Implementation of a Name Space A name space is often implemented by name servers in LAN, a single name server is often good enough in large-scale distributed system, the implementation of a name space is often distributed over multiple name servers A name space for large-scale distributed systems is often organized hierarchically Global layer often stable, represents organizations or groups of organizations Administrational layer Represents groups of entities in a single organization or admin. unit Managerial layer Nodes often change frequently, e.g., hosts in a local network May be managed by system administrators or end users 23

Name Space Distribution (1) An example partitioning of the DNS name space, including Internet-accessible files, into three layers. 24

Name Space Distribution (2) A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer. 25

Implementing Name Resolution (1) The principle of iterative name resolution. 26

Implementing Name Resolution (2) The principle of recursive name resolution. 27

Recursive v.s. Iterative Recursive resolution demands more on each name server However, it has two advantages Caching is more effective than iterative name resolution Intermediate nodes can cache the result (faster look-up) With iterative solution, only the client can cache Overall communication cost can be reduced 28

Example: The Domain Name System The DNS name space is organized as a root tree Each node in this tree stores a collection of resource records 29

Decentralized DNS Implementation In standard hierarchical DNS implementation, higher-level nodes receives more requests than low-level nodes leading to a scalability problem Fully decentralized solution can avoid such scalability problem Map DNS names to keys and look them up in a distributed hash table The problem is that we lose the structure of the original names and make some operations difficult 30

Attribute-Based Naming As more information being made available, it becomes important to locate entities based on merely a description of what is needed Attribute-based naming Each entity is associated with a collection of attributes The naming system provides one or multiple entities that match a user s description Attribute-based naming systems are often known as directory services 31

Hierarchical Implementation LDAP A simple example of an LDAP directory entry using LDAP naming conventions. 32

Directory Information Tree (DIT) For a large-scale directory, the DIT is often distributed over several servers as in DNS Part of directory information tree. 33

Decentralized (DHT) Implementation Each path in attribute-value tree (AVT) produces a hash value and mapped to a DHT h1=hash(type-book), h2=hash(type-book-author) h3=hash(type-book-author-tolkien), h4=hash(type-book-title) h5=hash(type-book-title-lotr), h6=hash(genre-fantasy) 34

Ranged Query in DHT implementation Two phase approach Separate the name and the attribute in computing the hash value Phase 1: distribute attribute names in DHT Phase 2: for each name, partition the values into subranges and assign a single server for each subrange Drawbacks Updates may need to be send to multiple servers Load balancing between different subrange servers 35

Semantic Overlay Networks Construct an overlay network where each pair of neighbors are semantically proximal neighbors i.e., they have similar resources 36