Distributed Algorithms. Partha Sarathi Mandal Department of Mathematics IIT Guwahati



Thanks to Dr. Sukumar Ghosh for the slides

Distributed Algorithms
Distributed algorithms for various graph-theoretic problems have numerous applications in distributed computing systems. What is a distributed computing system? The topology of a distributed system is represented by a graph in which the nodes represent processes and the links represent communication channels.

Topology of a DS is represented by a graph
[Figure: a network of ten processes, numbered 1 to 10.]
The nodes are processes, and the edges are communication channels.

Examples
- Client-server model: the server is the coordinator.
- Peer-to-peer model: no unique coordinator.

Parallel vs Distributed
In both parallel and distributed systems, the events are partially ordered. The distinction between parallel and distributed is not always very clear. In parallel systems, the primary issues are speed-up and increased data handling capability. In distributed systems, the primary issues are fault tolerance, synchronization, scalability, etc.
[Figure labels: Grid, Parallel, P2P, Distributed.]

Parallel vs Distributed (multiple processors)
- Tightly coupled systems (parallel processing systems): the CPUs share memory through interconnection hardware.
- Loosely coupled systems (distributed computing systems): each CPU has its own local memory, and the CPUs are connected by a communication network.

Examples
Large networks are very commonplace these days. Think of the World Wide Web. Other examples are:
- Social networks
- Sensor networks
- BitTorrent for downloading video / audio
- Skype for making free audio and video communication
- Computational grids
- Networks of mobile robots

Why distributed systems
- Geographic distribution of processes
- Resource sharing (example: P2P networks, grids)
- Computation speed-up (as in a grid)
- Fault tolerance
- etc.

Important services
- Internet banking
- Web search
- Net meeting
- Distance education
- Video distribution
- Internet auction
- Google Earth
- Skype
- Publish-subscribe

Important issues
- Knowledge is local
- Clocks are not synchronized
- No shared address space
- Topology and routing: everything is dynamic
- Scalability: not all solutions scale well
- Processes and links fail: fault tolerance

Some common subproblems
- Leader election
- Mutual exclusion
- Time synchronization
- Global state collection
- Replica management
- Consensus
- Coloring
- Self-stabilization

Models
We will reason about distributed systems using models. There are many dimensions of variability in distributed systems. Examples:
- types of processors
- inter-process communication mechanisms
- timing assumptions
- failure classes
- security features, etc.

Models
Models are simple abstractions that help to overcome the variability: abstractions that preserve the essential features but hide the implementation details, and so simplify writing distributed algorithms for problem solving.
Optical or radio communication? PC or Mac? Are clocks perfectly synchronized?
[Figure: layers, top to bottom: algorithms, models, implementation of models, real hardware.]

Understanding Models
How models help
[Figure: layers, top to bottom: algorithms, models, implementation of models, real hardware.]

Modeling Communication
System topology is a graph G = (V, E), where
V = set of nodes (sequential processes)
E = set of edges (links or channels, bidirectional or unidirectional).
Four types of actions by a process:
- internal action
- input action
- communication action
- output action

Example: A Message Passing Model
A reliable FIFO channel from P to Q
Axiom 1. Message m sent ⇒ message m received.
Axiom 2. Message propagation delay is arbitrary but finite.
Axiom 3. m1 sent before m2 ⇒ m1 received before m2.
[Figure: time line on which P executes send(m1), send(m2) and Q executes receive(m1), receive(m2).]

Example: Shared memory model
Address spaces of processes overlap. Concurrent operations on a shared variable are serialized.
[Figure: processes 1 to 4 sharing memories M1 and M2.]

Variations of shared memory models
- State reading model: each process can read the states of its neighbors.
- Link register model: each process can read from and write to adjacent registers. The entire local state is not shared.
[Figures: processes 0 to 3 in each model.]

Modeling wireless networks
- Communication via broadcast
- Limited range
- Dynamic topology
- Collision of broadcasts (handled by CSMA/CA)
[Figure: a wireless network of nodes 0 to 6 in two panels, (a) and (b), illustrating the Request To Send (RTS) and Clear To Send (CTS) exchange.]

Synchrony vs. Asynchrony
- Synchronous clocks: physical clocks are synchronized
- Synchronous processes: lock-step synchrony
- Synchronous channels: bounded delay
- Synchronous message-order: first-in first-out channels
- Synchronous communication: communication via handshaking
Send and receive can be blocking or non-blocking. Postal communication is asynchronous. Telephone communication is synchronous.
Synchronous communication or not? (1) Remote Procedure Call, (2) Email

Weak vs. Strong Models
One object (or operation) of a strong model = more than one simpler object (or simpler operation) of a weaker model. Often, weaker models are synonymous with fewer restrictions. One can add layers (additional restrictions) to create a stronger model from a weaker one.
Examples
- A high-level language is stronger than assembly language.
- Asynchronous is weaker than synchronous (communication).
- Bounded delay is stronger than unbounded delay (channel).

Model transformation
Stronger models
- simplify reasoning, but
- need extra work to implement.
Weaker models
- are easier to implement, and
- have a closer relationship with the real world.
"Can model X be implemented using model Y?" is an interesting question in computer science.
Sample exercises
1. Non-FIFO to FIFO channel
2. Message passing to shared memory
3. Non-atomic broadcast to atomic broadcast

Non-FIFO to FIFO channel
FIFO = First-In-First-Out
[Figure: P sends out m1, m2, m3, m4, ...; the messages travel through the channel out of order (shown as m2 m3 m4 m1), and Q stores them in a buffer with numbered slots 1 to 7.]

Non-FIFO to FIFO channel
{Sender process P}
  var i : integer {initially 0}
  repeat
    send m[i], i to Q;
    i := i + 1
  forever
{Receiver process Q}
  var k : integer {initially 0}
  buffer : buffer[0..∞] of msg {initially ∀k : buffer[k] = empty}
  repeat
    {STORE}   receive m[i], i from P;
              store m[i] into buffer[i];
    {DELIVER} while buffer[k] ≠ empty do
              begin
                deliver content of buffer[k];
                buffer[k] := empty; k := k + 1;
              end
  forever
Needs an unbounded buffer and unbounded sequence numbers. THIS IS BAD.
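The STORE and DELIVER steps of the receiver can be tried out with a small Python sketch. This is an illustration under assumptions, not the slide author's code: the function name `resequence`, the dictionary used as the buffer, and the shuffled list standing in for a non-FIFO channel are all hypothetical choices.

```python
import random

def resequence(arrivals):
    """Deliver (seq, msg) pairs in sequence order, buffering early arrivals.

    `arrivals` is the order in which a non-FIFO channel hands messages to Q.
    Returns the delivered messages and the peak number of buffered messages.
    """
    buffer = {}          # seq -> msg; stands in for buffer[0..infinity]
    k = 0                # next sequence number to deliver
    delivered = []
    max_buffered = 0
    for seq, msg in arrivals:
        buffer[seq] = msg                          # STORE
        max_buffered = max(max_buffered, len(buffer))
        while k in buffer:                         # DELIVER
            delivered.append(buffer.pop(k))
            k += 1
    return delivered, max_buffered

# Sender P tags each message with its index i, as in "send m[i], i to Q".
sent = [(i, f"m{i}") for i in range(8)]
arrivals = sent[:]
random.Random(1).shuffle(arrivals)   # assumed model of arbitrary reordering
delivered, max_buffered = resequence(arrivals)
print(delivered)
print("peak buffer occupancy:", max_buffered)
```

Whatever the arrival order, the delivery order matches the sending order. The peak buffer occupancy depends on how badly the channel reorders, which is why an arbitrary delay forces an unbounded buffer.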

Observations
Now solve the same problem on a model where
a) the propagation delay has a known upper bound of T,
b) the messages are sent out at a rate of r per unit time, and
c) the messages are received at a rate faster than r.
The buffer requirement drops to r·T. (Lesson) A stronger model helps.
Question. Can we solve the problem using bounded buffer space if the propagation delay is arbitrarily large?
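One way to see where the r·T bound comes from is sketched below. The symbols t_k (the sending time of message m_k) and B (the number of buffered messages) are introduced here for illustration and do not appear in the slides. The argument also assumes the sender emits messages in index order.

```latex
% m_k is sent at time t_k and arrives no later than t_k + T.
% While Q waits for m_k, every buffered message has index > k, so it was
% sent after t_k, and it has already arrived, so it was sent before t_k + T.
\[
  B \;\le\; \#\{\text{messages sent in } (t_k,\; t_k + T)\} \;\le\; r \cdot T .
\]
```

Condition (c) matters because the receiver must drain deliverable messages faster than they arrive; otherwise delivered-but-unprocessed messages would add to the count.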

Example
[Figure: a 1-second window between the sender and the receiver, with the first message and the last message of the window marked.]

Implementing shared memory using message passing
{Read X by process i}: read x[i]
{Write X := v by process i}
- x[i] := v;
- Atomically broadcast v to every other process j (j ≠ i);
- After receiving the broadcast, process j (j ≠ i) sets x[j] to v.
Understand the significance of atomic operations. Atomic broadcast is far from trivial, but it is very important in distributed systems. Atomic = all or nothing.
[Figure: (a) shared memory X accessed by processes 0, 1, 2, 3; (b) implementation in which each process keeps a local copy x[0], x[1], x[2], x[3].]
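The read and write rules can be sketched in Python as follows. This is a minimal illustration under a strong assumption: the loop inside `write` stands in for an atomic broadcast and simply cannot fail halfway, which a real message-passing system does not give for free. The names `Process` and `write` are hypothetical.

```python
class Process:
    """One process holding its local copy x[i] of the shared variable X."""

    def __init__(self, pid):
        self.pid = pid
        self.x = None

    def read(self):
        # Read X by process i: a purely local read of x[i].
        return self.x


def write(processes, writer, v):
    """Write X := v by process `writer`.

    ASSUMPTION: this loop models an atomic broadcast; either every copy is
    updated or none is. Here that holds only because nothing can crash.
    """
    processes[writer].x = v                 # x[i] := v
    for p in processes:
        if p.pid != writer:
            p.x = v                         # process j sets x[j] to v


processes = [Process(i) for i in range(4)]
write(processes, writer=2, v=7)
print([p.read() for p in processes])
```

After the write, every process reads the same value from its own copy, which is what makes the local copies behave like a single shared variable.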

Non-atomic to atomic broadcast
Atomic broadcast = either everybody or nobody receives.
{process i is the sender}
  for j = 1 to N-1 (j ≠ i) send message m to neighbor[j]   (Easy!)
Now include crash failure as a part of our model. What if the sender crashes in the middle? How can atomic broadcast be implemented in the presence of crashes?
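The problem with the simple loop once crashes are allowed can be shown with a short Python sketch. The function names and the `crash_after` parameter are hypothetical. The sketch only demonstrates the failure and does not implement an atomic broadcast.

```python
def naive_broadcast(sender, n, message, crash_after=None):
    """Send `message` from `sender` to the other n-1 processes, one by one.

    crash_after (hypothetical parameter): number of sends completed before
    the sender crashes; None means the sender does not crash.
    Returns a dict mapping each receiving process id to the message.
    """
    received = {}
    sent = 0
    for j in range(n):
        if j == sender:
            continue
        if crash_after is not None and sent == crash_after:
            break                      # sender crashes in the middle
        received[j] = message
        sent += 1
    return received


def is_atomic(received, n):
    """All-or-nothing check: either all n-1 receivers got it, or none did."""
    return len(received) in (0, n - 1)


ok = naive_broadcast(sender=0, n=4, message="m")
partial = naive_broadcast(sender=0, n=4, message="m", crash_after=2)
print(ok, is_atomic(ok, 4))
print(partial, is_atomic(partial, 4))
```

Without a crash the loop is trivially all-or-nothing. With a crash after two sends, some processes hold the message and others do not, which is exactly the outcome atomic broadcast must rule out.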

Complexity Measures

Common measures
- Space complexity: How much space is needed per process to run an algorithm? (Measured in terms of N, the size of the network.)
- Time complexity: What is the maximum time (number of steps) needed to complete the execution of the algorithm?
- Message complexity: How many messages are needed to complete the execution of the algorithm?

An example
Consider broadcasting in an n-cube (here n = 3).
N = total number of processes = 2^n = 8.
Each process j > 0 has a variable x[j], whose initial value is arbitrary.
[Figure: a 3-cube with the source process marked.]

Broadcasting using messages
{Process 0}
  m.value := x[0];
  send m to all neighbors
{Process i > 0}
  repeat
    receive m {m contains the value};
    if m is received for the first time
      then x[i] := m.value;
           send x[i] to each neighbor j > i
      else discard m
    end if
  forever
What are (1) the message and time complexities, and (2) the space complexity per process?
Answers: (1) (1/2) N log2 N messages and log2 N time; (2) log2 N.
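The message and time counts can be checked by simulating the algorithm on an n-cube. The Python sketch below rests on assumptions that the slides do not state: process ids are the n-bit labels of the cube, two processes are neighbors when their labels differ in one bit, and every message takes exactly one time unit. The function name `hypercube_broadcast` is hypothetical.

```python
from collections import deque

def hypercube_broadcast(n, value):
    """Simulate the broadcast on an n-cube with N = 2**n processes.

    Returns (x, messages, time): the final values, the number of messages
    sent, and the time at which the last process first receives the value.
    """
    N = 1 << n
    x = [None] * N
    arrival = [None] * N
    x[0], arrival[0] = value, 0
    messages = 0
    queue = deque()                       # (receiver, value, arrival time)
    for b in range(n):                    # process 0 sends to all neighbors
        queue.append((1 << b, value, 1))
        messages += 1
    while queue:
        i, v, t = queue.popleft()
        if arrival[i] is not None:
            continue                      # not the first copy: discard m
        x[i], arrival[i] = v, t
        for b in range(n):
            j = i ^ (1 << b)              # neighbor across dimension b
            if j > i:                     # forward only to neighbors j > i
                queue.append((j, x[i], t + 1))
                messages += 1
    return x, messages, max(arrival)

x, messages, time = hypercube_broadcast(3, 42)
print(messages, time)
```

In this simulation each edge of the cube carries one message, from its smaller endpoint to its larger one, so the count equals the number of edges, (1/2) N log2 N. The last process to receive the value is the one whose label is all ones, after log2 N time units.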

Broadcasting using shared memory
{Process 0} x[0] := v
{Process i > 0}
  repeat
    if ∃ neighbor j < i : x[i] ≠ x[j]
      then x[i] := x[j] {PULL DATA} {this is a step}
      else skip
    end if
  forever
What is the time complexity (i.e., how many steps are needed)? Arbitrarily large!

Broadcasting using shared memory
Now use large atomicity, where in one step a process j reads the states of ALL neighbors with smaller id, and updates x[j] only when these are equal to one another but different from x[j]. What is the time complexity? How many steps are needed?

Time complexity in rounds
Rounds are truly defined for synchronous systems. An asynchronous round consists of a number of steps in which every process (including the slowest one) takes at least one step. How many rounds will you need to complete the broadcast using the large atomicity model? An easier concept is that of synchronous processes executing their steps in lock-step synchrony.
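To explore the question about rounds, the large-atomicity rule can be simulated in lock-step on an n-cube. The Python sketch below makes several assumptions: process ids are the n-bit cube labels, x[0] already holds v, and in each round every process reads the values left by the previous round. The function name `rounds_to_broadcast` is hypothetical, and the result is what this simulation gives, not an answer stated in the slides.

```python
def rounds_to_broadcast(n, v, initial):
    """Count lock-step rounds until every x[j] equals v on an n-cube.

    `initial` supplies the arbitrary starting values of x[1..N-1].
    """
    N = 1 << n
    x = [v] + list(initial)
    rounds = 0
    while any(val != v for val in x):
        old = x[:]                        # all reads see the previous round
        for j in range(1, N):
            # States of ALL neighbors with smaller id, read in one step.
            smaller = [old[j ^ (1 << b)] for b in range(n)
                       if (j ^ (1 << b)) < j]
            # Update only if they agree with each other and differ from x[j].
            if len(set(smaller)) == 1 and smaller[0] != old[j]:
                x[j] = smaller[0]
        rounds += 1
    return rounds

print(rounds_to_broadcast(3, 99, range(1, 8)))
```

In this simulation, a process whose label has k one-bits is guaranteed to hold v after round k, because by then all its smaller-id neighbors hold v. The broadcast therefore finishes within n = log2 N rounds, in contrast to the arbitrarily large step count of the small-atomicity version.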

Graph Algorithms vs Distributed Algorithms

Graph Algorithms
Why graph algorithms? Many problems in DS can be modeled as graph problems. Note that
- The topology of a distributed system is a graph.
- Routing table computation uses the shortest path algorithm.
- Efficient broadcasting uses a spanning tree.
- The maxflow algorithm determines the maximum flow between a pair of nodes in a graph, etc.
- Reuse of frequencies in wireless networks (no interference) uses vertex coloring.