11/7/2011. Networked embedded systems. The Vision for WSANs. Embedded systems


Networked embedded systems
  Principles of distributed computing for design of scalable and robust sensor actuator networks
  Vinod Kulathumani
  Dept. of Computer Science and Electrical Engineering, West Virginia University
  Networked Embedded Systems Laboratory

Currently
  Embedded processors are part of a larger system
  Application known a priori
  Little flexibility in programming

What if?
  Embedded processors were connected, preferably wirelessly?
  There was greater flexibility in programming?
  Sensing and actuation capabilities were included?
  That's the vision for Sensor Actuator Networks: fundamentally a network of embedded systems

Embedded systems
  Found in a variety of devices
    Aircraft, radar systems, nuclear and chemical plants
    Vehicles, TVs, camcorders, elevators
  > 90% of CPUs used for embedded devices

The Vision for WSANs
  Combine wireless networks with sensing / actuation
  Ubiquitous computing / pervasive computing
  Fine-grained monitoring and control of environment
  Network and interact with billions of embedded computers
  Reasons
    Wireless communication: no need for infrastructure setup
    Drop and play
    Nodes are built using cheap off-the-shelf components
    Feasible to deploy nodes densely

Enabling technology
  Powerful microprocessors
  Small form factor
  Low energy consumption
  Micro-sensors (MEMS, materials, circuits)
    Acceleration, vibration, gyroscope, tilt, motion
    Magnetic, heat, pressure, temperature, light, moisture, humidity, barometric
    Chemical (CO, CO2, radon), biological, micro-radar
  Actuators (mirrors, motors, smart surfaces, micro-robots)
  Communication: short range, low bit-rate, CMOS radios

Emerging applications
  Combination of sensors with mobile devices
    Social networking
    Participatory urban sensing
  Assisted living, health monitoring
  Vehicular networks with a variety of sensors

Application category: monitoring type
  Environmental monitoring
  Object tracking
  Infrastructure monitoring
  Body sensor networks
  Perimeter security
  Camera sensor networks

Challenges in monitoring-based sensor networks
  Energy constraint: nodes are battery powered
  Unreliable communication: wireless, limited bandwidth
  Unreliable sensors: false positives, false negatives
  Ad hoc deployment: pre-configuration inapplicable
  Large scale networks: algorithms should scale well
  Distributed execution: difficult to debug and get right
  Ease of use: not all scientists are programmers

Sensor networks for control applications
  Not simply monitoring events, objects
    Combined with actuation
  Traditional control applications
    Decouple information availability
    Control assumes information is instantaneously available
  What if information is transmitted over a sensor network?
    Losses, delays in information
  New tools needed for programming and reasoning about such systems
  Building blocks for cyber-physical systems (a recent buzzword!)
  Note
    Applying control theory to network systems has existed before (example: TCP congestion control)
    This is control systems designed on top of networks

Example sensor actuator networks
  Robotic systems
  Self-configuring structures
  Robotic surgery
  Self-configuring table
  Autonomic vehicular platoons
  Use in UAV swarms
  Autonomous driving (Google Car!)
  Distributed vibration control
  Distributed illumination control, irrigation, process control
  Smart power grid

We saw all these challenges for sensor networks
  Energy constraint: nodes are battery powered
  Unreliable communication: wireless, limited bandwidth, bursty traffic
  Unreliable sensors: false positives, false negatives
  Ad hoc deployment: pre-configuration inapplicable
  Large scale networks: algorithms should scale well
  Distributed execution: difficult to debug and get right
  Ease of use: not all scientists are programmers

Add to these...
  A control application that sits on top
  Requires information guarantees from the network below!

1. Fault-tolerant, self-stabilizing network services

Role of middleware
  Application
    Specifications: control error, convergence time
    Scale: node systems
  Middleware
    Distributed computing services that bridge the gap
  Network
    Broadcast nature: interference, collisions
    Resource-constrained

Why self-stabilization
  Faults will happen
    Messages will be lost
    Nodes may fail
    Variables may be corrupted
    Nodes restarting in arbitrary state
    Low on battery: arbitrary behavior!
    Mobility: nodes move around
  Once a fault stops, the system should recover
    Return to a good state
    And stay in good states
  Common examples
    Discovering alternate routes
    Electing alternate aggregating nodes

Why self-stabilization (note)
  This is different from masking fault-tolerance
  In masking, up to a certain number of faults can be handled without falling out of the correct state
    Example: always have redundant routes
    More expensive

Reasoning
  Fault model
    A set of faults that can lead to the program moving out of invariant states
    Example: node faults, network faults
  Self-stabilization
    Show that, irrespective of the initial condition, the protocol converges to the invariant
    If no more faults occur, the protocol stays within the invariant
  Figure labels: invariant states, faulty states, fixed point

Reasoning about self-stabilization
  Invariant
    A state predicate that continues to hold when the actions are executed in any process in any order
  Proving safety
    Find an invariant that satisfies correctness and show that it holds for the protocol
  Fixed point
    A predicate that belongs to the invariant and is a terminating condition [no more actions are enabled]
  Proving progress
    Show that the program eventually terminates
    Sometimes shown using a variant function

Fault-local self-stabilization
  A self-stabilizing program is fault-local self-stabilizing if the time and number of messages are bounded by the perturbation size
  Not dependent on the network size
  Biological analogy: blood clots around a wound
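The reasoning above (converge to the invariant from any state, then stay at a fixed point) can be illustrated with a small sketch. The protocol below is not from the slides: it is a hop-count-to-root rule chosen only as an example, and the names `stabilize`, `neighbors`, `root` and `dist` are hypothetical. The invariant is "every `dist[j]` equals the true hop distance to the root", and the fixed point is reached when no node's rule changes its variable.

```python
def stabilize(neighbors, root, dist, max_rounds=100):
    """Apply the local rule at every node until no action is enabled.

    neighbors: node -> list of adjacent nodes (assumed connected graph)
    dist:      node -> current, possibly corrupted, distance estimate
    Returns the number of rounds in which some variable changed.
    """
    n = len(neighbors)
    for rounds in range(max_rounds):
        changed = False
        for j in neighbors:
            if j == root:
                want = 0
            else:
                # One more than the best neighbor, capped at n so that
                # corrupted low values cannot count up forever.
                best = min((dist[k] for k in neighbors[j]), default=n)
                want = min(best + 1, n)
            if dist[j] != want:
                dist[j] = want
                changed = True
        if not changed:  # fixed point: no action enabled anywhere
            return rounds
    raise RuntimeError("did not converge within max_rounds")


# Line network 0 - 1 - 2 - 3 started in an arbitrary (faulty) state.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
estimates = {0: 7, 1: 0, 2: 0, 3: 5}
stabilize(line, 0, estimates)
```

After the call, `estimates` holds the true hop distances, and corrupting any single entry followed by another call restores them. Calling `stabilize` on an already correct state changes nothing, which is the "stays within the invariant" half of the argument.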

Simple example: leader election
  One hop network: all nodes within hearing range of each other
  Two hop network: diameter = 2
    At least one node common to the range of any two nodes

Self-stabilizing leader election
  Given a set of motes within a 2-hop neighborhood, write a distributed program which ensures that a unique leader is appointed.
  The program should stabilize when nodes (including the leader) are added or removed.
  Programs that use node ids are not recommended, as they will re-elect a leader every time a new node is added.

Leader election protocol [2 hops]
  Process j
  Variables:
    status [one of idle, candidate, leader, follower]
    cluster_id [id of leader]
  Actions:
    Timeout I (j.idle) -> j.status = candidate; bcast[cand_msg(j)]
    Timeout II ((j.follower or j.leader) ^ recv[cand_msg(i)]) -> bcast[conflict_msg(j, j.cluster_id)]
    Timeout III (j.candidate) -> j.status = leader; j.cluster_id = j; bcast[leader_msg(j)]
    (j.idle or j.candidate) ^ recv[conflict_msg(i, m)] -> j.status = follower; j.cluster_id = m
    (j.idle or j.candidate) ^ recv[leader_msg(i)] -> j.status = follower; j.cluster_id = i

Solution strategy
  Exploit the atomic broadcast property of the wireless network
    Simultaneous reception if successful
  Randomization and CSMA
  4 states at each node
    Idle, candidate, leader, follower
  Discussion

Example 2: Solid-disc clustering
  Why clustering?
    Act as information aggregators
    Act as distributed controllers
  Solid-disc clustering
    All nodes within a unit distance of the clusterhead belong only to that cluster
    All clusters have a non-overlapping unit radius solid-disc
  Why?
    Reduces intra-cluster signal contention
      Clusterhead is shielded on all sides by members and does not have to endure overhearing nodes from other clusters
    Yields better spatial coverage with clusters
      Aggregation at the clusterhead is more meaningful since it is the median of the cluster
    Results in a guaranteed upper bound on the number of clusters
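The leader election actions listed above can be exercised with a small event-driven simulation. This is a minimal sketch under simplifying assumptions that are not in the slides: a single-hop network, loss-free atomic broadcast, Timeout II handled immediately on reception, and a fixed candidate wait. The function name `run` and its parameters are hypothetical.

```python
import heapq
import random

IDLE, CANDIDATE, LEADER, FOLLOWER = "idle", "candidate", "leader", "follower"


def run(status, cluster_id, seed=0, candidate_wait=1.0):
    """Run the protocol until no timeout is pending; mutates both dicts."""
    rng = random.Random(seed)
    nodes = list(status)
    # Timeout I fires at a random time per idle node (the randomization step).
    events = [(rng.uniform(0.0, 10.0), j, "timeout_I")
              for j in nodes if status[j] == IDLE]
    heapq.heapify(events)

    def bcast(sender, kind, payload):
        # Assumed atomic broadcast: all other nodes receive the message.
        for k in nodes:
            if k == sender:
                continue
            if kind == "cand_msg" and status[k] in (FOLLOWER, LEADER):
                # Timeout II action: report the existing cluster.
                bcast(k, "conflict_msg", cluster_id[k])
            elif kind in ("conflict_msg", "leader_msg") and status[k] in (IDLE, CANDIDATE):
                status[k] = FOLLOWER
                cluster_id[k] = payload

    while events:
        now, j, kind = heapq.heappop(events)
        if kind == "timeout_I" and status[j] == IDLE:
            status[j] = CANDIDATE
            bcast(j, "cand_msg", j)
            heapq.heappush(events, (now + candidate_wait, j, "timeout_III"))
        elif kind == "timeout_III" and status[j] == CANDIDATE:
            status[j] = LEADER
            cluster_id[j] = j
            bcast(j, "leader_msg", j)
    return status, cluster_id


# Five motes start idle; afterwards a new mote joins the same neighborhood.
state = {j: IDLE for j in range(5)}
cid = {j: None for j in range(5)}
run(state, cid, seed=1)
state[99], cid[99] = IDLE, None
run(state, cid, seed=2)
```

In this sketch the late joiner announces candidacy, hears a conflict message carrying the existing cluster id, and becomes a follower, so the leader is not re-elected when a node is added. Node ids appear only as message payloads and heap tie-breakers, not as an election criterion.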

Requirements
  Solid-disc property
  Self-stabilizing to addition and deletion of nodes
    Handles node mobility
  No global propagation of re-clustering: local stabilization
    Not dependent on network size
  Form clusters in O(1) time
    Not dependent on network size
  No assumptions of a starting node or a starting configuration
    Program converges from any state

How to handle?
  Relax clustering requirements
    A unique node is designated as the leader of each cluster
    All nodes in the i-band of each leader belong to that cluster
    Nodes within the o-band (= m * i-band radius) may belong to a cluster
    Each node belongs to a cluster
    No node belongs to multiple clusters

Challenges for solid-disc clustering
  Figure labels: new node, subsumed, cascading; A, B, new node
  Equi-radius solid-disc clustering with bounded overlaps is not achievable in a distributed and local manner

Justification for stretch factor > 2
  Figure label: new cluster
  For m ≥ 2 local healing is achieved: a new node is either subsumed by one of the existing clusters, or allowed to form its own cluster without disturbing neighboring clusters
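The relaxed requirement above distinguishes an inner band, where membership is mandatory, from an outer band, where it is optional. A minimal geometric sketch of that classification follows. The function name `band`, the 2-D coordinates, the default radius of 1.0, and the inclusive boundaries are all assumptions made for illustration, not details from the slides.

```python
import math


def band(node, leader, i_radius=1.0, m=2.0):
    """Classify a node's position relative to a cluster leader.

    node, leader: (x, y) coordinates (assumed planar deployment)
    i_radius:     radius of the i-band (the solid disc)
    m:            stretch factor; the o-band radius is m * i_radius
    """
    d = math.dist(node, leader)
    if d <= i_radius:
        return "i-band"   # must belong to this leader's cluster
    if d <= m * i_radius:
        return "o-band"   # may belong to this leader's cluster
    return "outside"      # cannot belong to this leader's cluster


print(band((0.5, 0.0), (0.0, 0.0)))
print(band((1.5, 0.0), (0.0, 0.0)))
print(band((3.0, 0.0), (0.0, 0.0)))
```

A node in the o-band of one leader can still fall in the i-band of another, in which case the requirements above place it in the second cluster.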

Key idea
  Nodes wait for a random time and announce candidacy
  Candidate nodes receive a conflict notification if another clusterhead exists in the solid-disc
  Else they lock all nodes in the solid-disc into their cluster
  Program shown to locally stabilize from arbitrary states
    Even if the solid-disc cluster property is violated for any reason
  Stabilization in O(1) time

Sample clustering with FLOC

2. Consistency of distributed control

Problem 1: serializability
  Figure labels: Actuator (camera, heat source, light source etc.); controllers C1, C2; Sensor (heat, light, camera)
  Sensor / actuator pairs distributed across an area
  Controllers C1 and C2 have overlapping sets of actuators
  Can cause read/write conflicts during concurrent transactions
  Inconsistency example: C1 reads current state; C2 reads current state; C1 updates; C2 updates
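The inconsistency example above is the classic lost update. A minimal sketch, assuming the shared actuator state is a single number (say, a light level) and each controller wants to apply an increment computed from the value it read; the function names and the numbers are illustrative only.

```python
def interleaved(level, delta1, delta2):
    """C1 reads; C2 reads; C1 updates; C2 updates."""
    r1 = level            # C1 reads current state
    r2 = level            # C2 reads current state
    level = r1 + delta1   # C1 updates
    level = r2 + delta2   # C2 updates from its stale read, erasing C1's change
    return level


def serial(level, delta1, delta2):
    """C1's whole transaction, then C2's whole transaction."""
    level = level + delta1
    level = level + delta2
    return level


print(interleaved(10, 5, 3))  # 13: C1's update is lost
print(serial(10, 5, 3))       # 18: both updates take effect
```

A schedule is serializable when its outcome matches some serial order; the interleaved schedule here matches neither C1-then-C2 nor C2-then-C1.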

Solution: atomicity enforcement
  Distributed mutual exclusion [Ricart-Agrawala]
    Controllers first acquire locks from other controllers with shared actuators
    Requests ordered by timestamps [logical clocks] plus priorities for ties
    Proceed only if all locks received
    Ordering ensures progress and fairness, and prevents deadlocks
  Issues
    Synchronization may not be possible
    Strict ordering will reduce allowable parallelization
  Figure: arrows indicate shared resources
    1, 4, 5 can go in parallel
    But 4, 5 may be blocked by 2, 3 respectively

Challenges in wireless networks
  Well known that message reliability is a requirement for consensus
    Recall the attacking generals problem
  But wireless messages are likely to be lost upon interference and occasionally by link failures
  May assume eventual delivery of messages
  But what commit strategies are most efficient?
    2PC with timeout, or 3PC
    Timeout commit or timeout abort
    Use explicit acks (interference prone) or only negative acks

Problem 2: Atomic commit
  Controller initiates updates to actuator set A
  Ensure atomicity
    All commit
    Or all revert [if committed, cannot revert]
  2-phase commit
    Controller requests
    Actuators acknowledge
    If all acks received, commit confirmed: send commit message
    Issues: coordinator dies before issuing commit; some receive the commit message, others don't, and the controller fails
  3-phase commit
    Include a prepare-to-commit phase with a timeout recovery
    If no commit received revert, else commit
    If any one node commits, it can force others to commit even if the controller fails

Here is one strategy
  Figure labels: C1, C2
  C1 broadcasts requests, atomically heard by all neighbors
  Shared actuators detect and report conflict when C2 requests
  At least one controller makes progress
  Abort is time triggered at all nodes
  If the controller receives no conflict, it sends commit to all nodes
  If no commit received, actuators send inquiry
  If even one actuator commits, it can assist others in committing
  Optimistic protocol for simultaneously enforcing serializability and atomicity!
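The 2-phase commit rule described above (commit only if every ack arrives, otherwise all revert) can be sketched as a single decision function. This is a deliberately small model with hypothetical names: lost acks are represented by leaving an actuator out of `acked`, and the failure cases listed under "Issues" (coordinator death, partially delivered commit messages) are not modeled.

```python
def two_phase_commit(state, update, acked):
    """Decide one commit round.

    state:  actuator -> current value
    update: actuator -> value proposed by the controller (phase 1 request)
    acked:  set of actuators whose acknowledgement reached the controller
    Returns (decision, resulting state of all actuators).
    """
    if all(a in acked for a in update):
        # Phase 2: every ack arrived, so the update is applied everywhere.
        new_state = dict(state)
        new_state.update(update)
        return "commit", new_state
    # At least one ack is missing: nobody applies the update.
    return "abort", dict(state)


lights = {"a1": 0, "a2": 0}
wanted = {"a1": 1, "a2": 1}
print(two_phase_commit(lights, wanted, {"a1", "a2"}))
print(two_phase_commit(lights, wanted, {"a1"}))
```

Either every actuator in the update set changes or none does, which is the atomicity property; what the sketch cannot show is the blocking that occurs when the controller fails between the two phases.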

Research questions
  Acks or nacks
    Acks have an interference issue
    With nacks
      Is there really no conflict?
      Or was the message not delivered?
  Upon timeout: commit or abort
    Commit saves a message
    But if commit was wrong, abort is impossible
  How to choose timeout?
  Impact in multi-hop networks
    Prolonged proposal and rejection phase?

Conclusions
  Sensor actuator networks
    Large scale sensing combined with actuation
    Building blocks for cyber-physical systems
    Can form ubiquitous intelligent systems
  Key distributed computing principles for sensor actuator networks
    Self-stabilizing, robust network programs
      Local self-healing
    Conflict serializability
      Distributed mutual exclusion
    Atomicity enforcement
  Exploit locality in application design
  Joint design of control and network
    Control system designed with network capabilities in mind
    Not greedy demands

