DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 1 Introduction Modified by: Dr. Ramzi Saifan
Definition of a Distributed System (1) A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system. Autonomous computing elements, also referred to as nodes, can be either hardware devices or software processes. Single coherent system: users or applications perceive a single system, which means the nodes need to collaborate.
Collection of autonomous nodes Independent behavior: each node is autonomous and will thus have its own notion of time: there is no global clock. This leads to fundamental synchronization and coordination problems. Collection of nodes: how to manage group membership? How to know that you are indeed communicating with an authorized (non)member?
Overlay network Organization: each node in the collection communicates only with other nodes in the system, its neighbors. The set of neighbors may be dynamic, or may even be known only implicitly (i.e., requires a lookup). Overlay types: Structured: each node has a well-defined set of neighbors with whom it can communicate (tree, ring). Unstructured: each node has references to randomly selected other nodes from the system. Well-known example of overlay networks: peer-to-peer systems.
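A minimal sketch (not from the book) contrasting the two overlay types: in a structured ring overlay a node's neighbors follow from the topology, while in an unstructured overlay they are a random sample. All names and node IDs below are illustrative.

import random

def ring_neighbors(node_id, all_ids):
    """Structured overlay: neighbors are determined by the topology (here, a ring)."""
    ids = sorted(all_ids)
    i = ids.index(node_id)
    return [ids[(i - 1) % len(ids)], ids[(i + 1) % len(ids)]]

def random_neighbors(node_id, all_ids, k=3):
    """Unstructured overlay: neighbors are randomly selected other nodes."""
    others = [n for n in all_ids if n != node_id]
    return random.sample(others, min(k, len(others)))

nodes = [5, 12, 27, 40, 63]
print(ring_neighbors(27, nodes))    # deterministic: [12, 40]
print(random_neighbors(27, nodes))  # changes from run to run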
Essence Coherent system: the collection of nodes as a whole operates the same, no matter where, when, and how interaction between a user and the system takes place. Examples: An end user cannot tell where a computation is taking place. Where data is exactly stored should be irrelevant to an application. Whether or not data has been replicated is completely hidden. The keyword is distribution transparency. The snag: partial failures. It is inevitable that at any time only a part of the distributed system fails. Hiding partial failures and their recovery is often very difficult and in general impossible.
Middleware: OS of Distributed Systems The middleware layer extends over multiple machines, and offers each application the same interface. What does it contain? Commonly used components and functions that need not be implemented by applications separately.
What do we want to achieve? 1. Sharing resources 2. Transparency 3. Openness 4. Scalability
Transparency in a Distributed System Figure 1-2. Different forms of transparency in a distributed system (ISO, 1995).
Degree of transparency Aiming at full distribution transparency may be too much: There are communication latencies that cannot be hidden. Completely hiding failures of networks and nodes is (theoretically and practically) impossible: You cannot distinguish a slow computer from a failing one. You can never be sure that a server actually performed an operation before a crash. Full transparency costs performance, exposing the distribution of the system: Keeping replicas exactly up to date with the master takes time. Immediately flushing write operations to disk for fault tolerance takes time. Conclusion: distribution transparency is a nice goal, but achieving it is a different story, and it should often not even be aimed for.
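A hedged illustration of the point above: from the client's side, a timeout looks the same whether the server is slow, overloaded, or crashed, and the client cannot know whether the operation was performed. The host, port, and request below are hypothetical placeholders.

import socket

def call_server(host, port, request, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(request)
            s.settimeout(timeout)
            return s.recv(4096)          # may time out
    except socket.timeout:
        # Slow server or failed server? Did it perform the operation before
        # the timeout? The client cannot tell from here.
        return None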
Openness of distributed systems Open distributed system: Be able to interact with services from other open systems, irrespective of the underlying environment: systems should conform to well-defined interfaces; systems should support portability of applications; systems should easily interoperate; systems should be extensible. Achieving openness: At least make the distributed system independent from heterogeneity of the underlying environment: hardware, platforms, languages.
Degrees of scalability 1. Size 2. Geographic 3. Administrative
Scalability Problems Figure 1-3. Examples of scalability limitations.
Characteristics of decentralized algorithms: No machine has complete information about the system state. Machines make decisions based only on local information. Failure of one machine does not ruin the algorithm. There is no implicit assumption that a global clock exists.
Problems with geographical scalability Cannot simply go from LAN to WAN: many distributed systems assume synchronous client-server interactions: the client sends a request and waits for an answer. Latency may easily prohibit this scheme. WAN links are often inherently unreliable: simply moving streaming video from LAN to WAN is bound to fail. Lack of multipoint communication, so that a simple search broadcast cannot be deployed. The solution is to develop separate naming and directory services (which have their own scalability problems).
Problems with administrative scalability Conflicting policies concerning usage (and thus payment), management, and security. We do not trust other domains: when multiple administrative domains are involved, security becomes an issue. How can we trust the other system, how it manages our data, and who gets access to it? Example: multiple cloud organizations interacting to offer combined services to a client. Another example: sensor networks that collect data and send it to a central point.
Scaling techniques Hide communication latencies: Make use of asynchronous communication (e.g., interrupt-driven). Have a separate handler for incoming responses (multithreading). Problem: not every application fits this model (e.g., interactive applications). Partition data and computations across multiple machines: Move computations to clients (Java applets). Decentralized naming services (DNS). Decentralized information systems (WWW). Replication and caching: make copies of data available at different machines. Replicated file servers and databases. Web caches (in browsers and proxies). File caching (at server and client).
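A minimal sketch of the first technique, hiding communication latency by issuing requests asynchronously and handling responses in a separate handler thread. The remote_call function is a stand-in for a real network request.

import threading, queue, time

responses = queue.Queue()

def remote_call(request):
    time.sleep(1.0)                      # stand-in for network latency
    return f"result of {request}"

def issue_request(request):
    # Send the request from a background thread; the caller does not block.
    threading.Thread(target=lambda: responses.put(remote_call(request)),
                     daemon=True).start()

def response_handler():
    # Separate handler thread that processes responses as they arrive.
    while True:
        print("handled:", responses.get())

threading.Thread(target=response_handler, daemon=True).start()
issue_request("query-1")                 # returns immediately; caller keeps working
time.sleep(1.5)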
Scaling Techniques (1) Figure 1-4. The difference between letting (a) a server or (b) a client check forms as they are being filled.
Scaling Techniques (2) Figure 1-5. An example of dividing the DNS name space into zones.
Scaling Techniques (3) Replication: replicated file servers and databases, mirrored Web sites. Benefits: performance enhancement, fault tolerance, load balancing, geographic scalability. Caching is a form of replication. Problems of replication: the consistency versus cost tradeoff. Strong consistency needs global synchronization, and global synchronization may make replication impractical for large-scale systems.
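A sketch of the tradeoff just mentioned: a client-side cache with a time-to-live serves reads fast but may return stale data until an entry expires; strong consistency would instead require contacting the master on every read. The fetch_from_master callback and the TTL value are assumptions for illustration.

import time

class TTLCache:
    def __init__(self, fetch_from_master, ttl=5.0):
        self.fetch, self.ttl, self.entries = fetch_from_master, ttl, {}

    def get(self, key):
        value, stored_at = self.entries.get(key, (None, 0.0))
        if time.time() - stored_at > self.ttl:         # missing or expired
            value = self.fetch(key)                     # slow, but fresh
            self.entries[key] = (value, time.time())
        return value                                    # fast, possibly stale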
Pitfalls when Developing Distributed Systems False assumptions made by first time developer: The network is reliable. The network is secure. The network is homogeneous. The topology does not change. Latency is zero. Bandwidth is infinite. Transport cost is zero. There is one administrator.
Types of Distributed Systems 1. High performance distributed computing systems a) Cluster systems b) Grid systems c) Cloud Computing 2. Distributed information systems a) Transaction processing b) Integrated enterprise applications 3. Distributed pervasive systems a) Electronic health care systems b) Sensor networks
Parallel computing High-performance distributed computing started with parallel computing: Multiprocessor: multiple CPUs sharing a common memory. Multicomputer: each machine has its own memory; processors communicate by message passing. Distributed shared memory: a multicomputer that emulates shared memory in software.
Cluster Computing Systems Figure 1-6. An example of a cluster computing system.
Grid Computing Systems Figure 1-7. A layered architecture for grid computing systems.
Cloud Computing
Is cloud computing cost-effective? Cloud computing outsources the entire infrastructure, both hardware and software, and offers more than just high-performance computing. But is outsourcing also cheaper?
Is cloud computing cost-effective? Answering this requires an objective function that captures both costs and benefits, together with a set of constraints.
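A toy version of such an objective, with entirely made-up numbers: owning servers means provisioning for peak demand, while renting cloud capacity means paying only for average use. All prices and the demand profile below are hypothetical.

def on_premises_cost(servers, price_per_server=3000, yearly_ops=800):
    # Total yearly cost when owning enough servers for peak demand.
    return servers * (price_per_server + yearly_ops)

def cloud_cost(avg_instances, hours_per_year=8760, price_per_hour=0.10):
    # Total yearly cost when renting instances for the average load.
    return avg_instances * hours_per_year * price_per_hour

peak, average = 20, 6          # hypothetical demand profile
print(on_premises_cost(peak))  # must provision for the peak: 76000
print(cloud_cost(average))     # pay only for average use:    5256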
Transaction Processing Systems (1) Figure 1-8. Example primitives for transactions.
Transaction Processing Systems (2) Characteristic properties of transactions (ACID): Atomic: To the outside world, the transaction happens indivisibly. Consistent: The transaction does not violate system invariants. Isolated: Concurrent transactions do not interfere with each other. Durable: Once a transaction commits, the changes are permanent.
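A minimal local sketch (not distributed, and not the book's example) using Python's built-in sqlite3 to show the transaction primitives in action: either both updates commit as a whole, or the transaction is rolled back and neither is visible.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

try:
    with conn:  # BEGIN_TRANSACTION ... commit on success, rollback on error
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
except sqlite3.Error:
    pass  # on any error, neither update becomes visible (atomicity)

print(dict(conn.execute("SELECT name, balance FROM accounts")))
# {'alice': 70, 'bob': 80}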
Transaction Processing Systems (3) Figure 1-9. A nested transaction.
Transaction Processing Systems (4) Figure 1-10. The role of a TP monitor in distributed systems.
Enterprise Application Integration Middleware offers communication facilities for integration. Remote Procedure Call (RPC): a request is issued as a local procedure call, packaged as a message, processed remotely, answered with a reply message, and the result is returned as the call's return value. Message-Oriented Middleware (MOM): messages are sent to a logical contact point (published) and forwarded to subscribed applications.
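A sketch of the RPC pattern using Python's standard-library xmlrpc: the call proxy.add(3, 4) looks like a local procedure call, but the middleware packages it as a request message and returns the reply as the call's result. The host and port are arbitrary choices for the example.

from xmlrpc.server import SimpleXMLRPCServer
import xmlrpc.client, threading

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = xmlrpc.client.ServerProxy("http://localhost:8000")
print(proxy.add(3, 4))   # 7, yet it executed on the "remote" server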
How to integrate applications File transfer: technically simple, but not flexible: figure out file format and layout; figure out file management; handle update propagation and update notifications. Shared database: much more flexible, but still requires a common data schema, next to the risk of a bottleneck. Remote procedure call: effective when the execution of a series of actions is needed. Messaging: RPCs require caller and callee to be up and running at the same time; messaging allows decoupling in time and space.
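A toy in-process publish/subscribe broker to illustrate that decoupling: a message can be published before any subscriber is listening and delivered later. A real MOM product would also persist queued messages so publisher and subscriber need not run at the same time; everything here is a simplified sketch.

from collections import defaultdict, deque

class Broker:
    def __init__(self):
        self.queues = defaultdict(deque)       # one message queue per topic
        self.subscribers = defaultdict(list)   # callbacks per topic

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        self.queues[topic].append(message)     # queued even if nobody listens yet

    def deliver(self):
        for topic, q in self.queues.items():
            while q:
                msg = q.popleft()
                for cb in self.subscribers[topic]:
                    cb(msg)

broker = Broker()
broker.publish("orders", "new order")                  # sent before anyone subscribed
broker.subscribe("orders", lambda m: print("got", m))
broker.deliver()                                       # got new order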
Distributed Pervasive Systems Requirements for pervasive systems Embrace contextual changes: changes in the environment are always to be expected. Encourage ad hoc composition. Recognize sharing as the default. Three (overlapping) subtypes Ubiquitous computing systems: pervasive and continuously present, i.e., there is a continuous interaction between system and user. Mobile computing systems: pervasive, but the emphasis is on the fact that devices are inherently mobile. Sensor (and actuator) networks: pervasive, with emphasis on the actual (collaborative) sensing and actuation of the environment.
Sensor Networks (1) Questions concerning sensor networks: How do we (dynamically) set up an efficient tree in a sensor network? How does aggregation of results take place? Can it be controlled? What happens when network links fail?
Sensor Networks (2) Figure 1-13. Organizing a sensor network database, while storing and processing data (a) only at the operator's site or
Sensor Networks (3) Figure 1-13. Organizing a sensor network database, while storing and processing data or (b) only at the sensors.
Duty-cycled networks Many sensor networks need to operate on a strict energy budget: introduce duty cycles. A node is active during T_active time units, and then suspended for T_suspended units, to become active again. Duty cycle = T_active / (T_active + T_suspended). Typical duty cycles are 10-30%, but can also be lower than 1%.
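A quick worked example of the formula, with assumed timings (30 ms active out of every 3 s):

t_active, t_suspended = 0.030, 2.970          # seconds, assumed for illustration
duty_cycle = t_active / (t_active + t_suspended)
print(f"{duty_cycle:.1%}")                    # 1.0%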
Keeping duty-cycled networks in sync If duty cycles are low, sensor nodes may no longer wake up at the same time and become permanently disconnected: they are active during different, non-overlapping time slots. Solution: Each node A adopts a cluster ID C_A, being a number. Let a node send a join message during its suspended period. When A receives a join message from B and C_A < C_B, it sends a join message to its neighbors (in cluster C_A) before joining B. When C_A > C_B, it sends a join message to B during B's active period. Note: Once a join message reaches a whole cluster, merging two clusters is very fast. Merging means: re-adjust clocks.
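A highly simplified sketch of the merge rule, ignoring radio timing and the active/suspended schedule entirely: each node holds a cluster ID and a clock offset, the cluster with the smaller ID eventually adopts the larger ID, and joining a cluster means re-adjusting the clock. All class and variable names are illustrative, not the protocol's.

class Node:
    def __init__(self, name, cluster_id, clock_offset):
        self.name, self.cluster_id, self.clock_offset = name, cluster_id, clock_offset
        self.neighbors = []

    def send_join(self):
        for nb in self.neighbors:
            nb.receive_join(self.cluster_id, self.clock_offset)

    def receive_join(self, other_id, other_offset):
        if other_id > self.cluster_id:        # smaller cluster joins the larger one
            self.cluster_id = other_id
            self.clock_offset = other_offset  # merging means: re-adjust clocks
            self.send_join()                  # propagate through the old cluster

a, b, c = Node("a", 1, 0.0), Node("b", 1, 0.0), Node("c", 7, 0.4)
a.neighbors, b.neighbors, c.neighbors = [b, c], [a], [a]
c.send_join()
print([(n.name, n.cluster_id, n.clock_offset) for n in (a, b, c)])
# all three nodes end up in cluster 7 with the same clock offset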