Introduction Distributed Systems IT332
2 Outline
  Definition of a Distributed System
  Goals of Distributed Systems
  Types of Distributed Systems
3 Definition of a Distributed System
A distributed system is a collection of independent computers that appears to its users as a single coherent system.
Two aspects:
  (1) Hardware: independent computers
  (2) Software: users think they are dealing with a single system
4 Goals of Distributed Systems
Making Resources Accessible
  Problems of sharing?
Transparency
  To hide the fact that its processes and resources are physically distributed across multiple computers
Openness
  To offer services according to standard rules that describe the syntax and semantics of those services
Scalability
5 Transparency
  Transparency   Description
  Access         Hide differences in data representation and how a resource is accessed
  Location       Hide where a resource is located
  Migration      Hide that a resource may move to another location
  Relocation     Hide that a resource may be moved to another location while in use
  Replication    Hide that a resource is replicated
  Concurrency    Hide that a resource may be shared by several competing users
  Failure        Hide the failure and recovery of a resource
Aiming at full distribution transparency is not always a good idea, and it is difficult to achieve.
There is a trade-off between a high degree of transparency and the performance of the system.
6 Openness
An open distributed system is a system that is able to interact with services from other systems, irrespective of the underlying environment.
  Systems should conform to well-defined interfaces
  Systems should support portability of applications
  Systems should easily interoperate
  Systems should be easily extensible
7 Scalability
A distributed system is scalable if it remains effective when the number of resources and users is significantly increased.
At least three aspects:
  Number of users and/or processes (size scalability)
  Maximum distance between nodes (geographical scalability)
  Number of administrative domains (administrative scalability)
Most systems account only, to a certain extent, for size scalability by using powerful servers.
Today, the challenge lies in geographical and administrative scalability.
8 Scalability Techniques
Distribution: partition data and computations across multiple machines
  Move computations to clients (Java applets)
  Decentralized naming services (DNS)
Replication: make copies of data available at different machines
  Replicated file servers (mainly for fault tolerance)
  Replicated databases
  Mirrored Web sites
Caching: allow client processes to access local copies
  Web caches (browser/web proxy)
  File caching (at server and client)
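Two of these techniques can be sketched in a few lines. The example below is only an illustration under stated assumptions: the node names, the `node_for` helper, and the `CachingClient` class are hypothetical and do not come from the slides. It shows hash-based partitioning of keys across machines (distribution) and a client that keeps local copies of values it has already fetched (caching).

```python
import hashlib


def node_for(key: str, nodes: list[str]) -> str:
    """Distribution: pick the node responsible for a key by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class CachingClient:
    """Caching: keep a local copy of every value fetched from the server."""

    def __init__(self, fetch):
        self._fetch = fetch      # hypothetical function that contacts the server
        self._cache = {}         # local copies, keyed by resource name
        self.remote_calls = 0    # how often the server was actually contacted

    def get(self, key):
        if key not in self._cache:
            self.remote_calls += 1
            self._cache[key] = self._fetch(key)
        return self._cache[key]


NODES = ["node-a", "node-b", "node-c"]  # hypothetical machine names

# The same key always maps to the same node, so no central lookup is needed.
owner = node_for("index.html", NODES)

# The second request is served from the local copy, not from the server.
client = CachingClient(lambda key: f"value-of-{key}")
first = client.get("index.html")
second = client.get("index.html")
```

Simple modulo partitioning like this reassigns most keys when the number of nodes changes, so it is meant only to show the idea of spreading data across machines.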
9 Scaling Problem
Having multiple copies (cached or replicated) leads to inconsistencies: modifying one copy makes that copy different from the rest.
Keeping copies consistent in a general way requires global synchronization on each modification.
Global synchronization makes large-scale solutions practically impossible.
If we can tolerate inconsistencies, we may reduce the need for global synchronization.
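The trade-off can be made concrete with a toy model. This is a minimal sketch, assuming in-memory replicas; the `Replica` class and both write functions are hypothetical names used for illustration only. Writing to a single copy is cheap but leaves the other copies stale, while a synchronized write has to contact every copy on each modification.

```python
class Replica:
    """One copy of the data, held in memory for illustration."""

    def __init__(self):
        self.data = {}


def write_local(replica, key, value):
    """Update one copy only: cheap, but the other copies become stale."""
    replica.data[key] = value


def write_synchronized(replicas, key, value):
    """Update every copy on each modification.

    Returns the number of copies contacted, which grows with the
    number of replicas.
    """
    for replica in replicas:
        replica.data[key] = value
    return len(replicas)


replicas = [Replica() for _ in range(3)]

# Consistent write: all three copies are contacted.
contacted = write_synchronized(replicas, "x", 1)

# Unsynchronized write: only the first copy changes.
write_local(replicas[0], "x", 2)
values = [replica.data["x"] for replica in replicas]  # [2, 1, 1]
```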
10 Pitfalls When Developing Distributed Systems
False assumptions made by first-time developers:
  The network is reliable.
  The network is secure.
  The network is homogeneous.
  The topology does not change.
  Latency is zero.
  Bandwidth is infinite.
  Transport cost is zero.
  There is one administrator.
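As an illustration of the first assumption, code that treats the network as reliable simply makes the remote call once. One common way to handle failures, not prescribed by the slides, is to retry with a growing delay. The sketch below assumes a simulated network; `FlakyNetwork` and `call_with_retries` are hypothetical names, and the failure is raised artificially.

```python
import time


class FlakyNetwork:
    """Hypothetical stand-in for a remote call that fails the first few times."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.attempts = 0

    def call(self, request):
        self.attempts += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("simulated network failure")
        return f"reply to {request}"


def call_with_retries(call, request, max_attempts=3, base_delay=0.01):
    """Retry a failing call with exponential backoff.

    Re-raises the last ConnectionError if every attempt fails.
    """
    for attempt in range(max_attempts):
        try:
            return call(request)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)


network = FlakyNetwork(failures_before_success=2)
reply = call_with_retries(network.call, "ping")
```

Blind retries are only safe when repeating the request has no additional effect, so a real system also has to consider which operations can be repeated.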
11 Types of Distributed Systems
  Distributed Computing Systems
  Distributed Information Systems
  Distributed Pervasive Systems
12 Distributed Computing Systems
Used for high-performance computing tasks
Cluster computing systems
  A collection of similar workstations or PCs connected by a high-speed local area network (LAN)
  Each node runs the same OS
Grid computing systems
  A collection of machines connected over a wide area network
  Each machine may be in a different administrative domain, and may have different hardware, OS, and network technology
  Support virtual organizations (VOs)
  A VO defines a group of users/applications that have access to a specified group of resources, which may be distributed across many different computers, owned by many different organizations.
13 Cluster Computing An example of a cluster computing system.
14 Grid Computing
15 Distributed Information Systems
Used to integrate networked applications in an organization
Transaction processing systems
  Support distributed transactions
  A transaction contains operations such that either all of the operations are executed or none are executed
  A distributed transaction is a transaction that accesses objects managed by multiple servers
Enterprise application integration
  Let applications communicate directly with each other
  Types of communication middleware: remote procedure call (RPC), remote method invocation (RMI), message-oriented middleware (MOM)
16 Transaction Processing Systems
A transaction processing (TP) monitor allows an application to access multiple servers/databases.
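The all-or-nothing property of a transaction can be demonstrated on a single local database. This is a minimal sketch using Python's standard `sqlite3` module; the `accounts` table and the `transfer` function are hypothetical. It shows atomicity only: a distributed transaction would additionally need to coordinate several servers, which this example does not attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # rolled back as a whole
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, the credit to the destination account has been rolled back together with the rejected debit, so the balances are the same as after the first transfer.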
17 Enterprise Application Integration Middleware as a communication facilitator in enterprise application integration.
18 Distributed Pervasive Systems
Part of our surroundings
Nodes are often small, battery-powered, mobile devices with only a wireless connection
  Laptops, smartphones, digital cameras, etc.
Self-managing: no system administrator and no other human administrative control
Examples: smart homes, electronic health care systems, sensor networks
19 Health Care Systems
New devices are being developed to monitor the well-being of individuals and to automatically contact physicians when needed.
Personal health care systems are often equipped with various sensors organized in a (preferably wireless) body-area network (BAN).
A BAN should be able to operate while a person is moving, with no strings (i.e., wires) attached to immobile devices.
20 Health Care Systems (HCS)
21 Next Chapter: Architecture
Questions?