Distributed Systems
Joseph Spring
School of Computer Science
Distributed Systems and Security

Areas for Discussion
  Definitions
  Operating Systems Overview
  Challenges
  Heterogeneity
  Transparency
  Middleware
  Limitations
  Middleware and Transparency

Definitions
  Coulouris, Dollimore & Kindberg:
    A distributed system is one in which components located at networked computers communicate and coordinate their actions only by passing messages
  Tanenbaum & van Steen:
    A distributed system is a collection of independent computers that appears to its users as a single coherent system
  Silberschatz, Galvin & Gagne:
    A distributed system is a collection of loosely coupled processors interconnected by a communication network

Why Build Distributed Systems?
  Silberschatz, Galvin and Gagne give four reasons:
    Resource Sharing
    Computational Speedup
    Reliability
    Communication

Monolithic Operating Systems
  Early operating systems (mid to late 1950s) were generally designed with little concern about structure and little experience of building large software systems
    Communication?
    User documentation?
  Problems caused by mutual dependence and interaction were grossly underestimated
  Lack of structure became unsustainable as O/Ss grew to massive proportions

Layered Operating Systems
  Functions are organised hierarchically
  Interaction only takes place between adjacent layers
  Most or all of the layers execute in kernel mode
  Layers (top to bottom):
    Users
    File System
    Interprocess Communication
    I/O and Device Management
    Virtual Memory
    Primitive Process Management
    Hardware

Microkernel Based Operating Systems
  Popularised by its use in the MACH O/S
    Approach provides a high degree of flexibility and modularity
  Approach used in Windows NT
    Claims modularity and portability as key benefits
    Microkernel surrounded by a number of compact subsystems, easing the task of implementing NT on a variety of platforms

Microkernel Based Operating Systems
  Philosophy: only absolutely essential core operating system functions should be in the kernel
  Less essential services and applications are built on the microkernel and operate in user mode
  Characteristic here is that many services traditionally thought of as part of the O/S are now external subsystems that interact with the kernel and with each other:
    Device drivers, File systems, Virtual memory manager, Windowing system, Security services

Microkernel Based Operating Systems
  [Figure: Client Process, Device Drivers, File Server, Process Server and Virtual Memory sit side by side above the Microkernel, which sits above the Hardware]

Microkernel Benefits
  Uniform Interfaces
  Extensibility
  Flexibility
  Portability
  Reliability
  Distributed System Support
  Object-Orientated Operating System (OOOS)
  W. Stallings, Operating Systems, Prentice Hall

Microkernel Architecture
  Replaces the traditional vertical, layered, stratified O/S with a horizontal one
  O/S components external to the microkernel are implemented as server processes, interacting with each other on a peer basis by passing messages through the microkernel
  Hence the microkernel operates as a message exchange
  Microkernel:
    Validates messages
    Passes messages between components
    Grants access to H/W

Operating Systems Hand Outs
  Distributed Operating Systems
    Uniprocessor O/S
    Multiprocessor O/S
    Multicomputer O/S
    Distributed Shared Memory Systems
  Network Operating Systems
  Shared Memory v Message Passing Systems
  A. S. Tanenbaum & M. van Steen, Distributed Systems, Prentice Hall
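
The message-exchange role of the microkernel described above can be illustrated with a small sketch. This is a toy model only: the names `Microkernel`, `Message`, `register` and `send`, and the server name `file_server`, are assumptions made for illustration and do not correspond to MACH or Windows NT interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Message:
    sender: str
    receiver: str
    payload: str


class Microkernel:
    """Toy message exchange: validates messages and passes them between servers."""

    def __init__(self) -> None:
        # Hypothetical registry of server processes known to the kernel.
        self._servers: Dict[str, Callable[[Message], str]] = {}

    def register(self, name: str, handler: Callable[[Message], str]) -> None:
        self._servers[name] = handler

    def send(self, msg: Message) -> str:
        # Validate the message: both endpoints must be registered components.
        if msg.sender not in self._servers or msg.receiver not in self._servers:
            raise PermissionError(f"invalid message {msg.sender} -> {msg.receiver}")
        # Pass the message to the receiving server and hand back its reply.
        return self._servers[msg.receiver](msg)


kernel = Microkernel()
kernel.register("client", lambda m: "")
kernel.register("file_server", lambda m: f"contents of {m.payload}")

reply = kernel.send(Message("client", "file_server", "notes.txt"))
print(reply)
```

The client and the file server never call each other directly; every interaction goes through `send`, which is where validation takes place.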

Challenges
  Coulouris, Dollimore & Kindberg discuss the above goals (and others) in terms of challenges:
    Heterogeneity
    Heterogeneity and Mobile Code
    Openness
    Security
    Scalability
    Failure Handling
    Concurrency

Heterogeneity
  Variety and difference in the collection of computers and networks employed in, for example, the internet
  Applies to each of the following:
    Networks, H/W, O/Ss, Programming Languages, Implementations by different developers
  Networks
    Differences are masked on the internet since all computers attached to them use the internet protocols to communicate with each other
  H/W
    Data types may be represented in different ways by different H/W
      Little endian v big endian byte ordering of integers
    These differences must be resolved in order to communicate between different H/W

Heterogeneity
  O/Ss
    All have to supply an implementation of the internet protocols
    Do not necessarily provide the same Application Programming Interface to these protocols
      Calls for exchanging messages in UNIX are different from those in Windows
  Programming Languages
    Different programming languages use different representations for characters and data structures
      Arrays and records
    These differences need to be addressed if programs written in different languages are to communicate

Heterogeneity
  Implementations by different developers
    Programs written by different developers cannot communicate unless they use common standards
    Standards need to be agreed and adopted, as the internet protocols have been

Heterogeneity and Middleware
  Middleware: the software layer that provides a programming abstraction as well as masking the heterogeneity of the underlying networks, H/W, O/Ss and Programming Languages
  Examples
    CORBA
    Java RMI
      Supports a single programming language
  Most middleware is implemented over the internet protocols, which in turn mask the differences of the underlying networks
  All middleware deals with the differences in O/Ss and H/W
  Provides a uniform computational model for use by programmers of servers and distributed applications

Transparency
  The concealment from the user and the application programmer of the separation of components in a distributed system, so that the system is perceived as a whole rather than as a collection of independent components
  The ANSA Reference Manual [ANSA 1989] and the International Standards Organisation Reference Model for Open Distributed Processing (RM-ODP) [ISO 1992] identify 8 forms of transparency:
    Access
    Location
    Concurrency
    Replication
    Failure
    Mobility
    Performance
    Scaling
  Network

Transparency
  Access
    Enables local and remote resources to be accessed using identical operations
  Location
    The accessing of resources without knowledge of their location
  Concurrency
    The concurrent use of shared resources by several processes without interference between the processes

Access Transparency Example
  To send an integer from an Intel based workstation to a Sun SPARC machine we must take into account that Intel orders its bytes in little endian format (low order byte first) and that the SPARC processor uses big endian format (high order byte first)
  Computer systems may run different O/Ss, each having its own file naming conventions. Differences in naming conventions and in how files can be manipulated should be hidden from the user

Location Transparency Example
  Naming plays an important role in achieving location transparency
  For example: http://www.example.com/index.html
    The address gives no indication of the location of the main server
    Unlike URLs which end with .uk, .ie, etc.

Transparency
  Replication
    Multiple instances of resources are used to increase reliability and performance without knowledge of the replicas by users or application programmers
  Failure
    The concealment of faults, allowing users and application programs to complete tasks despite the failure of H/W and/or S/W components
  Mobility
    Allows movement of resources and clients within a system without affecting the operation of users or programs

Replication Transparency Note
  Replication plays an important role in Distributed Systems
  Resources may be replicated in order to:
    Increase availability
    Improve performance by placing copies close to demand
  Replication transparency hides the fact that several copies of a resource exist
  To support replication transparency all replicas must have the same name
  A system supporting replication transparency generally supports location transparency too, otherwise it would be impossible to refer to replicas at different locations
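
The byte-ordering issue in the access example above can be made concrete with a short sketch using Python's standard `struct` module. The value `0x01020304` is an arbitrary choice for illustration; `"!"` selects network (big endian) byte order, which is one common choice of agreed format.

```python
import struct

value = 0x01020304

# The same 32-bit integer laid out in the two byte orders.
little = struct.pack("<I", value)   # little endian: low order byte first
big = struct.pack(">I", value)      # big endian: high order byte first
print(little.hex())
print(big.hex())

# Agreeing on one external order hides the difference from both ends.
network = struct.pack("!I", value)
decoded = struct.unpack("!I", network)[0]

# Reading big endian bytes as if they were little endian gives the wrong value.
misread = struct.unpack("<I", big)[0]
print(hex(decoded), hex(misread))
```

A receiver that ignores the sender's byte order silently obtains `0x04030201` instead of `0x01020304`, which is why the difference has to be masked below the application.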

Failure Transparency Note
  Masking failures is one of the hardest issues in distributed systems
  Main difficulty
    Inability to distinguish between a dead resource and a painfully slow one
  Example
    Contacting a busy web server
    Browser will eventually time out and report the web page as unavailable
    User cannot conclude that the server is really down

Mobility Transparency Example 1
  The reference to index.html in the address http://www.example.com/index.html gives no information as to how long it has been at this location or whether it has recently moved
  This is an example of mobility / migration transparency

Mobility Transparency Example 2
  Resources can be relocated whilst they are being accessed without the user or application noticing
  Mobile users continue to use a wireless laptop whilst moving, without being disconnected
  An example of mobility / relocation transparency

Transparency
  Performance
    Allowing the system to be reconfigured to improve performance as loads vary
  Scaling
    Allowing the system and applications to expand in scale without change to the system structure or the application algorithms
  Access and Location are often referred to collectively as Network Transparency

Transparency
  Tanenbaum and van Steen:
    A distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent

Software Architecture
  Originally: the structuring of software as layers or modules in a single computer
  Recently:
    in terms of services offered and requested between processes in the same or different computers
    in terms of service layers
  Service layers (top to bottom):
    Applications, Services
    Middleware
    Operating Systems (platform)
    Computer and Network Hardware (platform)
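
The difficulty noted under failure transparency, that a dead resource and a painfully slow one look the same to the client, can be sketched with in-memory queues standing in for the network. The helper name `await_reply` and the delay values are assumptions chosen for illustration.

```python
import queue
import threading


def await_reply(replies: "queue.Queue[str]", timeout: float) -> str:
    """Wait for a reply; all the client can observe is a reply or a timeout."""
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        return "timed out"


# Dead server: no reply will ever arrive.
dead = queue.Queue()
dead_result = await_reply(dead, timeout=0.1)

# Slow server: the reply arrives, but only after the client has given up.
slow = queue.Queue()
timer = threading.Timer(0.3, slow.put, args=("200 OK",))
timer.start()
slow_result = await_reply(slow, timeout=0.1)

timer.join()                      # let the slow server finish
late_reply = slow.get_nowait()    # the reply did exist, it was just late

print(dead_result, slow_result, late_reply)
```

Both calls return the same `"timed out"` result, so the client cannot tell which server was actually down.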

Middleware
  A layer of software whose purpose is to mask heterogeneity and to provide a convenient programming model to application programmers
  Represented by processes or objects interacting in a set of computers to implement communication and resource sharing support for distributed applications
  Concerned with providing useful building blocks for the construction of software components that can work together

Limitations
  Many distributed apps rely entirely on the services provided by available middleware to support their communication and data sharing needs
  Much has been achieved through the development of middleware; however, some aspects of the dependability of systems require support at the application level
  A similar point is made by Saltzer, Reed and Clark regarding the design of distributed systems: the end-to-end argument

Limitations
  "Some communication-related functions can be completely and reliably implemented only with the knowledge and help of the application standing at the end points of the communication system. Therefore, providing that function as a feature of the communication system itself is not always sensible. (An incomplete version of the function provided by the communication system may sometimes be useful as a performance enhancement)"
    Saltzer, Reed and Clark 1984

Limitations
  The argument runs counter to the view that all communication activities can be abstracted away from the programming of applications by the introduction of appropriate middleware layers
  For SR&C, correct behaviour in distributed programs depends upon checks, error correction mechanisms and security measures at many levels
  Checks made only within the communication system guarantee only part of the required correctness
  The same work is therefore likely to be replicated in application programs, leading to wasteful programming, unnecessary complexity and computational redundancy

Limitations
  See the following references:
    Saltzer, J.H., Reed, D.P. and Clark, D.D. 1984. End-to-End Arguments in System Design, ACM Transactions on Computer Systems, Vol. 2, No. 4, pp. 277-288
    http://www.reed.com

IPC - InterProcess Communication
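
The end-to-end argument above can be illustrated with a sketch of a transfer that is checked at the application end points rather than inside the communication system. Everything here is hypothetical: `make_flaky_channel` simulates a channel that corrupts data, and the digest is assumed to reach the receiver intact, which a real protocol would also have to ensure.

```python
import hashlib
from typing import Callable, Tuple


def make_flaky_channel(corrupt_first_n: int) -> Callable[[bytes], bytes]:
    """Simulated channel that flips the last byte of the first n deliveries."""
    state = {"calls": 0}

    def channel(payload: bytes) -> bytes:
        state["calls"] += 1
        if state["calls"] <= corrupt_first_n:
            return payload[:-1] + bytes([payload[-1] ^ 0xFF])
        return payload

    return channel


def end_to_end_transfer(data: bytes,
                        channel: Callable[[bytes], bytes],
                        max_attempts: int = 3) -> Tuple[bytes, int]:
    """Send data and verify it at the receiving application, retrying on mismatch."""
    digest = hashlib.sha256(data).hexdigest()      # computed by the sending application
    for attempt in range(1, max_attempts + 1):
        received = channel(data)
        # The check is made by the receiving application, not by the channel.
        if hashlib.sha256(received).hexdigest() == digest:
            return received, attempt
    raise IOError("transfer failed the end-to-end check")


received, attempts = end_to_end_transfer(b"distributed systems", make_flaky_channel(1))
print(received, attempts)
```

Whatever checks the channel performs internally, only the comparison at the end points establishes that the application received what was sent.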

Middleware
  Middleware may be represented as consisting of just two layers, as shown below:
    Applications, Services
    RMI and RPC (middleware layer)
    Request-Reply Protocol; Marshalling and Data Representation (middleware layer)
    UDP and TCP
    Operating System

RMI and RPC Layer
  Concerned with integrating communication into a programming paradigm by providing RMI or RPC
  Remote Method Invocation
    Allows an object to invoke a method in an object in a remote process
    Examples of systems for RMI are CORBA and Java RMI
  Remote Procedure Call
    Allows a client to invoke a procedure in a remote server

Request-Reply Protocol, Marshalling and the External Data Representation Layer
  Concerned with suitable protocols that support client-server (and group) communication
  Concerned with the translation of objects and data structures into a form suitable for sending in messages over the network
    Takes into account that different computers may use different representations for simple data items
  CDK consider a suitable representation for object references in a distributed system

IPC
  Is IPC best provided through a message-passing system?
    See CDK Section 3.3.2 - Data Streaming, p74 - audio and video streams
  Function of a Message-Passing System
    To facilitate communication between processes without the need to resort to shared data
  An IPC facility provides at least two operations:
    send(message)
    receive(message)

Send and Receive
  A process p performs a send by inserting a message m into its outgoing message buffer
  The communication channel transports m to process q's incoming message buffer
  Process q performs a receive by taking m from its incoming message buffer and delivering it
  [Figure: Process p (send m) -> Outgoing Message Buffer -> Communication Channel -> Incoming Message Buffer -> Process q (receive m)]

Request-Reply Protocol
  dooperation, getrequest, sendreply
  [Figure: the client calls dooperation and sends a Request Message, then waits; the server performs getrequest, select object, execute method, sendreply; the Reply Message returns to the client, which continues]
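
The send/receive buffers and the request-reply exchange described above can be sketched with two in-memory queues standing in for the communication channel. The Python names `do_operation`, `get_request` and `send_reply` mirror the three operations named on the slides; the `Counter` object, the queue-based channel and the absence of real marshalling are simplifying assumptions.

```python
import queue
import threading
from typing import Any, Tuple

# Two message buffers stand in for the communication channel.
request_channel: "queue.Queue[Tuple[str, str, tuple]]" = queue.Queue()
reply_channel: "queue.Queue[Any]" = queue.Queue()


class Counter:
    """Hypothetical remote object held by the server."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, n: int) -> int:
        self.value += n
        return self.value


objects = {"counter": Counter()}


def do_operation(object_name: str, method: str, args: tuple, timeout: float = 1.0) -> Any:
    """Client side: send the request, block until the reply arrives, then continue."""
    request_channel.put((object_name, method, args))
    return reply_channel.get(timeout=timeout)


def get_request() -> Tuple[str, str, tuple]:
    """Server side: acquire the next service request."""
    return request_channel.get()


def send_reply(result: Any) -> None:
    """Server side: send the reply message back to the client."""
    reply_channel.put(result)


def server_loop(n_requests: int) -> None:
    for _ in range(n_requests):
        object_name, method, args = get_request()
        target = objects[object_name]              # select object
        result = getattr(target, method)(*args)    # execute method
        send_reply(result)


server = threading.Thread(target=server_loop, args=(2,))
server.start()
first = do_operation("counter", "add", (5,))
second = do_operation("counter", "add", (7,))
server.join()
print(first, second)
```

The client only ever touches the two channels, so the server could be moved to another process or machine by replacing the queues with sockets, without changing the calling code.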

Request-Reply Protocol
  dooperation is used by clients to invoke remote operations
    Specifies the remote object, the method to invoke and the additional information (arguments) required by the method
    It is assumed that the client carries out marshalling of the request and unmarshalling of the reply
  getrequest is used by the server to acquire service requests
  sendreply is used to send the reply message to the client once the server has invoked the method in the specified object

External Data Representation
  Defined to be: an agreed standard for the representation of data structures and primitive values
  Information stored in running programs is represented as data structures, for example by sets of interconnected objects
  Information in messages is given as a sequence of bytes
  Irrespective of the form of communication used, data structures must be flattened (converted to a sequence of bytes) prior to transmission and rebuilt once they arrive at their destination

External Data Representation
  Possible Methods for Exchange of Data Values
    Convert values to an agreed external format prior to sending, and to local form upon receipt
      If the computers are known to be of the same type then omit the conversion
    Alternatively
      Transmit values in the sender's format together with details of the format used
      Let the recipient convert the values as necessary
  Note: bytes are never altered during transmission
  To support RMI/RPC we must be able to:
    Flatten any data type that can be passed as an argument or as a result
    Agree a format

Marshalling/Unmarshalling
  Marshalling is the process of taking a collection of data items and assembling them into a form suitable for transmission in a message
    The translation of structured data items and primitive values into an external data representation
  Unmarshalling is the process of disassembling the received message to produce an equivalent collection of data items at the destination
    The generation of primitive values from their external data representation and the rebuilding of the data structures

Middleware and Transparency
  The originators of RPC, Birrell and Nelson [1984], intended that RPC be as much like local procedure calls as possible
    Have no distinction between local and remote procedure calls
  To make the semantics of RPC like local procedure calls:
    all necessary calls to marshalling and message passing procedures are hidden from the programmer making the call
    retransmissions due to timeout are transparent to the caller
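
The marshalling and unmarshalling steps defined above can be sketched with Python's standard `struct` module. The external format used here (big endian 32-bit integers, with strings and lists prefixed by their length) is a hypothetical agreed format chosen for illustration; it is not CORBA CDR or Sun XDR, and the record fields are made up.

```python
import struct
from typing import List, Tuple


def marshal(name: str, year: int, marks: List[int]) -> bytes:
    """Flatten a (name, year, marks) record into a sequence of bytes."""
    encoded = name.encode("utf-8")
    out = struct.pack("!I", len(encoded)) + encoded      # length-prefixed string
    out += struct.pack("!I", year)                       # primitive value
    out += struct.pack("!I", len(marks))                 # length-prefixed list
    out += struct.pack(f"!{len(marks)}I", *marks)
    return out


def unmarshal(data: bytes) -> Tuple[str, int, List[int]]:
    """Rebuild the record from its external representation."""
    offset = 0
    (name_len,) = struct.unpack_from("!I", data, offset)
    offset += 4
    name = data[offset:offset + name_len].decode("utf-8")
    offset += name_len
    (year,) = struct.unpack_from("!I", data, offset)
    offset += 4
    (count,) = struct.unpack_from("!I", data, offset)
    offset += 4
    marks = list(struct.unpack_from(f"!{count}I", data, offset))
    return name, year, marks


message = marshal("Smith", 1984, [70, 65])
rebuilt = unmarshal(message)
print(len(message), rebuilt)
```

Because both sides agree on the format in advance, the message carries no type information; the alternative mentioned on the slides, sending in the sender's format, would need a format tag in each message.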

Middleware and Transparency
  Transparency has been extended to apply to distributed objects
  This involves HIDING:
    marshalling and message passing operations
    the task of locating and contacting remote objects

Middleware and Transparency
  Remote invocations are more vulnerable to failure than local ones
  They involve:
    Network(s)
    another Computer
    another Process
  The possibility of no reply always exists
  In the case of failure it is difficult to identify the problem
    Network or remote process
  Objects making remote invocations must be able to recover from such failures

Middleware and Transparency
  Latency for remote invocations
    Much greater than for local invocations
    Programs using remote invocations need to take this into account
      Perhaps by minimising remote interactions
  The choice regarding transparency for remote invocations is available to designers of IDLs (Interface Definition Languages)
    Interface definition languages are designed to allow objects implemented in different languages to invoke one another
    (See CDK pp - 6. Example: CORBA IDL, Chap )

Current Consensus
  Remote invocations should be made transparent in the sense that the syntax of remote and local invocations should be the same
  The difference between local and remote objects should be expressed in their interfaces
  Knowing that an object is intended to be accessed by remote invocation also means that its designer must ensure that it can keep its state consistent whilst experiencing concurrent access from other clients

Summary
  Definitions
  Operating Systems Overview
  Challenges
  Heterogeneity
  Transparency
  Middleware
  Limitations
  Middleware and Transparency
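
The consensus described above, the same call syntax as a local invocation but with the possibility of failure visible in the interface, can be sketched as a client-side proxy. All names here (`CounterProxy`, `RemoteInvocationError`, `make_lossy_transport`) are hypothetical, and the transport only simulates lost requests; a real protocol would also have to filter duplicate requests before retransmitting a non-idempotent operation.

```python
from typing import Callable, Optional


class RemoteInvocationError(Exception):
    """Raised when no reply arrives; a local call has no equivalent failure."""


class CounterProxy:
    """Looks like a local object, but add() may raise RemoteInvocationError."""

    def __init__(self, transport: Callable[[str, int], Optional[int]],
                 max_tries: int = 3) -> None:
        self._transport = transport
        self._max_tries = max_tries

    def add(self, n: int) -> int:
        # Retransmissions after a timeout are hidden from the caller.
        for _ in range(self._max_tries):
            reply = self._transport("add", n)
            if reply is not None:
                return reply
        # Persistent failure cannot be hidden, so it is part of the interface.
        raise RemoteInvocationError(f"no reply after {self._max_tries} tries")


def make_lossy_transport(drop_first: int) -> Callable[[str, int], Optional[int]]:
    """Simulated transport: the first drop_first requests are lost (None = timeout)."""
    state = {"sent": 0, "total": 0}

    def transport(method: str, n: int) -> Optional[int]:
        state["sent"] += 1
        if state["sent"] <= drop_first:
            return None                  # request lost; the server never saw it
        state["total"] += n
        return state["total"]

    return transport


proxy = CounterProxy(make_lossy_transport(drop_first=2))
result = proxy.add(5)                    # succeeds on the third transmission

unreachable = CounterProxy(make_lossy_transport(drop_first=10))
try:
    unreachable.add(1)
    failed = False
except RemoteInvocationError:
    failed = True
print(result, failed)
```

The caller writes `proxy.add(5)` exactly as for a local object, yet must still be prepared to handle `RemoteInvocationError`, which is the sense in which the difference is expressed in the interface.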