Communication. Layered Protocols. Topics to be covered. Layered Protocols. Introduction

Similar documents
Communication. Overview

Communication. Outline

Communication. Distributed Systems Santa Clara University 2016

IPC. Communication. Layered Protocols. Layered Protocols (1) Data Link Layer. Layered Protocols (2)

Advanced Topics in Distributed Systems. Dr. Ayman A. Abdel-Hamid. Computer Science Department Virginia Tech

Last Class: RPCs. Today:

Parallelism. Master 1 International. Andrea G. B. Tettamanzi. Université de Nice Sophia Antipolis Département Informatique

Distributed Systems. Chapter 02

Interprocess Communication Tanenbaum, van Steen: Ch2 (Ch3) CoDoKi: Ch2, Ch3, Ch5

Last Class: RPCs and RMI. Today: Communication Issues

DISTRIBUTED COMPUTER SYSTEMS

Verteilte Systeme (Distributed Systems)

Chapter 4 Communication

Distributed Systems. Communication (2) Schedule of Today. Distributed Objects. Distributed Objects and RMI. Corba IDL Example

Distributed Systems. Communication (2) Lecture Universität Karlsruhe, System Architecture Group

Distributed Systems Principles and Paradigms. Chapter 04: Communication

Distributed Systems Principles and Paradigms. Chapter 04: Communication

Today CSCI Remote Method Invocation (RMI) Distributed Objects

Distributed Systems. Edited by. Ghada Ahmed, PhD. Fall (3rd Edition) Maarten van Steen and Tanenbaum

May Gerd Liefländer System Architecture Group Universität Karlsruhe (TH), System Architecture Group

Distributed Systems COMP 212. Lecture 15 Othon Michail

Overview. Communication types and role of Middleware Remote Procedure Call (RPC) Message Oriented Communication Multicasting 2/36

Communication Basics, RPC & RMI. CS403/534 Distributed Systems Erkay Savas Sabanci University

Distributed Information Processing

Remote Invocation. Today. Next time. l Overlay networks and P2P. l Request-reply, RPC, RMI

Chapter 4 Communication

Last Class: RPCs. Today:

Outline. EEC-681/781 Distributed Computing Systems. The OSI Network Architecture. Inter-Process Communications (IPC) Lecture 4

DISTRIBUTED COMPUTER SYSTEMS

Distributed Systems Inter-Process Communication (IPC) in distributed systems

Distributed Systems Principles and Paradigms

Chapter 4 Communication

SAI/ST course Distributed Systems

Advanced Distributed Systems

CHAPTER - 4 REMOTE COMMUNICATION

Distributed Object-Based Systems The WWW Architecture Web Services Handout 11 Part(a) EECS 591 Farnam Jahanian University of Michigan.

Today CSCI Communication. Communication in Distributed Systems. Communication in Distributed Systems. Remote Procedure Calls (RPC)

Communication. Distributed Systems IT332

MODELS OF DISTRIBUTED SYSTEMS

MODELS OF DISTRIBUTED SYSTEMS

CSci Introduction to Distributed Systems. Communication: RPC

Distributed Systems Theory 4. Remote Procedure Call. October 17, 2008

Distributed Systems 8. Remote Procedure Calls

MTAT Enterprise System Integration. Lecture 2: Middleware & Web Services

COMMUNICATION PROTOCOLS: REMOTE PROCEDURE CALL (RPC)

Architecture of Software Intensive Systems

Communication in Distributed Systems

COMMUNICATION IN DISTRIBUTED SYSTEMS

Dr Markus Hagenbuchner CSCI319 SIM. Distributed Systems Chapter 4 - Communication

DS 2009: middleware. David Evans

Remote Invocation. Today. Next time. l Indirect communication. l Request-reply, RPC, RMI

Distributed Objects and Remote Invocation. Programming Models for Distributed Applications

Distributed Systems. How do regular procedure calls work in programming languages? Problems with sockets RPC. Regular procedure calls

Communication. Communication. Distributed Systems. Networks and protocols Sockets Remote Invocation Messages Streams. Fall /10/2001 DoCS

Module 3 - Distributed Objects & Remote Invocation

Operating Systems. 18. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Spring /20/ Paul Krzyzanowski

Verteilte Systeme (Distributed Systems)

Architecture of Distributed Systems

ECE 650 Systems Programming & Engineering. Spring 2018

Today: Distributed Objects. Distributed Objects

03 Remote invoaction. Request-reply RPC. Coulouris 5 Birrel_Nelson_84.pdf RMI

Networked Systems and Services, Fall 2018 Chapter 4. Jussi Kangasharju Markku Kojo Lea Kutvonen

EEC-682/782 Computer Networks I

MOM MESSAGE ORIENTED MIDDLEWARE OVERVIEW OF MESSAGE ORIENTED MIDDLEWARE TECHNOLOGIES AND CONCEPTS. MOM Message Oriented Middleware

Multimedia Networking

Lecture 8: February 19

416 Distributed Systems. RPC Day 2 Jan 12, 2018

416 Distributed Systems. RPC Day 2 Jan 11, 2017

Middleware and Interprocess Communication

Transport Layer (TCP/UDP)

Distributed Systems Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2016

Remote Procedure Calls

CS454/654 Midterm Exam Fall 2004

Lecture 8: February 17

Dr. Jack Lange Computer Science Department University of Pittsburgh. Fall Distributed System Definition. A distributed system is

Outline. Interprocess Communication. Interprocess Communication. Communication Models: Message Passing and shared Memory.

Interprocess Communication

Lecture 5: Object Interaction: RMI and RPC

Communication in Distributed Systems

Communication Paradigms

TSIN02 - Internetworking

Chapter 4: Processes

Networking and Internetworking 1

CHAPTER 3 - PROCESS CONCEPT

Process Concept. Chapter 4: Processes. Diagram of Process State. Process State. Process Control Block (PCB) Process Control Block (PCB)

Chapter 4: Processes

Distributed Systems Principles and Paradigms. Distributed Object-Based Systems. Remote distributed objects. Remote distributed objects

Remote Procedure Call

Diagram of Process State Process Control Block (PCB)

EEC-484/584 Computer Networks. Lecture 16. Wenbing Zhao

Chapter 4: Processes. Process Concept

Structured communication (Remote invocation)

CCNA 1 Chapter 7 v5.0 Exam Answers 2013

C 1. Recap: Finger Table. CSE 486/586 Distributed Systems Remote Procedure Call. Chord: Node Joins and Leaves. Recall? Socket API

Message Passing vs. Distributed Objects. 5/15/2009 Distributed Computing, M. L. Liu 1

Chapter 3: Processes. Operating System Concepts 9 th Edition

Large Systems: Design + Implementation: Communication Coordination Replication. Image (c) Facebook

Unit 2.

The Client Server Model and Software Design

Remote Invocation. To do. Request-reply, RPC, RMI. q Today q. q Next time: Indirect communication

Transcription:

Distributed Systems, Fall 2003 1 Introduction Interprocess communication is at the heart of all distributed systems Communication Based on low-level message passing offered by the underlying network Protocols: rules for communicating processes structured in layers Four widely-used models: Remote Procedure Call (RPC) Remote Method Invocation (RMI) Message-Oriented Middleware (MOM) streams Distributed Systems, Fall 2003 2 Topics to be covered Layered Protocols Remote Procedure Call (RPC) Remote Method Invocation (RMI) Message-Oriented Middleware (MOM) Streams Layered Protocols Low-Level Transport Application Middleware Distributed Systems, Fall 2003 3 Distributed Systems, Fall 2003 4 Layered Protocols A wants to communicate with B A builds a message in its own address space A executes a call to the OS to send the message The OSI Model Designed to allow open systems to communicate Two general type of protocols: Connection-oriented: before exchanging data, the sender and the receiver must establish a connection (e.g., telephone) Connectionless: no setup in advance (e.g., sending an email) Need to agree on the meaning of the bits being sent The ISO OSI or the OSI model Each layer provides an interface to the one above Message send (downwards) Message received (upwards) Each layer adds a header Distributed Systems, Fall 2003 5 Distributed Systems, Fall 2003 6

Distributed Systems, Fall 2003 7 The OSI Model Low-level Layers These layers implement the basic functions of a computer network The information in the layer n header is used for the layer n protocol Independence among layers Protocol suite or protocol stack: collection of protocols used in a particular system OSI protocols not so popular, instead Internet protocols (e.g., TCP and IP) reference model Lower-level Distributed Systems, Fall 2003 8 Low-level Layers Low-level Layers: The Data Link Layer Physical layer: Concerns with transmitting 0 and 1s Standardizing the electrical, mechanical and signaling interfaces so that when A sends a 0 bit, it is receives as a 0 Example standard: RS-232-C for serial communication lines the specification and implementation of bits, and their transmission between sender and receiver Data link layer: Group bits into frames and sees that each frame is correctly received Puts a special bit pattern at the start and end of each frame (to mark them) as well as a checksum Frames are assigned sequence numbers prescribes the transmission of a series of bits into a frame to allow for error and flow control Discussion between a receiver and a sender in the data link layer. A tries to sends two messages, 0 and 1, to B Distributed Systems, Fall 2003 9 Distributed Systems, Fall 2003 10 Network layer: Low-level Layers Routing: choose the best ( delay-wise ) path Transport Protocols Turns the underlying network into something that an application developer can use Example protocol at this layer: connectionless IP (part of the Internet protocol suite) IP packets: each one is routed to its destination independent of all others. No internal path is selected or remembered describes how packets in a network of computers are to be routed. NOTE For many distributed systems, the lowest level interface is that of the network layer. Distributed Systems, Fall 2003 11 Distributed Systems, Fall 2003 12

Distributed Systems, Fall 2003 13 Reliable connection Transport Layer The transport layer provides the actual communication facilities for most distributed systems. Breaks a message received by the application layer into packets and assigns each one of them a sequence number and send them all Header: which packets have been sent, received, there is room for, need to be retransmitted Reliable connection-oriented transport connections built on top of connection-oriented (all packets arrive in the correct sequence, if they arrive at all) or connectionless network services Transport Layer Standard (transport-layer) Internet protocols: Transmission Control Protocol (TCP): connection-oriented, reliable, streamoriented communication (TCP/IP) Universal Datagram Protocol (UDP): connectionless, unreliable (best-effort) datagram communication (just IP with minor additions) TCP vs UDP Works reliably over any network Considerable overhead use UDP + additional error and flow control for a specific application Distributed Systems, Fall 2003 14 Transport Layer: Client-Server TCP Higher-level Layers 3-way handshake: To reach an agreement on sequence numbering In practice, only the application layer is used Higher-level a) Normal operation of TCP. b) Transactional TCP (T/TCP) enhancement Distributed Systems, Fall 2003 15 Distributed Systems, Fall 2003 16 Application Layer Middleware Layer Intended to contain a collection of standard network applications, such as those for email, file transfer, etc From the OSI reference model, all distributed systems just applications Many application protocols are directly implemented on top of transport protocols, doing a lot of application-independent work. Middleware is invented to provide common services and protocols that can be used by many rich set of communication protocols, but which allow different applications to communicate Marshaling and unmarshaling of data, necessary for integrated systems Naming protocols, so that different applications can easily share resources Security protocols, to allow different applications to communicate in a secure way Scaling mechanisms, such as support for replication and caching Authentication protocols Distributed Systems, Fall 2003 17 Distributed Systems, Fall 2003 18

Distributed Systems, Fall 2003 19 Middleware Protocols RPC Basic RPC Model Parameter Passing Variations An adapted reference model for networked communication. Distributed Systems, Fall 2003 20 Basic idea: Some issues: Remote Procedure Call (RPC) Allow programs to call procedures located on other machines Calling and called procedures in different address spaces Parameter passing Crash of each machine Conventional Procedure Call Local procedure call: count = read(fd, buf, nbytes) 1: Caller: Push parameter values of the procedure on a stack + return address 2: Called procedure takes control 3: Called proc: Use stack for local variables, executes, pop local variables,save in cache return result, use return address 4: Caller: Pop results (in parameters) Principle: communication with local procedure is handled by copying data to/from the stack (with a few exceptions) Example: incr(i, i), initially i = 0 Call-by-Value, i = 0 Call-by-Reference, (push the address of the variable), i = 2 Call-by-Copy/Restore The value is copied in the stack as in call-by-value, and then copied back by the called procedure, i = 1 Distributed Systems, Fall 2003 21 Distributed Systems, Fall 2003 22 Client and Server Stubs RPC supports location transparency (the calling procedure does not know that the called procedure is remote) Client stub: local version of the called procedure called using the stack procedure it packs the parameters into a message and requests this message to be sent to the server (calls send) it calls receive and blocks till the reply comes back When the message arrives, the server OS passes it to the server stub Server Stub: typically waits on receive it transforms the request into a local procedure call after the call is completed, it packs the results, calls send it calls receive again and blocks Client and Server Stubs Call to procedure x -> call client stub for x -> client stub calls send and blocks -> upon receipt, the server stub gets control -> the server stub calls the local procedure x -> after the procedure x ends, control returns to the server stub that calls send, and then receive again and blocks -> the client OS, passes it to the client stub, copies it to the caller and returns Distributed Systems, Fall 2003 23 Distributed Systems, Fall 2003 24

Distributed Systems, Fall 2003 25 Steps of a Remote Procedure Call Remote procedure add(i, j) Parameter Passing 1. Client procedure calls client stub in normal way 2. Client stub builds message, calls local OS 3. Client's OS sends message to remote OS 4. Remote OS gives message to server stub 5. Server stub unpacks parameters, calls server 6. Server does work, returns result to the stub 7. Server stub packs it in message, calls local OS 8. Server's OS sends message to client's OS 9. Client's OS gives message to client stub 10. Stub unpacks result, returns to client A server stub may handle more than one remote procedure Two issues with parameter passing: Marshalling Reference Parameters Distributed Systems, Fall 2003 26 Parameter Passing Parameter marshaling: There s more than just wrapping parameters into a message: Passing Value Parameters An integer (one 32-bit word), and a four-character string (one 32-bit word) Example, integer 5 and string JILL Client and server machines may have different data representations (think of byte ordering) Wrapping a parameter means transforming a value into a sequence of bytes Client and server have to agree on the same encoding: - How are basic data values represented (integers, floats, characters) - How are complex data values represented (arrays, unions) Client and server need to properly interpret messages, transforming them into machine-dependent representations. a) Original message on the Pentium (right-to-left) b) The message after receipt on the SPARC (left-to-right) c) The message after being inverted. The little numbers in boxes indicate the address of each byte Distributed Systems, Fall 2003 27 Distributed Systems, Fall 2003 28 Passing Reference Parameters Pointer refers to the address space of the process it is being used Solutions: Forbid pointers and reference parameters in general Use copy in/copy out semantics: while procedure is executed, nothing can be assumed about parameter values (only Ada supports this model). RPC assumes all data that is to be operated on is passed by parameters. Excludes passing references to (global) data. One optimization, if the stubs know which are parameters are input and output parameters -> eliminate copying What about pointers to complex (arbitrary) data structures? Parameter Specification and Stub Generation Need to agree on: Encoding rules Actual exchange of messages (e.g., TCP/IP) Implement the stubs! Stubs for the same protocol and different procedures differ only in their interfaces to the applications Interface Definition Language (IDL) Distributed Systems, Fall 2003 29 Distributed Systems, Fall 2003 30

Distributed Systems, Fall 2003 31 Extensions Calls to local procedures Asynchronous RPC Doors Try to use the RPC mechanism as the only mechanism for interprocess communication (IPC). Doors are RPCs implemented for processes on the same machine A single mechanism for communication: procedure calls (but with doors, it is not transparent) Server calls door_create: registers a door, an id is returned fattach: associates a symbolic name with the id Client invokes a door using door_call, the id and any parameters The OS does an upcall to the server To return the result door_return Distributed Systems, Fall 2003 32 Asynchronous RPC Try to get rid of the strict request-reply behavior, and let the client continue without waiting for ananswer from the server. Differed Synchronous RPC Traditional RPC Asynchronous RPC Asynchronous RPC: the server immediately sends a reply back to the client the moment the RPC request is received, after which it calls the requested procedure Distributed Systems, Fall 2003 33 Deferred Synchronous RPC: two asynchronous RPCs combined The client uses asynchronous RPC to call the server The server uses asynchronous RPC to send the reply One way RPC: the client does not wait at all (reliability?) Distributed Systems, Fall 2003 34 Performing an RPC DCE RPC At-most-one semantics: no call is ever carried out more than once, even in the case of system crashes Let the developer concentrate on only the client- and server-specific code; let the RPC system (generators and libraries) do the rest. Idempotent remote procedure: a call may be repeated multiple times Distributed Systems, Fall 2003 35 Distributed Systems, Fall 2003 36

Distributed Systems, Fall 2003 37 Writing a Client and a Server IDL permits procedure declarations (similar to function prototypes in C).Type definitions, constant declarations, etc to provide information to correctly marshal/unmarshal paramters/results. Just the syntax (no semantics) A globally unique identifier Generate a prototype IDL with a unique id Edit the IDL, fill in the names of the remote procedures and their parameters Binding a Client to a Server 1. Locate the server machine 2. Locate the server on the machine: need to know an endpoint (port) on the server machine to which it can send messages A table of (server, endpoints) is maintained on each server machine by a process called the DCE daemon The server asks the OS for an endpoint and registers this endpoint with the DCE The client asks the DCE daemon at the server s machine to lookup the endpoint The steps in writing a client and a server in DCE RPC. Distributed Systems, Fall 2003 38 Distributed Objects Expand the idea of RPCs to invocations on remote objects Remote Object Invocation Data and operation encapsulated into an object Operations are implemented as methods, and through interfaces are accessible Distributed Objects Remote Object Invocation Parameter Passing An object offers only its interface to clients Object server is responsible for a collection of objects Client stub (proxy) implements interface Server skeleton handles (un)marshaling and object invocation Distributed Systems, Fall 2003 39 Distributed Systems, Fall 2003 40 Distributed Objects A client binds to a distributed object: an implementation of the object s interface, called a proxy, is loaded into the client s address space Proxy (analog to a client stub) Marshals method invocations into messages Un-marshals reply messages Actual object at a server machine: offers the same interface Skeleton (analog to server stub) Un-marshals requests to proper method invocations at the object s interface at the server Compile-time objects: Distributed Objects Objects defined as instances of a class Compiling the class definition results in code that allows to instantiate Java objects Language-level objects, from which proxy and skeletons are automatically generated. Depends on the particular language Runtime objects: Can be implemented in any language, but require use of an object adapter that makes the implementation appear as an object. Adapter: objects defined based on their interfaces Register an implementation at the adapter Note: the object itself is not distributed, aka remote object Distributed Systems, Fall 2003 41 Distributed Systems, Fall 2003 42

Distributed Systems, Fall 2003 43 Distributed Objects Transient objects: live only by virtue of a server: if the server exits, so will the object. Persistent objects: live independently from a server: if a server exits, the object s state and code remain (passively) on disk. Binding a Client to an Object Provide system-wide object references, freely passed between processes on different machines Reference denotes the server machine plus an endpoint for the object server, an id of which object When a process holds an object reference, it must first bind to the object Bind: the local proxy (stub) is instantiated and initialized for specific object implementing an interface for the object methods Two ways of binding: Implicit binding: Invoke methods directly on the referenced object Explicit binding: Client must first explicitly bind to object before invoking it (generally returns a pointer to a proxy that then becomes locally available Distributed Systems, Fall 2003 44 Binding a Client to an Object Static vs Dynamic RMI Remote Method Invocation (RMI) Distr_object* obj_ref; obj_ref = ; obj_ref-> do_something(); Distr_object objpref; Local_object* obj_ptr; obj_ref = ; obj_ptr = bind(obj_ref); obj_ptr -> do_something(); //Declare a systemwide object reference // Initialize the reference to a distributed object // Implicitly bind and invoke a method (a) //Declare a systemwide object reference //Declare a pointer to local objects //Initialize the reference to a distributed object //Explicitly bind and obtain a pointer to the local proxy //Invoke a method on the local proxy (b) Static invocation: the interfaces of an object are known when the client application is being developed If interfaces change, the client application must be recompiled Dynamic invocation: the application selects at runtime which method it will invoke at a remote object invoke(object, method, input_parameters, output_parameters) method is a parameter, input_parameters, output_parameters data structures Static: fobject.append(int) (a) (b) Example with implicit binding using only global references Example with explicit binding using global and local references Dynamic: invoke(fobject, id(append), int) id(append) returns an id for the method append Example uses: browsers, batch processing service to handle invocation requests Distributed Systems, Fall 2003 45 Distributed Systems, Fall 2003 46 Object References as Parameters When invoking a method with an object reference as a parameter, when it refers to a remote object, the reference is copied and passed as a value parameter (pass-by-reference) When the reference refers to a local object (i.e., an object in the same address space as the client) the referred object is copied as a whole and passed along with the invocation (pass-by-value) Message-Oriented Communication Persistence and Synchronicity Message-Oriented Transient (sockets, RMI) Message-Oriented Persistent/Message Queuing Distributed Systems, Fall 2003 47 Distributed Systems, Fall 2003 48

Distributed Systems, Fall 2003 49 Communication Alternatives RPC and RMI hide communication and thus achieve access transparency Asynchronous Communication Middleware Client/Server computing is generally communication: based on a model of synchronous Message-oriented middleware: Aims at high-level asynchronous communication: Client and server have to be active at the time of communication Client issues request and blocks until it receives reply Server essentially waits only for incoming requests, and subsequently processes them Drawbacks synchronous communication: Client cannot do any other work while waiting for reply Failures have to be dealt with immediately (the client is waiting) Processes send each other messages, which are queued Asynchronous communication: Sender need not wait for immediate reply, but can do other things Synchronous communication: Sender blocks until the message arrives at the receiving host or is actually delivered and processed by the receiver Middleware often ensures fault tolerance In many cases the model is simply not appropriate (mail, news) Distributed Systems, Fall 2003 50 Applications execute on hosts Example Communication System Communication servers are responsible for passing (and routing) messages between hosts Each host offers an interface to the communication system through which messages can be submitted for transmission Buffers at the hosts and at the communication servers Local mail server An electronic mailing system Persistent vs Transient Communication Persistent communication: A message is stored at a communication server as long as it takes to deliver it at the receiver. Transient communication: A message is discarded by a communication server as soon as it cannot be delivered at the next server, or at the receiver. Typically, all transport-level communication services offer only transient, a communication server corresponds to a store-and-forward router Distributed Systems, Fall 2003 51 Distributed Systems, Fall 2003 52 Messaging Combinations Messaging Combinations Persistent asynchronous Message stored persistently at the sending host or at the first communication server e.g., electronic mail systems Persistent synchronous Message stored persistently at the receiving host or the connected communication server (weaker) Transient asynchronous Transport-level datagram services (such as UDP) One-way RPC Receipt-based transient synchronous Sender blocks until the message is stored in a local buffer at the receiving host Distributed Systems, Fall 2003 53 Distributed Systems, Fall 2003 54

Distributed Systems, Fall 2003 55 Messaging Combinations Communication Alternatives Need for persistent communication services in particular when there is large geographical distribution (cannot assume that all processes are simultaneously executing) Delivery-based transient synchronous Sender blocks until the message is delivered to the receiver for further processing Asynchronous RPC Response-based transient synchronous Strongest form Sender blocks until it receives a reply message RPC and RMI Distributed Systems, Fall 2003 56 Outline Message-Oriented Transient Communication Transport-level sockets Message-Passing Interface (MPI) Berkeley Sockets Socket: a communication endpoint to which an application can write data to be sent out over the network and from which incoming data may be read server Socket Bind Primitive Meaning Create a new communication endpoint Attach a local address to a socket Message-Oriented Persistent Communication Message Queuing Model General Architecture Example (IBM MQSeries: check the textbook) Listen Accept Connect Send Receive Close Announce willingness to accept connections Block caller until a connection request arrives Actively attempt to establish a connection Send some data over the connection Receive some data over the connection Release the connection Socket primitives for TCP/IP. Distributed Systems, Fall 2003 57 Distributed Systems, Fall 2003 58 Berkeley Sockets Berkeley Sockets socket: creates a new communication endpoint for a specific transport protocol (the local OS reserves resources to accommodate sending and receiving messages for the specified protocol) bind: associates a local address with the newly created socket (e.g., the IP address of the machine + a port number) listen: (only in the case of connection-oriented communication) non-blocking call; allows the OS to reserve enough buffers for a specified max number of connections accept: blocks the server until a connection request arrives. When a request arrives, the OS creates a new socket and returns it to the caller. Then, the server can fork off a process that will subsequently handle the actual communication through the new connection. socket: (client) connect: attempt to establish a connection; specifies the transportlevel address to which a connection request is to be sent write/read: send/receive data close: called by both the client and the server Distributed Systems, Fall 2003 59 Distributed Systems, Fall 2003 60

Distributed Systems, Fall 2003 61 The Message-Passing Interface (MPI) The Message-Passing Interface (MPI) Some of the message-passing primitives of MPI Suitable for COWs and MPPs MPI designed for parallel applications and thus tailored to transient communication Assumes communication within a known group of processes, a (group_id, process_id) uniquely identifies a source or destination of a message Primitive MPI_bsend MPI_send MPI_ssend MPI_sendrecv MPI_isend MPI_issend MPI_recv MPI_irecv Meaning (transient-asynchronous) Append outgoing message to a local send buffer (blocking send) Send a message and wait until copied to local or remote buffer (delivery-based transient synchronous) Send a message and wait until receipt starts (response-based transient synchronous, RPC) Send a message and wait for reply Pass reference to outgoing message, and continue (for local MPI) Pass reference to outgoing message, and wait until receipt starts Receive a message; block if there are none Check if there is an incoming message, but do not block Distributed Systems, Fall 2003 62 Outline Message-Oriented Middleware Message-Oriented Transient Communication Transport-level sockets Message-Passing Interface (MPI) Message-Oriented Persistent Communication Message Queuing Model General Architecture Example (IBM MQSeries: check the textbook) Message-queuing systems or Message-Oriented Middleware (MOM) Targeted to message transfers that take minutes instead of seconds or milliseconds In short: asynchronous persistent communication through support of middleware-level queues Queues correspond to buffers at communication servers. Not aimed at supporting only end-users (as e.g., e-mail does). Enable persistent communication between any processes Distributed Systems, Fall 2003 63 Distributed Systems, Fall 2003 64 Message-Queuing Model Four combinations for loosely-coupled communications using queues. Message-Queuing Model Basic interface to a queue in a message-queuing system. Primitive Put Get Poll Notify Meaning Call by the sender Append a message to a specified queue Non-blocking Block until the specified queue is nonempty, and remove the first message Variations allow searching for a specific message in the queue Check a specified queue for messages, and remove the first. Never block. Install a handler (as a callback function) to be automatically invoked when a message is put into the specified queue. Often implemented as a daemon on the receiver s side Message can contain any data Addressing by providing a system-wide unique name of the destination queue Distributed Systems, Fall 2003 65 Distributed Systems, Fall 2003 66

Distributed Systems, Fall 2003 67 General Architecture of a Message-Queuing System General Architecture of a Message-Queuing System Messages are put only into local to the sender queues, source queues Messages can be read only from local queues A message put into a queue contains the specification o a destination queue Message-queuing system: provides queues to senders and receivers; transfers messages from their source to their destination queues. Queues are distributed across the network need to map queues to network address A (possibly distributed) database of queue names to network locations Queues are managed by queue managers Relays: special queue managers that operate as routers and forward incoming messsges to other queue managers overlay network Why routers? Only the routers need to be updated when queues are added or removes Allow for secondary processing of messages (e.g., logging for fault tolerance) Used for multicasting purposes Act as message brokers Distributed Systems, Fall 2003 68 Message Brokers Message broker: acts as an application-level gateway, coverts incoming messages to a format that can be understood by the destination application Contains a database of conversion rules Stream-Oriented Communication Streams Quality of Service Synchronization Distributed Systems, Fall 2003 69 Distributed Systems, Fall 2003 70 Support for Continuous Media So far focus on transmitting discrete, that is time independent data Discrete (representation media): the temporal relationships between data items not fundamental to correctly interpreting what the data means Example: text, still images, executable files Continuous (representation media): the temporal relationships between data items fundamental to correctly interpreting what the data means Examples: audio, video, animation, sensor data Example: motion represented by a series of images, in which successive images must be displayed at a uniform spacing T in time (30-40 msec per image) Correct reproduction showing the stills in the correct order and at a constant frequency pf 1/T images per sec Transmission Modes Different timing guarantees with respect to data transfer: Asynchronous transmission mode: data items are transmitted one after the other but no further timing constraints Discrete data streams, e.g., a file Synchronous transmission mode: there is a maximum end-to-end delay for each unit in a data stream E.g., sensor data Isochronous transmission mode: there is both a maximum and minimum end-to-end delay for each unit in a data stream (called bounded (delay) jitter) E.g., multimedia systems (audio, video) (Continuous) Data Stream: a connection oriented communication facility that supports isochronous data transmission Distributed Systems, Fall 2003 71 Distributed Systems, Fall 2003 72

Distributed Systems, Fall 2003 73 Stream Types Simple stream: only a single sequence of data Complex stream: several related simple streams (substreams) Data Streams Streams are unidirectional Considered as a virtual connection between a source and a sink Between (a) two process or (b) between two devices Relation between the substreams is often also time dependent Example: stereo video transmitted using two substreams each for a single audio channel Data units from each substream to be communicated pairwise for the effect of stereo Example: transmitting a movie: one stream for the video, two streams for the sound in stereo, one stream for subtitles Distributed Systems, Fall 2003 74 Data Streams Multiparty communication: more than one source or sinks Multiple sinks: the data streams is multicasted to several receivers Problem when the receivers have different requirements with respect to the quality of the stream Filters to adjust the quality of the incoming stream differently fo outgoing streams Quality of Service Quality of Service (Qos) for continuous data streams: timeliness, volume and reliability Difference between specification and implementation of QoS Distributed Systems, Fall 2003 75 Distributed Systems, Fall 2003 76 Flow Specification of QoS token-bucket model to express QoS Token: fixed number of bytes (say k) that an application is allowed to pass to the network Basic idea: tokens are generated at a fixed rate Tokens are buffered in a bucket of limited capacity When the bucket is full, tokens are dropped To pass N bytes, drop N/k tokens Characteristics of the Input maximum data unit size (bytes) Token bucket rate (bytes/sec) Token bucket size (bytes) Maximum transmission rate (bytes/sec) Flow Specification of QoS Service Required Loss sensitivity (bytes) Loss interval (µsec) maximum acceptable loss rate Burst loss sensitivity (data units) How many consecutive data units may be lost Minimum delay noticed (µsec) How long can the network delay delivery of a data unit before the receiver notices the delay Maximum delay variation (µsec) Maximum tolerated jitter Quality of guarantee Indicates how firm are the guarentess Distributed Systems, Fall 2003 77 Distributed Systems, Fall 2003 78

Distributed Systems, Fall 2003 79 Implementing QoS QoS specifications translate to resource underlying communication system reservations in the Stream Synchronization Given a complex stream, how do you keep the different substreams in synch? Two forms: (a) synchronization between a discrete and a continuous data stream and (b) synchronization between two continuous data streams The principle of explicit synchronization on the level of data units. Resources: bandwidth, buffers, processing capacity There is no standard way of (1) QoS specs, (2) describing resources, (3) mapping specs to reservations. A process that simply executes read and write operations on several simple streams ensuring that those operations adhere to specific timing and synchronization constraints Distributed Systems, Fall 2003 80 Stream Synchronization The principle of synchronization as supported by high-level interfaces. Extra Slides Distributed Systems, Fall 2003 81 Distributed Systems, Fall 2003 82 Example: IBM MQSeries All queues are managed by queue managers Queue managers are pair-wise connected through message channels Each of the two ends of a message channel is managed by a message channel agent (MCA) Attribute Transport type FIFO delivery Message length Setup retry count Delivery retries Example: IBM MQSeries Description Determines the transport protocol to be used Indicates that messages are to be delivered in the order they are sent Maximum length of a single message Specifies maximum number of retries to start up the remote MCA Maximum times MCA will try to put received message into queue Some attributes associated with message channel agents. Distributed Systems, Fall 2003 83 Distributed Systems, Fall 2003 84

Distributed Systems, Fall 2003 85 Example: IBM MQSeries Example: IBM MQSeries Primitive MQopen MQclose MQput MQget Description Open a (possibly remote) queue Close a queue Put a message into an opened queue Get a message from a (local) queue Primitives available in an IBM MQSeries MQI The general organization of an MQSeries queuing network using routing tables and aliases. Distributed Systems, Fall 2003 86 The DCE Distributed-Object Model Implementing QoS Resource reservation Protocol (RSVP) a transport-level control protocol for resource reservation in network routers 2-19 a) Distributed dynamic objects in DCE. b) Distributed named objects Distributed Systems, Fall 2003 87 Distributed Systems, Fall 2003 88