Chapter 3: Client-Server Paradigm and Middleware

CMPT401 Chapter 3, Summer 04

To overcome the heterogeneity of hardware and software in distributed systems, we need a software layer on top of them that makes the heterogeneity transparent to application programmers. This layer is commonly called the middleware. See Figures 2.1, 4.1, and 5.1 in the Textbook. The lower part of this layer (see Figure 4.1) commonly starts with the interface to the Internet transport protocols, TCP (for streams) and UDP (for datagrams), and deals with standard data representations on transport connections. The upper part of the middleware facilitates communication among applications on different platforms, and provides remote method invocation (RMI), remote procedure call (RPC), and other services such as naming, transaction management, and security.

In this chapter we discuss several important issues related to middleware, such as how to construct servers, how to request service, and how to locate servers. In particular, we consider in some detail RPC, Java RMI, failures and service guarantees, stateless servers, naming and name lookup services, and the Internet DNS (Domain Name System).

3.1 Server Structures

If a single server thread performs the services requested by many clients sequentially, responses may be slow. For example, while the thread is blocked reading a file for one client, all other clients must wait. To improve the response time, we may implement a multithreaded server, constructed from a listener thread and a set of worker threads. The listener may act as a scheduler as well. (See Figure 1.)

Figure 1: A server structure. (The figure shows incoming requests, a listener/scheduler, message buffers, three workers, a monitor, and shared memory.)

Design choices:

(a) A single server thread. If the service requires little time (e.g., a time server), or if no parallelism is possible in the server (i.e., no I/O operation is involved and there aren't multiple processors), then this may be the most appropriate choice.

(b) A listener and a fixed number of workers. Received requests are stored in a request list. Each worker checks the list, removes one request from it, processes it, and returns a reply to the client's port specified in the request message. If the list is empty, the worker idles, waiting for a new request to arrive. When a request arrives, the listener stores it in the list and wakes up one of the waiting workers, if any. There is no overhead for creating or destroying threads, but if there are more requests than threads, some requests must wait. Also, the resources allocated to the workers are tied up even when fewer requests are being processed than there are workers.

(c) A listener and a dynamically created set of workers (multiple threads). On receiving a request message, the listener forks a worker thread and hands the request to it. On completion, the worker dies. There is no need for a shared list, since each request is handed immediately to a new thread, and therefore no need to implement mutual exclusion for accessing such a list. This choice may be preferable to (b) if the cost (delay) of creating and destroying a worker is small.

Example 1: Disk server (low-level). It is desirable to have one listener/scheduler and one worker. Each request contains the cylinder number, the surface number, and the block numbers within the track. The listener puts a new request into a list or queue, ordered according to the disk scheduling algorithm adopted. After servicing the current request, the worker checks the list for the next client request to be serviced. This is a special case of (b) above with only one worker. In this example, there is no point in having more than one worker thread, since disk operations are performed sequentially.
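The disk server of Example 1 can be sketched as follows. This is only a minimal illustration under stated assumptions: the class name DiskServer, its methods, and the "lowest cylinder first" ordering are hypothetical stand-ins (a real disk scheduler would use an algorithm such as SCAN), and the disk I/O itself is replaced by recording the cylinder number.

```python
import heapq
import threading

class DiskServer:
    """Hypothetical listener/scheduler plus a single worker, as in Example 1."""

    def __init__(self):
        self._pending = []            # heap of (cylinder, seq, request)
        self._seq = 0                 # tie-breaker: equal cylinders stay in arrival order
        self._cond = threading.Condition()
        self._closed = False
        self.served = []              # cylinders in the order they were serviced

    def submit(self, cylinder, surface, block):
        """Listener side: queue a request according to the scheduling policy."""
        with self._cond:
            request = (cylinder, surface, block)
            heapq.heappush(self._pending, (cylinder, self._seq, request))
            self._seq += 1
            self._cond.notify()       # wake the worker if it is idle

    def close(self):
        """Tell the worker to exit once the queue has been drained."""
        with self._cond:
            self._closed = True
            self._cond.notify()

    def worker(self):
        """Worker side: service one request at a time."""
        while True:
            with self._cond:
                while not self._pending and not self._closed:
                    self._cond.wait()
                if not self._pending:
                    return            # closed and nothing left to do
                _, _, request = heapq.heappop(self._pending)
            self.served.append(request[0])   # stand-in for the actual disk I/O


server = DiskServer()
for cyl in (40, 7, 23):               # three requests arrive before the worker runs
    server.submit(cyl, surface=0, block=1)
t = threading.Thread(target=server.worker)
t.start()
server.close()
t.join()
print(server.served)                  # [7, 23, 40]
```

The condition variable plays the role of the mutual exclusion needed for the shared request list in design choice (b); with design choice (c) it would not be required.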
Even with a single worker, a separate listener/scheduler is useful in Example 1, because it can order the pending requests, say, according to their track numbers.

Example 2: Unix inetd daemon. The Internet super-server, inetd, is invoked at boot time. It consults the file /etc/inetd.conf (whose entries name a subset of the services listed in /etc/services), creates one socket for each service, and binds the appropriate port number to it. [1] inetd then invokes select on all these socket descriptors to wait until one of them becomes readable. The call has the form

    select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout)

where nfds is one more than the highest-numbered file descriptor to be examined, and readfds is a pointer to a file descriptor set (a bit pattern). The bit corresponding to each file descriptor of interest (e.g., that of a socket) is set to 1. On return from the call, the set is overwritten by a bit pattern indicating those file descriptors that are ready for an input operation. [2] timeout points to a value specifying how long the wait should last; if that value is 0, the call is non-blocking (it returns immediately), and if timeout is a null pointer, the call blocks until a descriptor becomes ready. When select indicates activity on a descriptor, inetd invokes accept on it and forks a new server process on the new descriptor returned by accept, [3] after which it calls select again. The new server process is the server daemon specified in the last field of each line in /etc/inetd.conf.

    [1] Before BSD 4.2, there was a separate daemon for each service.
    [2] writefds is relevant for output operations that are ready, e.g., in the window server in the next example. See the man page for select.
    [3] It dup's the new socket to file descriptors 0 and 1 (stdin and stdout) and exec's the appropriate server.

    port  service  type    protocol  wait/nowait  userid  daemon
    21    ftp      stream  tcp       nowait       root    /usr/etc/in.ftpd
    23    telnet   stream  tcp       nowait       root    /usr/etc/in.telnetd
    37    time     dgram   udp       wait         root    internal (inetd itself)
    80    http     stream  tcp       nowait       root    /usr/etc/in.httpd
    513   login    stream  tcp       nowait       root    /usr/etc/in.rlogind
    514   shell    stream  tcp       nowait       root    /usr/etc/in.rshd
    517   talk     dgram   udp       nowait       root    /usr/etc/in.talkd

"wait" in the above table means that the next request for the service must wait until the current one has been served; for the time service, inetd itself serves the current request. So this server structure is a combination of design choices (a) and (c) above. The second last column specifies the uid that the server process is to run as, so that it has the required privileges. login, shell, and talk in the above table exist only on Unix systems.

Example 3: Window servers. A window is an area of a screen that is connected with a local (or remote) application. A window server manages a window, providing operations such as graphical display, pointing, dragging, zooming, and menu selection. The X Window System was developed at MIT around 1984. It uses the X Protocol, which is device- and network-independent and is a de facto standard for Unix-based systems. The X Protocol has been implemented on TCP/IP, and over Ethernet, token ring, and many other data link protocols. The X Window System includes Xlib, a library of graphics and windowing routines, which must be linked with client applications. Other examples of window systems are the Apple Macintosh interface and Microsoft Windows. A single X Window server may manage several windows for its clients, which are application processes. The server issues select (see the previous example) to watch continuously for events on the keyboard and the mouse.
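The select-based dispatch used by inetd in Example 2 and by the window server above can be sketched as follows. This is a simplified illustration, not inetd itself: it assumes already-connected local socket pairs instead of listening sockets, serves each request in-line instead of calling accept and fork, and the service names "echo" and "upper" and the function serve_ready are hypothetical.

```python
import select
import socket

def serve_ready(services, timeout=1.0):
    """One pass of a select-style dispatch loop.

    services maps a socket to a (name, handler) pair, where handler turns
    request bytes into reply bytes. Returns the names of the services served.
    """
    # select() blocks until some socket is readable or the timeout expires.
    readable, _, _ = select.select(list(services), [], [], timeout)
    served = []
    for sock in readable:
        name, handler = services[sock]
        sock.sendall(handler(sock.recv(1024)))
        served.append(name)
    return served


# Two hypothetical services, each reached through a local socket pair.
echo_client, echo_server = socket.socketpair()
upper_client, upper_server = socket.socketpair()
services = {
    echo_server: ("echo", lambda data: data),
    upper_server: ("upper", lambda data: data.upper()),
}

upper_client.sendall(b"hello")        # only the "upper" service has input pending
served = serve_ready(services)
reply = upper_client.recv(1024)
print(served, reply)                  # ['upper'] b'HELLO'

for s in (echo_client, echo_server, upper_client, upper_server):
    s.close()
```

Only the descriptor with pending input is reported as readable, so the idle "echo" service costs nothing; this is what lets one process watch many services at once.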
If the event detected by the window server is, for example, right mouse button down, a thread is created which calls the menu display procedure, passing it the menu structure for the window at which the mouse is pointing. This procedure pops up the menu and highlights items as the user moves the cursor. The thread then executes the selected command.

3.2 RPC

Suppose that a procedure a user wants to call is not locally available for some reason, such as a lack of the resources it needs. It is convenient to make the semantics of communication between remote programs as similar as possible to a normal procedure call, which is familiar to and well understood by many programmers. See Fig. 2. Hence the name remote procedure call (RPC). The client program makes a normal procedure call, say, proc(a,b), with the intention of invoking the remote procedure proc( ) on some other machine. From the client's point of view, this looks just like calling a (local) procedure and then waiting for it to return. [4] This level of abstraction relieves the application programmer of concern for data transmission and communication protocols.

    [4] In Mach, for example, msg_rpc() invoked by a client blocks until a reply returns, reducing the number of operations that the client needs to invoke from two (a send followed by a receive) to one.

Figure 2: Local and remote procedure calls. (In the local call, the program executes i = sum(3,8) and the procedure sum(j,k) returns j+k within the same process. In the remote call, the client process on the client machine executes i = sum(3,8), the message "sum 3 8" is passed to the server machine, and the server process executes sum(j,k).)

We will now discuss how to implement RPCs. If an RPC is used in a client program, the linker links in a library procedure called the client stub, which is executed as a part of the client process. See Fig. 3.

The client stub
1. composes a message (containing the procedure id [5] and the parameters) to be sent to the server,
2. sends the message to the server and waits for the reply with a blocking receive (or uses a single combined call, e.g., msg_rpc in Mach), and
3. delivers the returned result to the calling program.

The server stub on the server machine
1. extracts the procedure id and the input parameters,
2. invokes the called procedure with these parameters, using the standard (local) procedure call,
3. composes a reply message, and
4. sends it back to the client process.

    [5] Each procedure in an interface (see Section 3.2.2) has a unique id, e.g., 0, 1, 2, ....

3.2.1 Implementation Issues
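Before turning to the individual issues, the stub steps listed in Section 3.2 can be sketched for the sum(j,k) procedure of Figure 2. This is a minimal sketch under stated assumptions: the message layout (a 4-byte procedure id followed by two 4-byte integers, all big-endian), the id PROC_SUM = 0, and the in-process transport function are hypothetical choices made for illustration; a real stub would send the message over a socket.

```python
import struct

PROC_SUM = 0                          # hypothetical procedure id

def sum_impl(j, k):
    """The remote procedure itself, written as an ordinary local function."""
    return j + k

PROCEDURES = {PROC_SUM: sum_impl}

def server_stub(message):
    """Server stub: unpack the request, call the procedure, pack the reply."""
    proc_id, j, k = struct.unpack(">Iii", message)   # 1. extract id and parameters
    result = PROCEDURES[proc_id](j, k)               # 2. ordinary local call
    return struct.pack(">i", result)                 # 3-4. compose and return the reply

def transport(message):
    """Stand-in for send plus blocking receive between two machines."""
    return server_stub(message)

def sum_client_stub(j, k):
    """Client stub: looks like a local sum(j, k) to the calling program."""
    request = struct.pack(">Iii", PROC_SUM, j, k)    # 1. compose the message
    reply = transport(request)                       # 2. send and wait for the reply
    (result,) = struct.unpack(">i", reply)           # 3. deliver the result
    return result

i = sum_client_stub(3, 8)
print(i)                              # 11
```

The calling program never sees the byte strings; fixing the byte order in the message format is one simple way to sidestep the data format differences discussed next.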

Figure 3: Client and server stubs. (At user level, the client program calls the client stub, which packs the parameters and later unpacks the result; the server stub unpacks the parameters, calls the procedure, and packs the result. The packing and unpacking are invisible to the user, and the messages pass through the kernels of the client and server machines.)

1. Data format conversion: If the client and the server platforms are different, data format conversion is necessary. As mentioned in Chapter 1, Intel machines use the little-endian representation, while Sun workstations use the big-endian representation. At a higher, language level, an array, for example, may be stored differently (in row-major or column-major form) in the client's program and in the remote procedure. To facilitate data format conversion, Sun developed the External Data Representation (XDR) as the standard data representation used in Sun's Network File System, NFS. The sender represents the parameters in XDR, and the receiver converts them back to its local representation. CORBA (Common Object Request Broker Architecture) uses CDR (Common Data Representation, see Text 4.3.1).

2. Marshalling (also called serialization or flattening) and unmarshalling (deserialization): The procedure name and parameters are marshalled by the sender and unmarshalled by the receiver. For serialization operations in CORBA and Java, see Text 4.3.

3. Name server: How do you find the server that can run the procedure in question? The Sun portmapper, which has a well-known address (port 111 on each machine), maintains a list of registered servers and their port numbers. More general name servers (e.g., DNS) will be discussed in a later section.

3.2.2 Interface Definition

Informally, the boundary between two interacting objects is called an interface. The interface between a car and its driver consists of the steering wheel, the directional signal lever, the gear lever, etc. The protocol corresponds to the rules on how to operate them.
The interface between a terminal user and the computer system consists of the screen, the keyboard, the mouse, and the signals exchanged between them. The interface between the application programmer and the OS consists of the system calls, including the names of the operations, their parameters, etc.

In the present context, we are interested in the RPC interface between clients and the server that executes procedures on their behalf. It is important to note that the interface can be defined independently of the implementation of the objects involved. An RPC interface consists of:
1. the service (or interface) identifier,
2. a list of procedure names and the order and types of their parameters, and
3. type definitions and constant declarations.

Here is a simplified example.

    program FILESERVER {
        version VERSION {
            Data READ(readargs) = 1;     // proc. 1
            void WRITE(writeargs) = 2;   // proc. 2
        } = 2;                           // version no.
    } = 1234;                            // program no.

An (RPC) interface definition language (IDL) is often an addition to a conventional programming language, providing a formal way of defining interfaces. The interface definition is preprocessed by an (RPC) interface compiler (also called a stub generator) to generate the client and server stubs, as well as a header file. The client stub may be generated in one language and the server stub in another. This enables the server and client programs to be compiled separately and still communicate via RPC. See Fig. 4.

No existing programming language is usable as a standard IDL, since a standard IDL must be able to express, without ambiguity, the types of every language it is used with. For example, if C were used as an IDL, and a variable in an interface, say X, were defined to be of type char*, then it could mean a null-terminated character string, a pointer to a single character, or a pointer to an array of characters. [6] Suppose another language, such as Java, which has distinct types for a character string (String), a single character (char), and an array of characters (char[]), were used to code the server. Then, in the header file it generates, the IDL compiler wouldn't be able to declare the type of X correctly, since it doesn't know the intention of the programmer who wrote the interface definition.
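The FILESERVER definition above identifies each procedure by a (program number, version number, procedure number) triple, and a server can use that triple to select the procedure to run. The following is a rough sketch of such a dispatch table. The numbers 1234, 2, 1, and 2 come from the example; everything else is an assumption made for illustration, in particular the contents of readargs and writeargs (taken here to be a file name, an offset, and a length or the data), the in-memory dictionary used as storage, and the function names.

```python
FILESERVER_PROG = 1234                # program no. from the example
FILESERVER_VERS = 2                   # version no. from the example
READ, WRITE = 1, 2                    # procedure numbers from the example

files = {}                            # in-memory stand-in for the server's storage

def read_proc(args):
    """Hypothetical READ: args = (name, offset, length)."""
    name, offset, length = args
    return files.get(name, b"")[offset:offset + length]

def write_proc(args):
    """Hypothetical WRITE: args = (name, offset, data)."""
    name, offset, data = args
    old = files.get(name, b"")
    # Pad with zero bytes if the write starts beyond the current end of file.
    files[name] = old[:offset].ljust(offset, b"\0") + data + old[offset + len(data):]

INTERFACE = {
    (FILESERVER_PROG, FILESERVER_VERS, READ): read_proc,
    (FILESERVER_PROG, FILESERVER_VERS, WRITE): write_proc,
}

def dispatch(prog, vers, proc, args):
    """Select the procedure named by the triple and call it."""
    try:
        handler = INTERFACE[(prog, vers, proc)]
    except KeyError:
        raise LookupError(
            f"no procedure {proc} in program {prog} version {vers}") from None
    return handler(args)

dispatch(1234, 2, WRITE, ("notes.txt", 0, b"middleware"))
data = dispatch(1234, 2, READ, ("notes.txt", 0, 6))
print(data)                           # b'middle'
```

A request carrying an unknown version number is rejected before any procedure runs, which is one reason the version number is part of the interface identifier.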
In addition, in and out types are useful in an IDL to indicate whether a parameter represents an input to, or an output from, the called procedure, respectively. Note that only the procedure names and the types of their parameters are specified in an interface definition file, since what these procedures actually do (i.e., their semantics) is irrelevant to generating the stubs.

    [6] This ambiguity is eliminated in MiG (Mach Interface Generator), for example, by introducing a separate string type, MSG_TYPE_STRING. MiG also has 3 types for I/O specification: in, out, inout. The default is in.

Figure 4: Programming a server and a client. (The IDL compiler reads the interface definition file and generates header.h, the client stub, and the server stub. The client program and the server program both include header.h. The client stub, client program, server program, and server stub are compiled separately into object files. The linker combines the client stub and client object files with the runtime library into the client binary, and the server and server stub object files with the runtime library into the server binary.)

The following are some well-known IDLs and/or stub generators:

CORBA: becoming very popular. (See the Communications of the ACM, Oct. 1998.)
rpcgen: used in Sun RPC.
MiG (Mach Interface Generator): uses a subset of MatchMaker, which has also been added to C, Pascal, Ada, and Common Lisp.
rmic (RMI compiler): used in Java RMI (Remote Method Invocation).

RMI and CORBA are for remote method invocation, which is more general than RPC. (See the next section.)

3.2.3 Sun Portmapper

Sun RPC employs a portmapper as its simple name server. Since the portmapper itself is a server, its interface is specified as follows (not all procedures are shown):

    program PMAP_PROG {
        version PMAP_VERS {
            void PMAPPROC_NULL(void) = 0;
            bool PMAPPROC_SET(mapping) = 1;
            int PMAPPROC_GETPORT(mapping) = 3;
        } = 2;
    } = 100000;

When a new server starts up, it invokes PMAPPROC_SET to register its program number, version number, and port with the portmapper. A client invokes PMAPPROC_GETPORT on the well-known port number (111) of the portmapper to find the port number of the service it seeks, providing the program number of the server and its version number.
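The registration and lookup steps can be sketched with an in-memory table. This is an illustration only, with no networking: it assumes that a mapping consists of a program number, a version number, a transport protocol number, and a port, that PMAPPROC_SET refuses to overwrite an existing entry, and that PMAPPROC_GETPORT returns 0 for an unregistered service. The class name PortMapper and the port 40001 are hypothetical.

```python
IPPROTO_TCP, IPPROTO_UDP = 6, 17      # standard IP protocol numbers

class PortMapper:
    """In-memory stand-in for the portmapper's table (no networking)."""

    def __init__(self):
        self._table = {}              # (prog, vers, prot) -> port

    def pmapproc_set(self, prog, vers, prot, port):
        """Register a mapping; refuse if one already exists for the triple."""
        key = (prog, vers, prot)
        if key in self._table:
            return False
        self._table[key] = port
        return True

    def pmapproc_getport(self, prog, vers, prot):
        """Return the registered port, or 0 if the service is not registered."""
        return self._table.get((prog, vers, prot), 0)


pmap = PortMapper()

# Server side: the file server (program 1234, version 2) registers at startup.
registered = pmap.pmapproc_set(1234, 2, IPPROTO_UDP, 40001)

# Client side: look up the port before sending the first RPC request.
port = pmap.pmapproc_getport(1234, 2, IPPROTO_UDP)
missing = pmap.pmapproc_getport(1234, 3, IPPROTO_UDP)
print(registered, port, missing)      # True 40001 0
```

Because only the portmapper needs a well-known port, the servers themselves are free to bind to whatever port is available when they start.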