The RHODOS Migration Facility 1

Damien De Paoli and Andrzej Goscinski (ddp@deakin.edu.au, ang@deakin.edu.au)
School of Computing and Mathematics, Deakin University, Geelong, Victoria 3217, Australia

Abstract

This paper looks at the design, implementation and performance of the RHODOS process migration facility, which reflects the RHODOS aspiration to support both load balancing and parallel execution on a distributed system. Following the presentation of the design issues, which reflect the requirements of both load balancing and parallel execution on a distributed system, research into an initial implementation and testing of the RHODOS process migration facility is presented. Performance measurements of this initial implementation have also been taken, and the results are reported in this paper. A comparison with several modern distributed operating systems shows that our migration facility already compares favourably.

1. This work was partly supported by the Australian Research Council Grant A and the Deakin University Research Grant

1 Introduction

In a distributed system there are many workstations which are either idle or lightly loaded, while others are heavily loaded. Users working at heavily loaded workstations are not provided with a satisfactory service; in particular, their response time can be very long. The overall system performance and the quality of service provided to an individual user can be improved if idle or lightly loaded workstations are used more efficiently and the computational load is transparently balanced [Goscinski 91].

Distributed systems have only recently been identified as a viable alternative to parallel computers as a platform for the parallel execution of user programs [Ahuja et al. 86], [Bal et al. 89], [Trottenberg 93]. This can be achieved by using idle and/or lightly loaded workstations to run parallel subtasks of a program. In other words, the overall performance of a distributed system and the performance experienced by individual users (measured in response time and/or throughput) can be further improved if a user program is executed in parallel on available idle and/or lightly loaded workstations. However, there are still many problems to be solved before a distributed system becomes a real platform for parallel execution.

One of the fundamental research problems that must be solved in order both to improve the overall performance of a distributed system and to allow transparent parallel execution of programs on a distributed system [Goscinski and Zhou 94] is global scheduling support. Global scheduling should be employed to allocate processes to workstations. For this purpose global scheduling should encompass both static allocation (to balance the load when the computational load does not change too often) and load balancing (to balance the load when the computational load continually fluctuates). Load balancing requires process migration to move processes away from heavily loaded workstations.
Process migration is a very complicated operation which is affected by several factors. When migrating a process, the way in which each of the following issues is resolved affects the performance and behaviour of process migration [Goscinski 91]:
- when a process is suspended relative to finding a destination for it;
- how the address space is transferred from the source to the destination;
- when a process's incoming communication reception is suspended;
- how a process is communicated with after it has been migrated.

Each of these decisions has many solutions, and these solutions have been used by previous process migration systems. However, from the results obtained, it is not apparent which of these solutions are superior. Therefore, in RHODOS, we wish to study all the possible solutions, ascertain the best solution to use under a given set of conditions, and thus adapt to achieve the best performance [De Paoli 93].

The goal of this paper is to demonstrate that a new design of a process migration facility [De Paoli 93] and its implementation are sound and can support both load balancing and parallel execution. Furthermore, we demonstrate here that our original design and basic implementation compare favourably with other current distributed operating systems with process migration support. In the next stage of our project we will demonstrate that it is possible to further improve the performance of the RHODOS migration facility, thus allowing the improvement of the overall distributed system by balancing load and employing parallel execution on a distributed system.

This report is structured as follows: Section 2 describes the design issues we felt important to process migration. Section 3 briefly indicates what some of the strategies are and those that have been utilised in this initial implementation. Section 4 reports on the research

into the implementation of the RHODOS process migration facility. Section 5 shows process migration in well-known distributed operating systems such as Amoeba, Chorus, Mach, Mosix and Sprite. Section 6 summarises process migration performance for RHODOS and the described systems. Finally, Section 7 concludes this report and indicates the direction of our future work.

2 Design Issues

Our initial research [Goscinski 91] and an analysis of several existing process migration systems [Artsy and Finkel 89], [Douglis and Ousterhout 87], [Douglis and Ousterhout 91], [Jul et al. 88], [Powell and Miller 83], [Smith 88], [Zayas 87] show that all use differing designs. In order to achieve the goal specified in the introduction we propose that the following issues should be considered when designing a process migration facility.

(i) The overall performance issues:
- High Performance: A process migration facility should be designed and implemented to reactivate a process in the shortest possible time.
- Efficiency: Because migrating a process consumes resources on an already overloaded computer, these resources should be released in the shortest possible time.
- Residual Dependence: An execution dependency exists when a migrant process depends on its source computer to execute. Such dependencies degrade the migrant process's performance, and increase the chance that a node failure will terminate the process. However, these dependencies are generally created to decrease the time the migrant process is frozen. Furthermore, a communication dependency exists when data is left on the source computer for communication with the migrant process. Note that there is a trade-off between the use of residual dependencies for performance and the avoidance of residual dependencies to increase reliability.
- Multiple Migration: A process should be capable of migrating many times without affecting the process execution.
- Team Migration: Team migration consists of migrating multiple processes from a source computer to a destination computer at one time. It should be supported to increase the level of concurrency of the migration operation; however, it makes the migration mechanism more complicated.

(ii) Software engineering and reliability issues:
- Policy-Mechanism Separation: The policy of when to migrate which process to where should be separate from the actual mechanism used to migrate a process.
- Mechanism-Mechanism Separation: The process migration mechanism should be designed so as not to affect other operating system functions, e.g., memory management and interprocess communication. Furthermore, the internal mechanisms of process migration should not depend upon each other.
- Transparency: The migrating process should see the same execution environment before and after migration. Not only should a process not notice being migrated; communication with the migrated process should be transparent as well.
- Reliability: The process migration system should be capable of withstanding either machine or communication failures. If any error occurs, the effect should be as if the process were never migrated at all, or in the worst case as if the migrant process had terminated due to machine failure.

3 Research into the Design of the RHODOS Migration Facility

To discuss how the aforementioned design issues are reflected in RHODOS, a brief introduction to the RHODOS system is required.

3.1 The RHODOS System

RHODOS is a microkernel based operating system that uses the client-server paradigm [De Paoli, et al 95]. The microkernel (Nucleus) performs the basic functions of interrupt handling, local interprocess communication, context switching, and page handling. The policies that govern these functions are embodied in kernel servers, with the support of system servers to provide the full functionality of a traditional operating system. The difference between kernel servers and system servers is that kernel servers have the privilege to alter kernel data. The following entities should be defined to describe process migration in RHODOS:
- A process consists of one single thread of execution, communication ports, an address space, and associated resource specific state, e.g., which files are open, which screen is being used, etc. This resource specific information is stored with the appropriate server/agent, e.g., for open files, the file agent stores the information. A process is the only active abstraction in the RHODOS microkernel;
- Ports are communication end-points accessible through the send and receive primitives [Joyce, et al 95];
- Spaces [Hobbs, et al. 95] are the RHODOS mapping between virtual memory and physical store.

3.2 The Placement of the Process Migration Facility in RHODOS

There are two places where a migration facility can be placed in an operating system: in user space or inside the kernel. Placing the migration facility in the kernel yields a faster migration facility. However, in some systems it is difficult to alter the kernel (and altering the kernel causes newer versions to be incompatible with older versions). Hence, placing the migration system in user space allows a simpler, more flexible system to be built.
This assumes that the migration facility is not already a part of the operating system's design.

Figure 1 Placement of the RHODOS migration facility (labels in the figure: User Processes, System Servers, Global Scheduler, Kernel Servers, Migration, IPC, Space, Process, Network, µKernel)

Since the inception of the RHODOS concept in 1987 [Gerrity et al. 91], process migration has been considered a fundamental design issue and has been factored into all design and implementation of the operating system. Thus, for efficiency, the process migration mechanism is a part of several kernel servers (Figure 1). Namely, the Migration Manager is the controlling entity, the Process Manager manages the process state transfer, the Space Manager manages the address space transfer and the InterProcess Communication Manager manages the communications transfer.

By placing the RHODOS migration facility within the RHODOS kernel servers, we fulfil our earlier design goals. Namely, an in-kernel implementation yields high performance and efficiency. With the RHODOS microkernel design, policy-mechanism separation is achieved by placing the mechanisms in the kernel, while the policy can be dictated from the Global Scheduler. Finally, mechanism-mechanism separation is achieved by giving each kernel server a clearly defined scope and role in process migration that is independent of the others.

3.3 Multiple Strategy Process Migration

In order (i) to test the feasibility of the RHODOS microkernel based approach, where a process migration facility is one of the kernel servers; and (ii) to carry out initial performance measurements to find whether migration can support parallel execution on a distributed operating system, we have addressed the following issues which affect the performance and behaviour of process migration [Goscinski 91]:
- how the address space is transferred from the source to the destination;
- when a process's incoming communication reception is suspended;
- how a process is communicated with after it has been migrated.

3.4 Address Space Migration

Address space migration consists of transferring the contents of the source computer's memory to the destination computer's memory. There are four methods by which this task can be achieved [Zhu, et al. 90]:
- direct copy;
- copy dirty pages to the destination;
- copy dirty pages to shared disk;
- lazy shipment.

In the current basic version of the Migration Manager we utilise copy dirty pages to the destination. This method only copies the pages that have been dirtied across to the destination, thus saving time and resources (Figure 2). However, if some of the dirty pages are never used again, then those pages have been copied redundantly.

Figure 2 Direct Copy of the Dirty Pages (labels in the figure: SOURCE Address Space, DESTINATION Address Space)
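The "copy dirty pages to the destination" strategy can be illustrated with a minimal sketch. The `Page` type and both function names are hypothetical and are not taken from RHODOS; the sketch also assumes that clean pages can be re-obtained at the destination by some other means, which the text above does not specify.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical page record: a page number, its contents and a dirty flag."""
    number: int
    data: bytes
    dirty: bool = False


def select_dirty_pages(address_space):
    """Return only the pages that have been modified (dirtied)."""
    return [page for page in address_space if page.dirty]


def copy_dirty_pages(source_space, destination_space):
    """Copy dirty pages into the destination (a dict keyed by page number).

    Returns the number of bytes transferred, so the saving relative to a
    direct copy of every page can be observed.
    """
    transferred = 0
    for page in select_dirty_pages(source_space):
        destination_space[page.number] = page.data
        transferred += len(page.data)
    return transferred


source = [
    Page(0, b"text"),                # clean: not copied
    Page(1, b"data", dirty=True),    # dirty: copied
    Page(2, b"stack", dirty=True),   # dirty: copied
]
destination = {}
bytes_sent = copy_dirty_pages(source, destination)
```

Only pages 1 and 2 reach the destination in this example, which is the saving the strategy aims for; the cost noted above remains, since a dirty page that is never referenced again is still copied.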

3.5 Communication Migration

The first issue with communication migration is the point at which message reception is stopped. The second is how to communicate with the migrated process once communication reception has been suspended. There are three moments when the incoming communication handling can be suspended:
- when the process is suspended;
- when the communication state has been migrated; and
- never.

In the current version we suspend the communication reception when the process is suspended. With this method (also called synchronous suspension), both the process and communications are suspended at the same time, as shown in Figure 3. Unfortunately with this method, whilst the existing communication and the address space are being migrated, later incoming communication must be rejected and the sender must retransmit these messages. Most importantly, address space migration generally takes longer than communication migration, because an entire address space is quite large compared to a message (or several messages). Thus, the communication migration will be completed whilst the address space is still being transferred, and the IPC manager then performs no functions for the process for a substantial period of time. Note that, on rare occasions, these messages could be numerous and/or sizeable, and thus it may be costly to reject incoming messages. There is therefore a threshold value (or size) of incoming messages that decides whether incoming communication is accepted or rejected. Within RHODOS, the network service can handle the rejection of these messages; hence the migration manager can decide on the fly to accept or reject the incoming communication.

Figure 3 Synchronous Communication Suspension (labels in the figure: Process and Communication Suspension, Resume Execution)

After a process has been migrated, there must be some method of maintaining communication with it.
The simplest method is to broadcast the result of a migration, so that each computer in the system knows where a migrant process is. However, this is inefficient, especially when the system is large. The hint fault method uses a hint to point to the location of a process. If a send to the process fails, then that process must have been migrated. When the send fails a fault occurs and the hint must be updated, generally by broadcasting to find the process. An extension to this is to keep the location of each process that migrates with the computer where the process was created, the origin computer. Thus, when a fault occurs, a message is sent to the origin computer to determine the current location of the migrant process. This is more efficient than broadcasting to locate the process. Thus, in RHODOS, we store the current location of each migrant process with its origin computer.
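The origin-computer scheme described above can be sketched as follows. All class and attribute names are hypothetical, and the sketch assumes that a sender can always determine a process's origin computer (for example from the process identifier); the text does not say how RHODOS does this.

```python
class Network:
    """Toy model of a system in which each origin computer tracks its migrants."""

    def __init__(self):
        self.running_on = {}    # pid -> computer the process currently runs on
        self.origin_of = {}     # pid -> computer the process was created on
        self.origin_table = {}  # origin computer -> {pid: current computer}

    def create(self, pid, host):
        self.running_on[pid] = host
        self.origin_of[pid] = host
        self.origin_table.setdefault(host, {})[pid] = host

    def migrate(self, pid, destination):
        # Only the origin computer is told about the move; nothing is broadcast.
        self.running_on[pid] = destination
        self.origin_table[self.origin_of[pid]][pid] = destination


class Sender:
    """A sender that caches location hints and repairs them via the origin."""

    def __init__(self, network):
        self.network = network
        self.hints = {}   # pid -> last known location
        self.faults = 0   # number of hint faults taken

    def send(self, pid, message):
        """Deliver message and return the computer it was delivered to."""
        origin = self.network.origin_of[pid]
        hint = self.hints.get(pid, origin)
        if self.network.running_on[pid] != hint:
            # The send failed: hint fault. Ask the origin computer for the
            # current location instead of broadcasting.
            self.faults += 1
            hint = self.network.origin_table[origin][pid]
        self.hints[pid] = hint
        return hint


net = Network()
net.create("p1", "A")
sender = Sender(net)
first = sender.send("p1", "hello")    # delivered at the origin, no fault
net.migrate("p1", "B")
second = sender.send("p1", "hello")   # stale hint: one fault, then delivered
third = sender.send("p1", "hello")    # hint is now correct, no further fault
```

Each migration costs one update at the origin computer and at most one extra message per sender with a stale hint, which is the efficiency argument made above against broadcasting.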

3.6 Avoiding Residual Dependency and Providing Multiple, Transparent Migration

In Section 3.2, we emphasised how our design fulfils our goals of high performance, efficiency, policy-mechanism separation and mechanism-mechanism separation. This leaves several design goals that we wished to meet but that have not been dealt with explicitly.

Residual dependencies are kept to a minimum in RHODOS by having the origin of the migrant process maintain its location in the system (to aid in communication with the migrant process). However, other than this small amount of data (process and current location), no other data is kept at the origin, and no data is kept at any computer that has at some time migrated the process. Thus, at any given time a process will be known by both the computer the process was created on and the computer the process is actually running on (which may be the same). Note that this is not the case if the address space is transferred by the copy-on-reference technique.

Multiple migration has been implemented to ensure effective load balancing, e.g., when a user logs into his/her workstation which already has processes migrated to it, those processes can be migrated again to a new destination host. Team migration is handled by migrating the two or more processes in the team with some extra information indicating the resource(s) being shared. Transparency is achieved by virtue of the fact that RHODOS is a message passing based system, and the operating system itself maintains communication transparency. Finally, reliability in RHODOS is maintained by having the migration facility utilise a transaction based approach.

3.7 Transaction Based Process Migration

In RHODOS, each process has at least two ports for communication and at least four spaces which map the text, data and stacks to physical memory [De Paoli, et al 95]. Files in RHODOS can be memory-mapped. If a file is memory-mapped then it is linked (from disk to memory) via a space.
This system allows one single mechanism to handle migration of memory-mapped files and of the text, data and stack spaces. Migrating a process in RHODOS involves migrating the process state, address space, communication state, and any other associated resources with the co-operation of the appropriate server, e.g., files [Panadiwal and Goscinski 94]. To perform reliable process migration in RHODOS, it is necessary to carry out the following transaction based sequence of operations:
- Send a message to transfer the process identification and to indicate which resources are about to be migrated. This message is considered the start of the migration transaction;
- Request the Process Manager to ensure the process is in a fit state to migrate (on the appropriate queue, not in a system call, etc.; see [De Paoli 93] for more details), then place the process on the frozen queue and transfer the process state to the destination;
- Request the appropriate servers to transfer the process resource details, e.g., File Server, Space Manager, IPCM, Device. Note that when the IPCM receives this request, it will freeze the process's ports;
- At the destination: the Migration Manager receives the control information, and creates state reflecting that a migration is underway. As each resource is transferred, the appropriate server notifies the Migration Manager. Once all resources have been received, the Migration Manager sends the result to the source Migration Manager;
- Finally:

the source Migration Manager waits until the destination Migration Manager sends a reply with the result, then removes the redundant process. The destination then starts the migrated process. This message is considered to commit the transaction.

In RHODOS the Space Manager transfers the address space. The address space can be transferred with one of many transfer strategies (selectable for each migration without recompilation). The InterProcess Communication Manager (IPCM) [Joyce, et al 95] in RHODOS manages a process's ports and all remote messages. When a process's ports are suspended, all messages (local and remote) are forwarded to the IPCM, which queues them until they can be forwarded to the process. Provision has been made in the design for other strategies to deal with incoming messages during freeze time.

4 The Execution and Performance of Process Migration in RHODOS

The microkernel approach and the design decisions we made affect the execution of process migration and its performance. This section addresses these two issues.

4.1 Execution of Process Migration in RHODOS

Figure 4 shows how process migration in RHODOS is executed. Firstly, once the Migration Manager has been notified of which process to migrate where (at time t0), the Migration Manager contacts the Process Manager to request that the process state be transferred to the destination computer. This request also ensures that the process is capable of being migrated at the present time (i.e., it is on the ready queue, not in a system call, etc.). Once the Process Manager ascertains that the process is a valid migration candidate, it freezes the process, encapsulates the process state into a message and sends the message to the destination Process Manager. Then the Process Manager sends an acknowledgement to the Migration Manager to inform it that the process state has been sent (and incidentally, that the process was in fact a valid candidate for migration).
Figure 4 First Phase of Process Migration (time t0 to t1; labels in the figure: SOURCE and DESTINATION Migration, Process, Space and IPC; Start Migration, Transfer State, Transfer Address Space, Transfer Ports; Process State, Address Space, Ports/Messages; State Ack, Space Ack, Ports Ack)

The Migration Manager then sends out requests to both the Space Manager and the IPC Manager to transfer the memory and communication details of the process, respectively.

Once the Space Manager receives the request, it will encapsulate the memory of the process into a message and send this to the destination Space Manager. Once this has been done, the Space Manager sends an acknowledgement to the Migration Manager to indicate that the process memory has been sent. Concurrently with the Space Manager's execution, the IPC Manager will receive the request to transfer the communication details of the process. The IPC Manager freezes the ports, encapsulates the ports and any messages on them, and transfers these to the destination IPC Manager; an acknowledgement is then sent to the Migration Manager to indicate that the process communication details have been migrated. Once all these acknowledgements have been accepted (at time t1), the source Migration Manager knows that the process has been migrated to the destination. Note that, as the Space Manager and IPCM are contacted concurrently, and due to the size of the process address space and communication details, the acknowledgements they send to the Migration Manager can be received in any order.

Figure 5 Second Phase of Process Migration (time t1 to t2; labels in the figure: Commit Migration; Activate Process, Activate Spaces, Activate Ports at the destination; Free State, Free Spaces, Free Ports at the source)

At time t1 there exists a copy of the migrant process on both source and destination. Thus, the Migration Managers on both computers must confer and commit the migration (or cancel it if problems have occurred); this is shown in Figure 5. The destination Migration Manager (at time t1) sends an acknowledgement to the source Migration Manager (utilising a reliable send offered by the RHODOS network protocol [Joyce, et al 95]). Once this message has been sent, the migration can be committed on both source and destination. For the destination this means sending a message to the local Process, Space and IPC Managers to activate the process resources. Once these messages are received and processed, the process will start executing on the destination node.
Once the source Migration Manager receives the commit migration message, it sends three local messages to the Process, Space and IPC Managers to free up all the migrated process resources. Upon the reception and processing of these messages the migrated process resources will be removed, and the process will no longer exist on the source computer.
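The two phases described above (transfer from t0 to t1, commit from t1 to t2) can be summarised in a small single-threaded sketch. It is an illustration under simplifying assumptions, not the RHODOS implementation: the names `Host`, `State` and `migrate` are hypothetical, message passing is replaced by direct function calls, and a transfer failure is simulated with the `fail_on` parameter.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    FROZEN = "frozen"


class Host:
    """Hypothetical computer holding processes as pid -> (state, resources)."""

    def __init__(self, name):
        self.name = name
        self.processes = {}


# Resources moved by the Process, Space and IPC servers respectively.
RESOURCES = ("process_state", "address_space", "ports")


def migrate(pid, source, destination, fail_on=None):
    """Return True if the migration committed, False if it was cancelled."""
    state, resources = source.processes[pid]
    if state is not State.RUNNING:
        return False  # not a valid migration candidate

    # Phase 1 (t0..t1): freeze the process and transfer each resource.
    source.processes[pid] = (State.FROZEN, resources)
    received = {}
    for name in RESOURCES:
        if name == fail_on:
            # Cancel: the effect is as if the process had never been migrated.
            source.processes[pid] = (State.RUNNING, resources)
            return False
        received[name] = resources[name]

    # Phase 2 (t1..t2): commit. The destination activates the process and
    # the source frees the now redundant copy.
    destination.processes[pid] = (State.RUNNING, received)
    del source.processes[pid]
    return True


src, dst = Host("source"), Host("destination")
src.processes["p1"] = (State.RUNNING, {
    "process_state": "registers",
    "address_space": "pages",
    "ports": ["port0", "port1"],
})
committed = migrate("p1", src, dst)
```

Until the commit step the source still holds a complete, frozen copy of the process, which is what allows a failed transfer to be cancelled without losing the process.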

4.2 Performance of Process Migration in RHODOS

When processes are migrated, one can take advantage of the fact that most migrations are to hosts on the same local area network (LAN) [De Paoli and Goscinski 95a]. The RHODOS process migration facility is capable of migrating within the one LAN with a reduced protocol stack, or to hosts on another separate network by using a complete protocol stack. A series of measurements were made to determine how long it took to migrate a process (from time t0 until the commit migration message is received). In this initial implementation (on a SUN 3/50), migrating a process within the one LAN takes:

* vms * p milliseconds (EQ 1)

whereas migrating between two separate LANs takes:

* vms * p milliseconds (EQ 2)

where p = the number of extra ports (a process has two by default) and vms = virtual memory size (in Kilobytes). Empirically, the average time to migrate a 100 Kilobyte process on the same LAN was ms with a standard deviation of 0.5 ms. On two separate LANs empirically it took ms with a standard deviation of 0.3 ms to migrate a 100 Kilobyte process.

5 Related Work on Migration Facilities

This section details process migration on other operating systems, namely Amoeba, Chorus, Mach, Mosix and Sprite.

5.1 Amoeba

The Amoeba distributed operating system was designed as a distributed operating system of the 1990s [Tanenbaum et al. 90]. This impacts on process migration, e.g., Amoeba does not use virtual memory, hence process migration is limited to directly copying the process address space from source to destination. The implementation of process migration on the Amoeba system at the University of South Australia [Steketee, et al. 94], [Zhu, et al. 95] utilises a migration server which is run as a user mode process.
The steps to migrate a process are:
- Upon receipt of a migration request, the source migration server invokes a get_info to obtain the migrant process capability, a set_owner to allow the migration server to stun the process, and then stuns the process (pro_stun).
- The source process server sends a process descriptor for the migrating process to the destination migration server. This contains the state of the process.
- The destination migration server uses this to set up a copy of the process on the destination host, by making requests to the destination process server to set up the kernel state and to allocate memory for the process segments.
- The destination migration server sends a series of RPC requests to the source process server, copying the memory from source to destination.
- The migration is completed by the passing of an execution token for the process from the source to the destination migration server in a message exchange between the two.

In this implementation (on a 40 MHz i80386 PC), to migrate a 100 Kilobyte process

takes approximately 1.5 seconds [Zhu, et al. 95]. There is no breakdown of where the time is spent; however, two factors cause process migration to be slower than would be expected given that communication speed is one of Amoeba's largest assets (throughput of over 700 Kilobytes/sec) [Tanenbaum et al. 90]. Firstly, the migration server is a user process and as such has a low priority; coupled with a timeslice of 100 ms, this means the migration server can be left waiting for CPU time on a heavily loaded CPU. Secondly, the migration server enters the kernel three times (get_info, set_owner and pro_stun), each time giving up its timeslice. Note that get_info, set_owner and pro_stun can be performed from the destination migration server via RPCs to the process server on the source node, yielding faster migration times when the destination is lightly loaded [Zhu, et al. 95]. Note that in [Zhu 95], Zhu states that the implementation has been incorporated into the kernel, which has improved performance; however, no performance figure has yet been given.

5.2 Chorus

Chorus is an object oriented, message passing based distributed operating system that utilises a microkernel. A process in Chorus is called an actor. An actor can have multiple threads of execution, and utilises ports and messages to allow processes to communicate with each other [Rozier et al. 92]. The Amadeus Project [O'Connor et al. 94] follows Chorus's object oriented flavour; hence process (actor) migration on Chorus is performed by three modules: kernel, transport and policy. The kernel module is responsible for encapsulating the state information of migrating processes and for re-establishing these processes on other nodes using this state [and] for providing load information to policy modules [Rozier et al. 92]. The policy module is the entity that (using several criteria) decides to migrate a process. Finally, the job of actually transferring the process is performed by the transport module.
To migrate a process in Chorus:
- The policy module, after certain criteria are met (with the information coming from the kernel module), decides to migrate a process, and informs the transport module of this decision;
- The transport module then: requests the kernel module to encapsulate the process state; ships this state information to the destination machine (address space transfer is performed by flushing the dirty pages to the segment mapper, allowing remote paging to be used, similar to Sprite and Mosix); and requests the kernel module to re-establish the process.

This implementation of process migration (on top of Chorus) modified the network manager so that a migrating process's ports are marked as migrating. All messages sent to a port result in a location request for that port being performed. When a process is migrating, these requests result in the requesting node being informed that the process is migrating. The requesting node reissues this request until either the port has finished migrating or a timeout occurs. The terminal driver was also rewritten to use Chorus IPC primitives. Thus, all terminal traffic is handled as part of the communication migration, instead of requiring a mechanism to sever and reattach to a terminal whilst ensuring that no terminal traffic is lost.

5.3 Mach

Mach is a message passing based distributed operating system that utilises a microkernel. In Mach, the unit of execution is called a task, where each task can have multiple threads of execution. The work presented in [Milojicic et al. 93] and [Milojicic 94a] details Mach's task migration facility. The task migration facility for Mach relies on Mach NORMA (NO Remote Memory Access). NORMA provides transparent network IPC, distributed shared memory and a distributed capability space. In Mach, the task migration facility has been implemented firstly in user space (Simple Migration Server - SMS) and secondly in-kernel (Optimised Migration Server - OMS). The user space implementation is favoured because the in-kernel task migration does not provide much better performance over the user space implementation and because it fits the design philosophy of Mach better [Milojicic 94a], p. 61, and due to its simplicity and robustness compared to the in-kernel implementation [Milojicic 94a], p. 56.

To perform task migration the SMS server:
- suspends the task and aborts the threads to clear the kernel state;
- interposes the task/thread kernel ports;
- transfers the address space;
- transfers the thread state;
- transfers the capabilities (NORMA does the actual port transfer);
- transfers the other task/thread state;
- interposes back the task/thread kernel ports (at the destination site); and
- resumes the task.

Communication transparency is preserved during freeze time by the use of the interposition ports. By interposing ports with the task's kernel ports, all messages normally sent to the migrating task are instead sent to SMS. The address space is transferred via NORMA (or by several user space transfer strategies if OMS is used). NORMA also handles the capability migration. Inherent in Mach's task migration are two points to consider:
- Using NORMA to transparently handle address space migration (which utilises Copy-On-Reference) means there is a severe executional residual dependency.
- Mach utilises a personality emulator (e.g.
Unix), so that a task passes personality specific data to the emulator. Thus, migrated processes also have an extra executional residual dependency.

5.4 Mosix

Mosix is a distributed operating system based on Unix. The kernel has been split into three layers: machine dependent and machine independent layers, and a linker between them. This formation allows user processes to run in a site-independent manner. Processes in Mosix have the same structure as in Unix; however, as Mosix supports migration, there are a few implications. Areas affected include remote paging, locating other processes and interprocess communication [Barak et al. 93]. The changes to these areas include: using a site-independent reference to pages; extra fields and flags in the process table to allow load balancing and migration; and utilising a home node structure to find migrated processes. To migrate a process in Mosix:
- The source node allocates a process frame on the remote node and initialises the process memory regions, passing along the process u area;
- Dirty pages are transferred to the destination (other pages are demand-paged from the process executable file);

- Finally, a commit message is sent, which sets the remaining state of the process and restarts the process.

5.5 Sprite

Sprite is a kernel (as opposed to micro-kernel) based system. Sprite's interface appears like that of Unix; however, the implementation of its kernel is completely different [Douglis and Ousterhout 87]. Migrating a process in Sprite is carried out in the following way [Douglis and Ousterhout 91]:
- An RPC is sent to the destination to confirm that migration is allowed;
- The process is interrupted;
- The state of the process is transferred;
- The virtual address space is transferred by flushing any dirty pages to a shared file server;
- File descriptors and the current working directory are transferred;
- An RPC is sent to conclude the migration.

Once this is done, the process can run on its new destination and demand pages in memory as requested.

6 Performance Summary

The design of the RHODOS migration facility is a novel solution: it has been part of the design of the whole distributed operating system from the beginning, and as such is an integral part of that design. Most projects create an operating system and then add process migration afterwards, rather than integrating its requirements into their initial design. The design of the RHODOS migration facility follows the client-server paradigm. Thus, this facility is a separate entity from other kernel servers, such as Process, InterProcess Communication and Space. For these reasons it is necessary to carry out initial performance comparisons with other current distributed operating systems with process migration support, to identify our approach's soundness and feasibility. The following section details the type of process environment on each system, and the steps the migration facilities take.
In Chorus, migrating a process (on a network of Micro Vax-IIs) takes [O'Connor et al. 94]:

*α + 120*β milliseconds

where α is the number of kilobytes to be flushed at the source node of the migration, and β is the number of these pages later referenced by the actor.

Migrating a task in the Mach operating system (on a 33 MHz i80486 PC) takes [Milojicic 94a]:

SMS time: *n *rr + 5.5*sr + 5.5*sor + 58*t milliseconds
OMS time: *n + 7.9*rr + 1.9*sr + 1.1*sor + 5.4*t milliseconds

where rr is the number of receive capabilities, sr is the number of send capabilities, sor is the number of send-once capabilities, n is the number of regions, and t is the number of threads [Milojicic 94b].

Migrating a process in MOS [Barak and Shiloh 85] (on PDP-11s connected by a 10 Mbit ring) takes approximately 5.4 ms per kilobyte.

In Sprite, migrating a process (on a Sun 3) takes [Douglis and Ousterhout 91]:

*f *fs *vms milliseconds

where f is the number of open files, fs is the file size in kilobytes, and vms is the virtual memory size in kilobytes.

The performance results of Amoeba, Chorus, Mach, Mosix, RHODOS and Sprite are summarised in Table 6.1. The dominant factor in process migration performance is how the address space of the migrant process is transferred, and as such this is listed in the table. For each system the network hardware is a 10 Mbit Ethernet, and the computer hardware is listed. A breakdown of the time it takes to migrate a process is given (where available). The time a process spends frozen (from the start of the migration to the first instant when the process continues execution on the new host) is recorded in the column entitled "freeze time for 100K process". If either flush to disk or copy on reference is used, then as the process executes on the destination host it must retrieve the pages of its address space as required. Such processes therefore wait again (are frozen) for each page to be retrieved. Finally, the time spent frozen if the migrated process touches all of its memory once it has been migrated is listed in the column entitled "freeze time for worst case (100K)". Note that a direct comparison between these times is inappropriate, since the systems were measured on different hardware.

Table 6.1 Comparison of migration for different systems

Distributed Operating System | how address space is copied | hardware (i) | time to migrate (in milliseconds) | freeze time for 100K process | freeze time for worst case (100K)
-----------------------------|-----------------------------|--------------|-----------------------------------|------------------------------|----------------------------------
Amoeba | direct copy | 386 PC | N/A | 1.5 sec | 1.5 sec
Chorus | flushed to disk | Micro Vax-IIs | *v + 120*β ms | 1 s | 12.2 sec
Mach | copy on reference | 486 PC | *n *rr + 5.5*sr + 5.5*sor + 58*t | 500 ms | 881 ms (ii)
Mosix (iii) | direct copy | PDP | * v | 540 ms | 540 ms
RHODOS | direct copy | Sun 3/ | *p *v; *p *v | ms; ms | ms; ms
Sprite (iii) | flushed to disk | Sun 3/ | *f *fs *v | 285 ms | 480 ms (iv)

i. All hardware is connected by a 10 Mbit Ethernet LAN.
ii. Based on paging time of ms [Milojicic 94a], p. 76.
iii. More recent results are published, however on different network hardware to the other systems.
iv. Based on paging time of 15 ms [Douglis and Ousterhout 87].

Key: v = virtual memory (in KB), β = the number of (512 byte) pages later referenced, n = number of regions, rr = receive capabilities, sr = send capabilities, sor = send-once capabilities, t = threads, p = number of ports, f = number of open files, fs = file size (in KB), and p = the number of extra ports (a RHODOS process has two by default).
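The relationship between the two freeze-time columns can be checked with a little arithmetic. The Python sketch below is illustrative only and is not part of any of the systems compared. The function names are hypothetical, and the 8 KB page size used for the Sun 3 is an assumption that the paper does not state. It models the worst case for flush to disk or copy on reference as the initial freeze time plus one paging delay for every page of the address space, and models direct copy as a per-kilobyte cost.

```python
import math


def worst_case_freeze_ms(initial_freeze_ms, space_kb, page_kb, paging_ms):
    """Initial freeze plus one paging delay per page of the address space.

    Applies when pages are fetched on demand after migration (flush to
    disk or copy on reference) and the process touches all of its memory.
    """
    pages = math.ceil(space_kb / page_kb)
    return initial_freeze_ms + pages * paging_ms


def direct_copy_freeze_ms(ms_per_kb, space_kb):
    """Freeze time when the whole address space is copied before restart."""
    return ms_per_kb * space_kb


# Sprite: 285 ms freeze and 15 ms paging time are taken from Table 6.1
# and footnote iv; the 8 KB page size is an ASSUMPTION, not from the paper.
sprite_worst = worst_case_freeze_ms(285, 100, 8, 15)

# MOS: approximately 5.4 ms per kilobyte, as quoted in the text.
mos_freeze = direct_copy_freeze_ms(5.4, 100)

print(f"Sprite worst case for 100K: {sprite_worst} ms")
print(f"MOS freeze time for 100K: {mos_freeze:.0f} ms")
```

Under the assumed 8 KB page size a 100K process occupies 13 pages, so the Sprite worst case comes to 285 + 13*15 = 480 ms, which agrees with the table entry. The MOS figure of 5.4 ms per kilobyte gives 540 ms for 100K, and because direct copy leaves no pages behind, its worst case equals its freeze time.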

7 Conclusion

Process migration is considered an integral part of the RHODOS system, and as such, since the beginning of the design of the whole RHODOS system, the needs and impact of process migration have been taken into account. Because of this, process migration was easy to implement and is well insulated from the rest of the kernel, unlike what has been found in earlier implementations [Theimer et al. 85], [Douglis and Ousterhout 87]. In fact, even microkernel based systems such as Mach and Chorus required changes to support process migration [Milojicic 94a], [O'Connor et al. 94]. From the inception of RHODOS, we have incorporated process migration into its design, yielding an efficient, reliable and secure migration facility.

Obviously, if RHODOS intends to increase the overall system throughput by utilising load balancing and parallel execution, then the process migration facility must be fast. Even in this early implementation, our design compares favourably with other current systems. This justifies our initial attempts, and spurs us on to complete the implementation of our design. Hence, we will continue to optimise both local and remote IPC. Of the strategies to be implemented, the most important decision is how to migrate the address space. As such, we will finish implementing flush to disk and copy on reference, and then undertake full performance testing of each strategy to determine when to use a particular strategy. Using this information, we can build an adaptive migration facility that performs at its best given any system load [De Paoli et al. 95].

References

[Ahuja et al. 86] S. Ahuja, N. Carriero and D. Gelernter. Linda and Friends. IEEE Computer, October.
[Artsy and Finkel 89] Y. Artsy and R. Finkel. Designing a Process Migration Facility: The Charlotte Experience. IEEE Computer, September.
[Bal et al. 89] H. Bal, J. Steiner and A. Tanenbaum. Programming Languages for Distributed Computing Systems. ACM Computing Surveys, September.
[Barak and Shiloh 85] A. Barak and A. Shiloh. A Distributed Load-balancing Policy for a Multicomputer. Software - Practice and Experience, Vol. 15(9), September.
[Barak et al. 93] A. Barak, S. Guday and R. Wheeler. The MOSIX Distributed Operating System: Load Balancing for UNIX. Springer-Verlag.
[De Paoli 93] D. De Paoli. The Multiple Strategy Process Migration for RHODOS: The Logical Design. Technical Report TR C93/37, Deakin University, Australia.
[De Paoli and Goscinski 95a] D. De Paoli and A. Goscinski. The Influence of Domain and Interdomain Process Migration on the Performance of Parallel Execution on Distributed Systems. Proceedings of the International Conference on Parallel and Real Time Systems, Perth, Australia, September (in print).
[De Paoli et al. 95] D. De Paoli, A. Goscinski, M. Hobbs and G. Wickham. The RHODOS Microkernel, Kernel Servers and Their Cooperation. IEEE First International Conference on Algorithms and Architectures for Parallel Processing, Brisbane, April.
[Douglis and Ousterhout 87] F. Douglis and J. Ousterhout. Process Migration in the Sprite Operating System. Proceedings of the 7th International Conference on Distributed Computing Systems, Berlin, September.
[Douglis and Ousterhout 91] F. Douglis and J. Ousterhout. Transparent Process Migration: Design Alternatives and the Sprite Implementation. Software - Practice and Experience, 21.
[Gerrity et al. 91] G. W. Gerrity, A. Goscinski, J. Indulska, W. Toomey and W. Zhu. Can We Study Design Issues of Distributed Operating Systems in a Generalized Way? SEDMS II Symposium on Experiences with Distributed and Multiprocessor Systems, March.
[Goscinski 91] A. Goscinski. Distributed Operating Systems: The Logical Design. Addison-Wesley.
[Goscinski and Zhou 94] A. Goscinski and W. Zhou. Towards a Global Computer: Improving the Overall Distributed System Performance and the Computational Services Provided to Users by Employing Global Scheduling and Parallel Execution. Proposal: ARC Research Grant.
[Hobbs et al. 95] M. Hobbs, G. Wickham, D. De Paoli and A. Goscinski. Memory Spaces for the RHODOS Multi-threaded Microkernel Systems. Proceedings of the International Conference on Automation, Indore, India, December 1995 (in print).
[Joyce et al. 95] P. Joyce, D. De Paoli, A. Goscinski and M. Hobbs. Implementation and Performance of the Interprocess Communications Facility in RHODOS. Proceedings of the International Conference on Networks, Singapore, October.
[Jul et al. 88] E. Jul, H. Levy, N. Hutchinson and A. Black. Fine-grained Mobility in the Emerald System. ACM Transactions on Computer Systems, 6(1), February.
[Milojicic et al. 93] D. Milojicic, W. Zint, A. Dangel and P. Giese. Task Migration on the top of the Mach Microkernel. Proceedings of the Third USENIX Mach Symposium, April.
[Milojicic 94a] D. Milojicic. Load Distribution: Implementation for the Mach Microkernel. Vieweg.
[Milojicic 94b] D. Milojicic. Private Communication. October.
[O'Connor et al. 94] M. O'Connor, B. Tangney, V. Cahill and N. Harris. Micro-kernel Support for Migration. Technical Report TCD-CS, Trinity College, Dublin, Ireland.
[Panadiwal and Goscinski 94] R. Panadiwal and A. Goscinski. A highly recoverable, reliable and efficient transaction oriented file service for distributed environment. Proceedings of IEEE Region 10's Ninth Annual International Conference on: Frontiers of Computer Technology, Singapore, August.
[Powell and Miller 83] M. Powell and B. Miller. Process Migration in DEMOS/MP. ACM Operating System Review, October.
[Rozier et al. 92] M. Rozier, V. Abrossimov, F. Armand, M. Gien, M. Guillemont, F. Hermann and C. Kaiser. Chorus (Overview of the Chorus Distributed Operating System). USENIX Workshop on Micro-Kernels and Other Kernel Architectures, April.
[Steketee et al. 94] C. Steketee, W. Zhu and P. Moseley. Implementation of Process Migration in Amoeba. Proceedings of the 14th Conference on Distributed Computing Systems, Poland, June.
[Smith 88] J. Smith. A Survey of Process Migration Mechanisms. ACM Operating Systems Review, 22(3):28-40, July.
[Tanenbaum et al. 90] A. Tanenbaum, R. van Renesse, H. van Staveren, G. Sharp, S. Mullender, J. Jansen and G. van Rossum. Experiences with the Amoeba Distributed Operating System. Communications of the ACM, Volume 33, No. 12, December.
[Theimer et al. 85] M. Theimer, K. Lantz and D. Cheriton. Preemptable Remote Execution Facilities for the V-System. Proceedings of the 10th ACM Symposium on Operating Systems Principles, December.
[Trottenberg 93] U. Trottenberg. Are Multiworkstations Replacing Supercomputers or Massively Parallel Systems? Questions, Facts and a Result. GMD-Spiegel, The Journal of the German National Research Center for Computer Science (GMD).
[Zayas 87] E. Zayas. Attacking the Process Migration Bottleneck. Proceedings of the 11th ACM Symposium on Operating Systems Principles, November.
[Zhu et al. 90] W. Zhu, A. Goscinski and G. W. Gerrity. Process Migration in RHODOS. Technical Report CS90/9, UNSW, Australia, March.
[Zhu et al. 95] W. Zhu, C. Steketee and B. Muilwijk. Load Balancing and Workstation Autonomy on Amoeba. Australian Computer Science Communications, Vol. 17, No. 1, February.
[Zhu 95] W. Zhu. Personal Communication, June.
