The RHODOS Migration Facility 1

Damien De Paoli and Andrzej Goscinski (ddp@deakin.edu.au, ang@deakin.edu.au)
School of Computing and Mathematics, Deakin University, Geelong, Victoria 3217, Australia

Abstract

This paper looks at the design, implementation and performance of the RHODOS process migration facility, which reflects the RHODOS aspiration to support both load balancing and parallel execution on a distributed system. Following the presentation of the design issues, which reflect the requirements of both load balancing and parallel execution on a distributed system, research into an initial implementation and testing of the RHODOS process migration facility is presented. Performance measurements of this initial implementation have also been taken, and the results are reported in this paper. A comparison with several modern distributed operating systems shows that our migration facility already compares favourably.

1. This work was partly supported by the Australian Research Council Grant A and the Deakin University Research Grant

1 Introduction

In a distributed system there are many workstations which are either idle or lightly loaded, while others are heavily loaded. Users working at heavily loaded workstations are not provided with a satisfactory service. In particular, their response time can be very long. The overall system performance and quality of service provided to an individual user can be improved if idle or lightly loaded workstations are used more efficiently and the computational load is transparently balanced [Goscinski 91].

A distributed system has only recently been identified as a viable alternative to parallel computers as a platform for parallel execution of user programs [Ahuja et al. 86], [Bal et al. 89], [Trottenberg 93]. This can be achieved by using idle and/or lightly loaded workstations to run parallel subtasks of a program. In other words, the overall performance of a distributed system and the performance seen by individual users (measured in response time and/or throughput) can be further improved if a user program is executed in parallel on available idle and/or lightly loaded workstations. However, there are still many problems to be solved before a distributed system becomes a real platform for parallel execution.

One of the fundamental research problems which should be solved in order both to improve the overall performance of a distributed system and to allow transparent parallel execution of programs on a distributed system [Goscinski and Zhou 94] is global scheduling support. Global scheduling should be employed to allocate processes to workstations. For this purpose global scheduling should encompass both static allocation (to balance the load when the computational load does not change too often) and load balancing (to balance the load when the computational load continually fluctuates). Load balancing requires process migration to move processes away from heavily loaded workstations.
Process migration is a very complicated operation which is affected by several factors. When migrating a process, the way in which each of the following issues is resolved affects the performance and behaviour of process migration [Goscinski 91]:

- When a process is suspended relative to finding a destination for it;
- How the address space is transferred from the source to the destination;
- When a process's incoming communication reception is suspended;
- How a process is communicated with after it has been migrated.

Each of these issues has many solutions, and these solutions have been used by previous process migration systems. However, from the results obtained, it is not apparent which of these solutions are superior. Therefore, in RHODOS, we wish to study all the possible solutions, ascertain the best solution to use under a given set of conditions, and thus adapt to achieve the best performance [De Paoli 93].

The goal of this paper is to demonstrate that a new design of a process migration facility [De Paoli 93] and its implementation are sound and can support both load balancing and parallel execution. Furthermore, we demonstrate here that our original design and basic implementation compare favourably with other current distributed operating systems with process migration support. In the next stage of our project we will demonstrate that it is possible to further improve the performance of the RHODOS migration facility, thus allowing the improvement of the overall distributed system by balancing load and employing parallel execution on a distributed system.

This report is structured as follows: Section 2 describes the design issues we felt important to process migration. Section 3 briefly indicates what some of the strategies are and those that have been utilised in this initial implementation. Section 4 reports on the research into the implementation of the RHODOS process migration facility. Section 5 describes process migration in well-known distributed operating systems such as Amoeba, Chorus, Mach, Mosix and Sprite. Section 6 summarises process migration performance for RHODOS and the described systems. Finally, Section 7 concludes this report and indicates the direction of our future work.

2 Design Issues

Our initial research [Goscinski 91] and an analysis of several existing process migration systems [Artsy and Finkel 89], [Douglis and Ousterhout 87], [Douglis and Ousterhout 91], [Jul et al. 88], [Powell and Miller 83], [Smith 88], [Zayas 87] show that all use differing designs. In order to achieve the goal specified in the introduction we propose that the following issues should be considered when designing a process migration facility.

(i) The overall performance issues:

- High Performance: A process migration facility should be designed and implemented to reactivate a process in the shortest possible time.
- Efficiency: Because migrating a process consumes resources on an already overloaded computer, these resources should be released in the shortest possible time.
- Residual Dependence: Execution dependency arises when a migrant process depends on its source computer to execute. Such dependencies degrade the migrant process's performance and increase the chance that a node failure will terminate the process. However, these dependencies are generally created to decrease the time the migrant process is frozen. Furthermore, communication dependency arises when data is left on the source computer for communication with the migrant process. Note that there is a trade-off between the use of residual dependencies for performance and the avoidance of residual dependencies to increase reliability.
- Multiple Migration: A process should be capable of migrating many times without affecting the process's execution.
- Team Migration: This consists of migrating multiple processes from a source computer to a destination computer at one time. Team migration should be supported to increase the level of concurrency of the migration operation; however, it means that the migration mechanism becomes more complicated.

(ii) Software engineering and reliability issues:

- Policy-Mechanism Separation: The policy of when to migrate which process to where should be separate from the actual mechanism used to migrate a process.
- Mechanism-Mechanism Separation: The process migration mechanism should be designed so as not to affect other operating system functions, e.g., memory management and interprocess communication. Furthermore, the internal mechanisms of process migration should not be dependent upon each other.
- Transparency: The migrating process should see the same execution environment before and after migration. Not only should a process not notice being migrated, but communication with the migrated process should be transparent as well.
- Reliability: The process migration system should be capable of withstanding either machine or communication failures. If any error occurs, the effect should be as if the process were never migrated at all, or in the worst case as if the migrant process had terminated due to machine failure.

3 Research into the Design of the RHODOS Migration Facility

To discuss how the aforementioned design issues are reflected in RHODOS, a brief introduction to the RHODOS system is required.

3.1 The RHODOS System

RHODOS is a microkernel based operating system that uses the client-server paradigm [De Paoli, et al 95]. The microkernel (Nucleus) performs the basic functions of interrupt handling, local interprocess communication, context switching, and page handling. The policies that govern these functions are embodied in kernel servers, with the support of system servers to provide the full functionality of a traditional operating system. The difference between kernel servers and system servers is that kernel servers have the privilege to alter kernel data.

The following entities should be defined to describe process migration in RHODOS:

- A process consists of one single thread of execution, communication ports, an address space, and associated resource specific state, e.g., which files are open, which screen is being used, etc. This resource specific information is stored with the appropriate server/agent, e.g., for open files, the file agent stores the information. A process is the only active abstraction in the RHODOS microkernel;
- Ports are communication end-points accessible through the send and receive primitives [Joyce, et al 95];
- Spaces [Hobbs, et al. 95] are the RHODOS mapping between virtual memory and physical store.

3.2 The Placement of the Process Migration Facility in RHODOS

There are two places where a migration facility can be placed in an operating system: in user space or inside the kernel. Placing the migration facility in the kernel yields a faster migration facility. However, in some systems it is difficult to alter the kernel (and altering the kernel causes newer versions to be incompatible with older versions). Hence, placing the migration system in user space allows a simpler, more flexible system to be built. This assumes that the migration facility is not already a part of the operating system's design.

[Figure 1: Placement of the RHODOS migration facility. Labels shown: User Processes, System Servers, Global Scheduler, Kernel Servers, Migration, IPC, Space, Process, Network, µKernel.]

Since the inception of the RHODOS concept in 1987 [Gerrity et al. 91], process migration has been considered a fundamental design issue and has been factored into all design and implementation of the operating system. Thus, for efficiency, the process migration mechanism is a part of several kernel servers (Figure 1). Namely, the Migration Manager is the controlling entity, the Process Manager manages the process state transfer, the Space Manager manages the address space transfer and the InterProcess Communication Manager manages the communications transfer.

By placing the RHODOS migration facility within the RHODOS kernel servers, we fulfil our earlier design goals. Namely, an in-kernel implementation yields high performance and efficiency. With the RHODOS microkernel design, policy-mechanism separation is achieved by placing the mechanisms in the kernel servers, while the policy can be dictated from the Global Scheduler. Finally, mechanism-mechanism separation is achieved by giving each kernel server a clearly defined scope and role in process migration that is independent of the others.

3.3 Multiple Strategy Process Migration

In order (i) to test the feasibility of the RHODOS microkernel based approach, where a process migration facility is one of the kernel servers; and (ii) to carry out initial performance measurements to find whether migration can support parallel execution on a distributed operating system, we have addressed the following issues which affect the performance and behaviour of process migration [Goscinski 91]:

- How the address space is transferred from the source to the destination;
- When a process's incoming communication reception is suspended;
- How a process is communicated with after it has been migrated.

3.4 Address Space Migration

Address space migration consists of transferring the contents of the source computer's memory to the destination computer's memory. There are four methods by which this task can be achieved [Zhu, et al. 90]:

- direct copy;
- copy dirty pages to the destination;
- copy dirty pages to shared disk;
- lazy shipment.

In the current basic version of the Migration Manager we utilise copy dirty pages to the destination. This method copies only the pages that have been dirtied across to the destination, thus saving time and resources (Figure 2). However, if some of the dirty pages are never used again after migration, then those pages have been copied redundantly.

[Figure 2: Direct Copy of the Dirty Pages, from the source address space to the destination address space.]
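The "copy dirty pages to the destination" strategy described above can be illustrated with a minimal sketch. It assumes an address space modelled as a mapping from page number to a page record with a dirty flag; the names Page, copy_dirty_pages and the send callback are hypothetical and are not part of RHODOS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Page:
    """Hypothetical page record: contents plus a dirty flag."""
    data: bytes
    dirty: bool = False


def copy_dirty_pages(space: Dict[int, Page],
                     send: Callable[[int, bytes], None]) -> List[int]:
    """Send only the dirtied pages and return their page numbers.

    Clean pages are skipped; how they reach the destination is
    outside the scope of this sketch.
    """
    sent: List[int] = []
    for number in sorted(space):
        page = space[number]
        if page.dirty:
            send(number, page.data)
            sent.append(number)
    return sent


# Usage: a three-page space in which pages 0 and 2 have been written to.
source = {0: Page(b"a", dirty=True), 1: Page(b"b"), 2: Page(b"c", dirty=True)}
destination: Dict[int, bytes] = {}
transferred = copy_dirty_pages(source, destination.__setitem__)
```

In this toy run only pages 0 and 2 are transferred, which is the saving the strategy aims for; the redundant-copy cost mentioned above would show up if the destination never touched one of those pages again.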

3.5 Communication Migration

The first issue with communication migration is the point at which message reception is stopped. The second is how to communicate with the migrated process once communication reception has been suspended. There are three moments when the incoming communication handling can be suspended:

- when the process is suspended;
- when the communication state has been migrated; and
- never.

In the current version we suspend the communication reception when the process is suspended. With this method (also called synchronous suspension), both the process and its communications are suspended at the same time, as shown in Figure 3. Unfortunately, with this method, whilst the existing communication and the address space are being migrated, later incoming communication must be rejected and the sender must retransmit these messages. Most importantly, address space migration generally takes longer than communication migration, because an entire address space is quite large compared to a message (or several messages). Thus the communication migration will be completed whilst the address space is still being transferred, and the IPC Manager then performs no functions for the process for a substantial period of time. Note that, in rare cases, the incoming messages could be numerous and/or sizeable, so it may be costly to reject them. There is therefore a threshold value (number or size) of incoming messages that decides whether incoming communication is accepted or rejected. Within RHODOS, the network service can handle the rejection of these messages, hence the Migration Manager can decide on the fly to accept or reject the incoming communication.

[Figure 3: Synchronous Communication Suspension. Timeline from process and communication suspension to resumption of execution.]

After a process has been migrated, there must be some method of maintaining communication with it. The simplest method is to broadcast the result of a migration, so that each computer in the system knows where a migrant process is. However, this is inefficient, especially when the system is large. The hint fault method uses a hint to point to the location of a process. If a send to the process fails, then that process must have been migrated. When the send fails a fault occurs and the hint must be updated, generally by broadcasting to find the process. An extension to this is to keep the location of each process that migrates with the computer where the process was created, the origin computer. Thus, when a fault occurs, a message is sent to the origin computer to determine the current location of the migrant process. This is more efficient than broadcasting to locate the process. Thus, in RHODOS, we store the current location of each migrant process with its origin computer.
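The origin-based extension of the hint fault method can be sketched as follows. This is an illustrative model only: the Network class, its fields and the assumption that a sender can derive a process's origin computer from its identifier are hypothetical and are not taken from the RHODOS implementation.

```python
from typing import Dict, Set


class Network:
    """Hypothetical model of nodes, processes and origin location records."""

    def __init__(self) -> None:
        self.running: Dict[str, Set[str]] = {}        # node -> processes on it
        self.origin_of: Dict[str, str] = {}           # process -> origin node
        self.location_at_origin: Dict[str, str] = {}  # record kept by the origin
        self.origin_queries = 0

    def create(self, process: str, node: str) -> None:
        self.running.setdefault(node, set()).add(process)
        self.origin_of[process] = node
        self.location_at_origin[process] = node

    def migrate(self, process: str, src: str, dst: str) -> None:
        self.running[src].remove(process)
        self.running.setdefault(dst, set()).add(process)
        # Only the origin computer is told the new location; no broadcast.
        self.location_at_origin[process] = dst

    def send(self, hints: Dict[str, str], process: str) -> str:
        """Deliver using the sender's hint; on a fault, ask the origin."""
        node = hints.get(process, self.origin_of[process])
        if process not in self.running.get(node, set()):
            self.origin_queries += 1                  # hint fault
            node = self.location_at_origin[process]
            hints[process] = node                     # repair the stale hint
        return node


# Usage: p1 is created on A, then migrated twice before anyone sends to it.
net = Network()
net.create("p1", "A")
hints = {"p1": "A"}
net.migrate("p1", "A", "B")
net.migrate("p1", "B", "C")
first = net.send(hints, "p1")
second = net.send(hints, "p1")
```

The first send faults and costs a single query to the origin computer, regardless of how many times the process has moved; the second send uses the repaired hint and needs no query.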

3.6 Avoiding Residual Dependency and Providing Multiple, Transparent Migration

In Section 3.2, we emphasised how our design fulfilled our goals of high performance, efficiency, policy-mechanism separation and mechanism-mechanism separation. This leaves several design goals that have not yet been dealt with explicitly.

Residual dependencies are kept to a minimum in RHODOS by having the origin of the migrant process maintain its location in the system (to aid in communication with the migrant process). Other than this small amount of data (the process and its current location), no other data is kept at the origin, and no data is kept at any computer that has at some time migrated the process. Thus, at any given time a process will be known by both the computer the process was created on and the computer the process is actually running on (which may be the same). Note that this is not the case if the address space is transferred by the copy-on-reference technique.

Multiple migration has been implemented to ensure effective load balancing, e.g., when a user logs into his/her workstation which already has processes migrated to it, those processes can be migrated again to a new destination host. Team migration is handled by migrating the two or more processes in the team with some extra information indicating the resource(s) being shared. Transparency is achieved by virtue of the fact that RHODOS is a message passing based system, and the operating system itself maintains communication transparency. Finally, reliability in RHODOS is maintained by having the migration facility utilise a transaction based approach.

3.7 Transaction Based Process Migration

In RHODOS, each process has at least two ports for communication and at least four spaces which map the text, data and stacks to physical memory [De Paoli, et al 95]. Files in RHODOS can be memory-mapped. If a file is memory-mapped then it is linked (from disk to memory) via a space. This allows one single mechanism to handle migration of memory-mapped files and of the text, data and stack spaces.

Migrating a process in RHODOS involves migrating the process state, address space, communication state, and any other associated resources with the co-operation of the appropriate server, e.g., files [Panadiwal and Goscinski 94]. To perform reliable process migration in RHODOS, it is necessary to carry out the following transaction based sequence of operations:

- Send a message to transfer the process identification and to indicate which resources are about to be migrated. This message is considered the start of the migration transaction;
- Request the Process Manager to ensure the process is in a fit state to migrate (on the appropriate queue, not in a system call, etc.; see [De Paoli 93] for more details), then place the process on the frozen queue and transfer the process state to the destination;
- Request the appropriate servers to transfer the process resource details, e.g., File Server, Space Manager, IPCM, Device Manager. Note that when the IPCM receives this request, it will freeze the process's ports;
- At the destination: the Migration Manager receives the control information and creates state reflecting that a migration is underway. As each resource is transferred, the appropriate server notifies the Migration Manager. Once all resources have been received, the Migration Manager sends the result to the source Migration Manager;
- Finally: the source Migration Manager waits until the destination Migration Manager sends a reply with the result, then removes the redundant process. The destination then starts the migrated process. This message is considered to commit the transaction.

In RHODOS the Space Manager transfers the address space. The address space can be transferred with one of many transfer strategies (selectable for each migration without recompilation). The InterProcess Communication Manager (IPCM) [Joyce, et al 95] in RHODOS manages a process's ports and all remote messages. When a process's ports are suspended, all messages (local and remote) are forwarded to the IPCM, which queues them until they can be forwarded to the process. Provision for other strategies to deal with incoming messages during freeze time has been designed.

4 The Execution and Performance of Process Migration in RHODOS

The microkernel approach and the design decisions we made affect the execution of process migration and its performance. This section addresses these two issues.

4.1 Execution of Process Migration in RHODOS

Figure 4 shows how process migration in RHODOS is executed. Firstly, once the Migration Manager has been notified of which process to migrate where (at time t0), it contacts the Process Manager to request that the process state be transferred to the destination computer. This request also ensures that the process is capable of being migrated at the present time (i.e., it is on the ready queue, not in a system call, etc.). Once the Process Manager ascertains that the process is a valid migration candidate, it freezes the process, encapsulates the process state into a message and sends the message to the destination Process Manager. Then the Process Manager sends an acknowledgement to the Migration Manager to inform it that the process state has been sent (and, incidentally, that the process was in fact a valid candidate for migration).

[Figure 4: First Phase of Process Migration. Between times t0 and t1, the source Migration, Process, Space and IPC Managers transfer the process state, address space and ports/messages to the destination, and the state, space and ports acknowledgements are returned to the source Migration Manager.]

The Migration Manager then sends out requests to both the Space Manager and the IPC Manager to transfer the memory and communication details of the process, respectively.

Once the Space Manager receives the request, it encapsulates the memory of the process into a message and sends this to the destination Space Manager. Once this has been done, the Space Manager sends an acknowledgement to the Migration Manager to indicate that the process memory has been sent. Concurrently with the Space Manager's execution, the IPC Manager receives the request to transfer the communication details of the process. The IPC Manager freezes the ports, encapsulates the ports and any messages on them, and transfers these to the destination IPC Manager; an acknowledgement is then sent to the Migration Manager to indicate that the process communication details have been migrated. Once all these acknowledgements have been accepted (at time t1), the source Migration Manager knows that the process has been migrated to the destination. Note that, as the Space Manager and IPCM are contacted concurrently, and depending on the size of the process address space and communication details, the acknowledgements they send to the Migration Manager can be received in any order.

[Figure 5: Second Phase of Process Migration. Between times t1 and t2, the destination Migration Manager activates the process, spaces and ports and sends the commit migration message; the source Migration Manager then frees the state, spaces and ports.]

At time t1 there exists a copy of the migrant process on both source and destination. Thus, the Migration Managers on both computers must confer and commit the migration (or cancel it if problems have occurred); this is shown in Figure 5. The destination Migration Manager (at time t1) sends an acknowledgement to the source Migration Manager (utilising a reliable send offered by the RHODOS network protocol [Joyce, et al 95]). Once this message has been sent, the migration can be committed on both source and destination. For the destination this means sending a message to the local Process, Space and IPC Managers to activate the process resources. Once these messages are received and processed, the process will start executing on the destination node. Once the source Migration Manager receives the commit migration message, it sends three local messages to the Process, Space and IPC Managers to free all the migrated process resources. Upon the reception and processing of these messages the migrated process resources are removed, and the process no longer exists on the source computer.
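The two phases described above can be summarised in a small sketch. It is a simplified, sequential model under stated assumptions: the Node class, the RESOURCES tuple, the migrate function and its fail_on parameter are hypothetical names, and the concurrent Space and IPC transfers of the real facility are serialised here for clarity.

```python
from typing import Dict, List, Set

RESOURCES = ("process_state", "address_space", "ports")


class Node:
    """Hypothetical computer holding per-process resources."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.resources: Dict[str, Set[str]] = {}  # pid -> resources held
        self.active: Set[str] = set()             # pids currently executing


def migrate(pid: str, src: Node, dst: Node, fail_on: str = "") -> List[str]:
    """Run both phases and return a log of the steps taken.

    fail_on names a resource whose transfer fails, to show the abort path.
    """
    log = ["start"]
    src.active.discard(pid)               # freeze the process at the source
    acks: Set[str] = set()
    dst.resources[pid] = set()
    for res in RESOURCES:                 # phase 1 (t0 to t1): transfer, ack
        if res == fail_on:
            del dst.resources[pid]        # discard the partial copy
            src.active.add(pid)           # as if the process never migrated
            return log + ["abort"]
        dst.resources[pid].add(res)
        acks.add(res)
        log.append("ack:" + res)
    # Phase 2 (t1 to t2): both copies exist until the commit message.
    assert acks == set(RESOURCES)
    log.append("commit")
    dst.active.add(pid)                   # destination activates the process
    del src.resources[pid]                # source frees the redundant copy
    return log


# Usage: one successful migration and one that aborts during port transfer.
a, b = Node("A"), Node("B")
a.resources["p1"] = set(RESOURCES)
a.active.add("p1")
ok_log = migrate("p1", a, b)

c, d = Node("C"), Node("D")
c.resources["p2"] = set(RESOURCES)
c.active.add("p2")
failed_log = migrate("p2", c, d, fail_on="ports")
```

The abort path reflects the reliability goal from Section 2: when a transfer fails before the commit, the source copy is still intact and the process simply resumes there.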

4.2 Performance of Process Migration in RHODOS

When processes are migrated, one can take advantage of the fact that most migrations are to hosts on the same local area network (LAN) [De Paoli and Goscinski 95a]. The RHODOS process migration facility is capable of migrating within the one LAN with a reduced protocol stack, or to hosts on another, separate network by using a complete protocol stack.

A series of measurements were made to determine how long it took to migrate a process (from time t0 until the commit migration message is received). In this initial implementation of migration in RHODOS (on a SUN 3/50), migrating a process within the one LAN takes:

* vms * p milliseconds (EQ 1)

whereas migrating between two separate LANs takes:

* vms * p milliseconds (EQ 2)

where p = the number of extra ports (a process has two by default) and vms = virtual memory size (in Kilobytes). Empirically, the average time to migrate a 100 Kilobyte process on the same LAN was ms with a standard deviation of 0.5 ms. On two separate LANs it took, empirically, ms with a standard deviation of 0.3 ms to migrate a 100 Kilobyte process.

5 Related Work on Migration Facilities

This section details process migration on other operating systems, namely Amoeba, Chorus, Mach, Mosix and Sprite.

5.1 Amoeba

The Amoeba distributed operating system was designed as a distributed operating system of the 1990s [Tanenbaum et al. 90]. This impacts on process migration, e.g., Amoeba does not use virtual memory, hence process migration is limited to directly copying the process address space from source to destination. The implementation of process migration on the Amoeba system at the University of South Australia [Steketee, et al. 94], [Zhu, et al. 95] utilises a migration server which is run as a user mode process.
The steps to migrate a process are:

- Upon receipt of a migration request, the source migration server invokes get_info to obtain the migrant process's capability and set_owner to allow the migration server to stun the process, and then stuns the process (pro_stun).
- The source process server sends a process descriptor for the migrating process to the destination migration server. This contains the state of the process.
- The destination migration server uses this to set up a copy of the process on the destination host, by making requests to the destination process server to set up the kernel state and to allocate memory for the process segments.
- The destination migration server sends a series of RPC requests to the source process server, copying the memory from source to destination.
- The migration is completed by the passing of an execution token for the process from the source to the destination migration server in a message exchange between the two.

In this implementation (on a 40 MHz i80386 PC), to migrate a 100 Kilobyte process takes approximately 1.5 seconds [Zhu, et al. 95]. There is no breakdown of where the time is spent; however, two factors cause process migration to be slower than would be expected given that communication speed is one of Amoeba's largest assets (throughput of over 700 Kilobytes/sec) [Tanenbaum et al. 90]. Firstly, the migration server is a user process and as such has a low priority; coupled with a timeslice of 100 ms, this means the migration server can be left waiting for CPU time on a heavily loaded CPU. Secondly, the migration server enters the kernel three times (get_info, set_owner and pro_stun), each time giving up its timeslice. Note that get_info, set_owner and pro_stun can be performed from the destination migration server via RPCs to the process server on the source node, yielding faster migration times when the destination is lightly loaded [Zhu, et al. 95]. Note also that in [Zhu 95], Zhu states that the implementation has been incorporated into the kernel, which has improved performance; however, no performance figure has yet been given.

5.2 Chorus

Chorus is an object oriented, message passing based distributed operating system that utilises a microkernel. A process in Chorus is called an actor. An actor can have multiple threads of execution, and utilises ports and messages to allow processes to communicate with each other [Rozier et al. 92]. The Amadeus Project [O'Connor et al. 94] follows Chorus' object oriented flavour, hence process (actor) migration on Chorus is performed by three modules: kernel, transport and policy. The kernel module is responsible for encapsulating the state information of migrating processes and for re-establishing these processes on other nodes using this state [and] for providing load information to policy modules [Rozier et al. 92]. The policy module is the entity that (using several criteria) decides to migrate a process. Finally, the job of actually transferring the process is performed by the transport module.
To migrate a process in Chorus: The policy module, after certain criteria are met (with the information coming from the kernel module) decides to migrate a process, and informs the transport module of this decision; The transport module then: requests the kernel module to encapsulate the process state; ships this state information to the destination machine (address space transfer is performed by flushing the dirty pages to the segment mapper, allowing remote paging to be used, similar to Sprite, Mosix) requests the kernel module to re-establish the process. This implementation of process migration (on top of Chorus) modified the network manager so that a migrating process ports are marked as migrating. All messages sent to a port result in a location request for that port being performed. When a process is migrating, these requests result in the requesting node being informed that the process is migrating. The requesting node reissues this request until either the port has finished migrating or a timeout occurs. The terminal driver was also rewritten to use Chorus IPC primitives. Thus, all terminal traffic is handled as part of the communication migration, instead of requiring a mechanism to sever and reattach to a terminal, whilst maintaining all terminal traffic is not lost. 5.3 Mach Mach is a message passing based distributed operating system that utilises a microker- 11

nel. In Mach, the unit of execution is called a task, and each task can have multiple threads of execution. [Milojicic et al. 93] and [Milojicic 94a] describe Mach's task migration facility in detail. The facility relies on Mach NORMA (NO Remote Memory Access), which provides transparent network IPC, distributed shared memory and a distributed capability space. Task migration was implemented first in user space (Simple Migration Server - SMS) and then in-kernel (Optimised Migration Server - OMS). The user space implementation is favoured because in-kernel task migration does not perform much better than the user space implementation, and because the user space implementation fits the design philosophy of Mach better [Milojicic 94a], p. 61; it is also simpler and more robust than the in-kernel implementation [Milojicic 94a], p. 56.

To perform task migration the SMS server:
suspends the task and aborts the threads to clear the kernel state;
interposes the task/thread kernel ports;
transfers the address space;
transfers the thread state;
transfers the capabilities (NORMA does the actual port transfer);
transfers the other task/thread state;
interposes back the task/thread kernel ports (at the destination site); and
resumes the task.

Communication transparency is preserved during freeze time by the use of the interposition ports. Because SMS interposes its own ports on the task's kernel ports, all messages normally sent to the migrating task are instead sent to SMS. The address space is transferred via NORMA (or by several user space transfer strategies if OMS is used). NORMA also handles the capability migration.

Inherent in Mach's task migration are two points to consider:
Using NORMA to transparently handle address space migration (which utilises Copy-On-Reference) means there is a severe executional residual dependency.
Mach utilises a personality emulator (e.g. Unix), so that a task passes personality specific data to the emulator. Thus, migrated processes also have an extra executional residual dependency.

5.4 Mosix

Mosix is a distributed operating system based on Unix. The kernel has been split into three layers: a machine dependent layer, a machine independent layer, and a linker between them. This structure allows user processes to run in a site-independent manner. Processes in Mosix have the same structure as in UNIX; however, because Mosix supports migration, there are a few implications. The areas affected include remote paging, locating other processes and interprocess communication [Barak et al. 93]. The changes to these areas include: using a site-independent reference to pages; extra fields and flags in the process table to allow load balancing and migration; and utilising a home node structure to find migrated processes.

To migrate a process in Mosix:
The source node allocates a process frame on the remote node and initialises the process memory regions, passing along the process u area;
Dirty pages are transferred to the destination (other pages are demand-paged from the process executable file);
Finally a commit message is sent, which sets the remaining state of the process and restarts the process.

5.5 Sprite

Sprite is a kernel (as opposed to micro-kernel) based system. Sprite's interface resembles that of Unix; however, the implementation of its kernel is completely different [Douglis and Ousterhout 87]. Migrating a process in Sprite is carried out in the following way [Douglis and Ousterhout 91]:
An RPC is sent to the destination to confirm that migration is allowed;
The process is interrupted;
The state of the process is transferred;
The virtual address space is transferred by flushing any dirty pages to a shared file server;
File descriptors and the current working directory are transferred;
An RPC is sent to conclude the migration.
Once this is done, the process can run at its new destination, paging in memory on demand.

6 Performance Summary

The design of the RHODOS migration facility is a novel solution: it has been a part of the design of the whole distributed operating system right from the beginning and as such is an integral part of that design. Most projects create an operating system and then add process migration, rather than integrating its requirements into their initial design. The design of the RHODOS migration facility follows the client-server paradigm. Thus, this facility is a separate entity from other kernel servers, such as Process, InterProcess Communication and Space. For these reasons it is necessary to carry out initial performance comparisons with other current distributed operating systems that support process migration, to establish the soundness and feasibility of our approach. The following section details the type of process environment on each system, and the steps the migration facilities take.
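The port interposition that Mach's SMS uses to preserve communication transparency, described above, can be illustrated with a toy model. This is a minimal sketch under our own assumptions: the names `Task`, `InterposingMigrationServer`, `freeze`, `deliver` and `resume` are hypothetical and do not correspond to Mach or NORMA interfaces. It only shows the idea that messages sent during freeze time are held by the migration server and forwarded, in order, once the task resumes at the destination.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a migrating task (not a Mach structure)."""
    name: str
    host: str
    inbox: list = field(default_factory=list)


class InterposingMigrationServer:
    """Toy migration server that holds messages for frozen tasks."""

    def __init__(self):
        # task name -> messages received while that task is frozen
        self._held = {}

    def freeze(self, task):
        # Interpose: from now on, messages for this task are held here.
        self._held[task.name] = deque()

    def deliver(self, task, message):
        # Senders call this without knowing whether the task is migrating.
        if task.name in self._held:
            self._held[task.name].append(message)  # frozen: hold the message
        else:
            task.inbox.append(message)  # normal delivery

    def resume(self, task, destination):
        # Interpose back at the destination, then forward held messages in order.
        task.host = destination
        task.inbox.extend(self._held.pop(task.name))


server = InterposingMigrationServer()
task = Task("t1", "hostA")
server.deliver(task, "m1")   # delivered directly
server.freeze(task)
server.deliver(task, "m2")   # held during freeze time
server.deliver(task, "m3")   # held during freeze time
server.resume(task, "hostB")
server.deliver(task, "m4")   # delivered directly at the new host
print(task.host, task.inbox)
```

In this model no message is lost or reordered across the migration, which is the property the interposition ports are meant to provide; a real implementation must additionally handle the transfer of the ports themselves.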
In Chorus, migrating a process (on a network of Micro Vax-IIs) takes [O'Connor et al. 94]:
*α + 120*β milliseconds
where α is the number of kilobytes to be flushed at the source node of the migration, and β is the number of these pages later referenced by the actor.

Migrating a task in the Mach operating system (on a 33 MHz i80486 PC) takes [Milojicic 94a]:
SMS time *n *rr + 5.5*sr + 5.5*sor + 58*t milliseconds
OMS time *n + 7.9*rr + 1.9*sr + 1.1*sor + 5.4*t milliseconds
where rr is the number of receive capabilities, sr is the number of send capabilities, sor is the number of send-once capabilities, n is the number of regions, and t is the number of threads [Milojicic 94b].

Migrating a process in MOS [Barak and Shiloh 85] (on PDP-11s connected by a 10 Mbit ring) takes approximately 5.4 ms / Kilobyte.
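As a minimal sketch, linear cost formulas of this kind can be evaluated with a small helper. The function name `linear_cost_ms` and the example capability and thread counts are our own illustrative assumptions. Only coefficients that appear in the text are used, so the OMS example covers the capability and thread terms alone and is not a full migration time.

```python
def linear_cost_ms(per_unit_ms, counts, fixed_ms=0.0):
    """Evaluate fixed_ms + sum(coefficient * count), in milliseconds.

    Hypothetical helper for illustration; not part of any surveyed system.
    """
    return fixed_ms + sum(per_unit_ms[name] * counts[name] for name in per_unit_ms)


# MOS: approximately 5.4 ms per kilobyte [Barak and Shiloh 85].
mos_100k = linear_cost_ms({"kb": 5.4}, {"kb": 100})

# Mach OMS: capability and thread terms only (7.9*rr + 1.9*sr + 1.1*sor + 5.4*t).
# The counts below are made-up example values, not measurements.
oms_terms = {"rr": 7.9, "sr": 1.9, "sor": 1.1, "t": 5.4}
oms_partial = linear_cost_ms(oms_terms, {"rr": 2, "sr": 10, "sor": 1, "t": 1})

print(f"MOS, 100K process: {mos_100k:.0f} ms")
print(f"OMS capability/thread terms: {oms_partial:.1f} ms")
```

For a 100K process the MOS figure evaluates to 5.4 * 100 = 540 ms.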

In Sprite, migrating a process (on a Sun 3) takes [Douglis and Ousterhout 91]:
*f *fs *vms milliseconds
where f is the number of open files, fs is the file size in kilobytes, and vms is the virtual memory size in kilobytes.

The performance results of Amoeba, Chorus, Mach, Mosix, RHODOS and Sprite are summarised in Table 6.1. The dominant factor in process migration performance is how the address space of the migrant process is transferred, and as such it is listed in the table. For each system the network hardware is 10 Mbit Ethernet, whilst the computer hardware is given in the table. A breakdown of the time it takes to migrate a process is listed (where available). The time a process spends frozen (from the start of the migration to the first instant when the process continues execution on the new host) is recorded in the column entitled freeze time for 100K process. If either flush to disk or copy on reference is used, then as the process executes on the destination host, it must retrieve the pages of its address space as required. Such processes therefore wait again (are frozen) while each page is retrieved. Finally, the time spent frozen if the migrated process touches all of its memory once it has been migrated is listed in the column entitled freeze time for worst case (100K). Note that direct comparison between the times is inappropriate, since the systems were measured on different computer hardware.

Table 6.1 Comparison of migration for different systems

Distributed Operating System | How address space is copied | Hardware (i) | Time to migrate (in milliseconds) | Freeze time for 100K process | Freeze time for worst case (100K)
Amoeba | direct copy | 386 PC | N/A | 1.5 sec | 1.5 sec
Chorus | flushed to disk | Micro Vax-IIs | *v + 120*β ms | 1 s | 12.2 sec
Mach | copy on reference | 486 PC | *n *rr + 5.5*sr + 5.5*sor + 58*t | 500 ms | 881 ms (ii)
Mosix (iii) | direct copy | PDP | * v | 540 ms | 540 ms
RHODOS | direct copy | Sun 3/ | *p *v *p *v | ms ms | ms ms
Sprite (iii) | flushed to disk | Sun 3/ | *f *fs *v | 285 ms | 480 ms (iv)

i. All hardware is connected by a 10 Mbit Ethernet LAN.
ii. Based on paging time of ms [Milojicic 94a], p. 76.
iii. More recent results are published, however on different network hardware to the other systems.
iv. Based on paging time of 15 ms [Douglis and Ousterhout 87].

Key: v = virtual memory (in KB), β = the number of (512 byte) pages later referenced, n = number of regions, rr = receive capabilities, sr = send capabilities, sor = send once capabilities, t = threads, p = number of ports, f = number of open files, fs = file size (in KB), and p = the number of extra ports (a RHODOS process has two by default).
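The relationship between the two freeze-time columns can be sketched as follows. When the address space is fetched lazily (flush to disk or copy on reference), the worst case adds one paging delay for every page of the process; with direct copy, nothing remains to be fetched. The function names and the 8 KB page size are our assumptions for illustration. The 15 ms paging time is the one given in footnote iv.

```python
import math


def pages_touched(size_kb, page_kb):
    """Number of pages a process of size_kb occupies, rounded up."""
    return math.ceil(size_kb / page_kb)


def worst_case_freeze_ms(initial_freeze_ms, size_kb, page_kb, paging_ms, lazy=True):
    """Initial freeze time plus one paging delay per page when paging is lazy.

    Hypothetical model: assumes every page is touched exactly once after
    migration and that each page fault costs a constant paging_ms.
    """
    if not lazy:
        # Direct copy: the whole address space arrived during the freeze.
        return initial_freeze_ms
    return initial_freeze_ms + pages_touched(size_kb, page_kb) * paging_ms


# Sprite row: 285 ms initial freeze, 15 ms paging time (footnote iv),
# and an assumed 8 KB page size for a 100K process.
sprite = worst_case_freeze_ms(285, size_kb=100, page_kb=8, paging_ms=15)

# Mosix row: direct copy, so the worst case equals the initial freeze time.
mosix = worst_case_freeze_ms(540, size_kb=100, page_kb=8, paging_ms=0, lazy=False)

print(sprite, mosix)
```

Under the assumed 8 KB page size a 100K process occupies 13 pages, so the Sprite estimate is 285 + 13 * 15 = 480 ms, which agrees with the worst-case entry for Sprite in Table 6.1.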

7 Conclusion

Process migration is an integral part of the RHODOS system: since the beginning of the design of the whole system, the needs and impact of process migration have been taken into account. Because of this, process migration was easy to implement and is well insulated from the rest of the kernel, unlike earlier implementations [Theimer et al. 85], [Douglis and Ousterhout 87]. In fact, even microkernel based systems such as Mach and Chorus required changes to support process migration [Milojicic 94a], [O'Connor et al. 94]. From the inception of RHODOS, we have incorporated process migration into its design, yielding an efficient, reliable and secure migration facility.

If RHODOS is to increase the overall system throughput by utilising load balancing and parallel execution, then the process migration facility must be fast. Even in this early implementation, our design compares favourably with other current systems. This justifies our initial attempts, and spurs us on to complete the implementation of our design. Hence, we will continue to optimise both local and remote IPC. Of the strategies to be implemented, the most important decision is how to migrate the address space. We will therefore finish implementing flush to disk and copy on reference, and then undertake full performance testing of each strategy to determine when each should be used. Using this information, we can build an adaptive migration facility that performs at its best under any system load [De Paoli, et al 95].

References

[Ahuja et al. 86] S. Ahuja, N. Carriero and D. Gelernter. Linda and Friends. IEEE Computer, October.
[Artsy and Finkel 89] Y. Artsy and R. Finkel. Designing a Process Migration Facility: The Charlotte Experience. IEEE Computer, September.
[Bal et al. 89] H. Bal, J. Steiner and A. Tanenbaum. Programming Languages for Distributed Computing Systems. ACM Computing Surveys, September.
[Barak and Shiloh 85] A. Barak and A. Shiloh. A Distributed Load-balancing Policy for a Multicomputer. Software - Practice and Experience, Vol. 15(9), September.
[Barak et al. 93] A. Barak, S. Guday, R. Wheeler. The MOSIX Distributed Operating System. Load Balancing for UNIX. Springer-Verlag.
[De Paoli 93] D. De Paoli. The Multiple Strategy Process Migration for RHODOS: The Logical Design. Technical Report TR C93/37, Deakin University, Australia.
[De Paoli and Goscinski 95a] D. De Paoli and A. Goscinski. The Influence of Domain and Interdomain Process Migration on the Performance of Parallel Execution on Distributed Systems. Proceedings of the International Conference on Parallel and Real Time Systems, Perth, Australia, September (in print).
[De Paoli, et al 95] D. De Paoli, A. Goscinski, M. Hobbs and G. Wickham. The RHODOS Microkernel, Kernel Servers and Their Cooperation. IEEE First International Conference on Algorithms And Architectures for Parallel Processing, Brisbane, April.
[Douglis and Ousterhout 87] F. Douglis, J. Ousterhout. Process Migration in the Sprite Operating System. Proceedings of the 7th International Conference on Distributed Computing Systems, Berlin, September.
[Douglis and Ousterhout 91] F. Douglis, J. Ousterhout. Transparent Process Migration: Design Alternatives and the Sprite Implementation. Software - Practice and Experience, 21:
[Gerrity et al. 91] G. W. Gerrity, A. Goscinski, J. Indulska, W. Toomey and W. Zhu. Can We Study Design Issues of Distributed Operating Systems in a Generalized Way? SEDMS II Symposium on Experiences with Distributed and Multiprocessor Systems, March.
[Goscinski 91] A. Goscinski. Distributed Operating Systems. The Logical Design. Addison-Wesley.
[Goscinski and Zhou 94] A. Goscinski, W. Zhou. Towards a Global Computer: Improving the Overall Distributed System Performance and the Computational Services Provided to Users by Employing Global Scheduling and Parallel Execution. Proposal: ARC Research Grant
[Hobbs, et al. 95] M. Hobbs, G. Wickham, D. De Paoli and A. Goscinski. Memory Spaces for the RHODOS Multi-threaded Microkernel Systems. Proceedings of the International Conference on Automation, Indore, India, December 1995 (in print).
[Joyce, et al 95] P. Joyce, D. De Paoli, A. Goscinski, and M. Hobbs. Implementation and Performance of the Interprocess Communications Facility in RHODOS. Proceedings of the International Conference on Networks, Singapore, October.
[Jul et al. 88] E. Jul, H. Levy, N. Hutchinson and A. Black. Fine-grained Mobility in the Emerald System. ACM Transactions on Computer Systems, 6(1): , February.
[Milojicic et al. 93] D. Milojicic, W. Zint, A. Dangel, and P. Giese. Task Migration on the top of the Mach Microkernel. Proceedings of the third USENIX Mach Symposium, April.
[Milojicic 94a] D. Milojicic. Load Distribution. Implementation for the Mach Microkernel. Vieweg.
[Milojicic 94b] D. Milojicic. Private Communication. October
[O'Connor et al. 94] M. O'Connor, B. Tangney, V. Cahill and N. Harris. Micro-kernel Support for Migration. Technical Report TCD-CS Trinity College, Dublin, Ireland.
[Panadiwal and Goscinski 94] R. Panadiwal and A. Goscinski. A highly recoverable, reliable and efficient transaction oriented file service for distributed environment. Proceedings of IEEE Region 10's Ninth Annual International Conference on: Frontiers of Computer Technology, Singapore, August.
[Powell and Miller 83] M. Powell and B. Miller. Process Migration in DEMOS/MP. ACM Operating System Review, October.
[Rozier et al. 92] M. Rozier, V. Abrossimov, F. Armand, M. Gien, M. Guillemont, F. Hermann and C. Kaiser. Chorus (Overview of the Chorus Distributed Operating System). USENIX Workshop on Micro-Kernels and Other Kernel Architectures, April.
[Steketee, et al. 94] C. Steketee, W. Zhu and P. Moseley. Implementation of Process Migration in Amoeba. Proceedings of the 14th Conference on Distributed Computing Systems, Poland, June.
[Smith 88] J. Smith. A Survey of Process Migration Mechanisms. ACM Operating Systems Review, 22(3):28-40, July.
[Tanenbaum et al. 90] A. Tanenbaum, R. van Renesse, H. van Staveren, G. Sharp, S. Mullender, J. Jansen and G. van Rossum. Experiences with the Amoeba Distributed Operating System. Communications of the ACM, Vol. 33, No. 12, December.
[Theimer et al. 85] M. Theimer, K. Lantz, and D. Cheriton. Preemptable Remote Execution Facilities for the V-System. Proceedings of the 10th ACM Symposium on Operating Systems Principles, December.
[Trottenberg 93] U. Trottenberg. Are Multiworkstations Replacing Supercomputers or Massively Parallel Systems? Questions, Facts and a Result. GDM D Spiegel The Journal of the German National Research Center for Computer Science (GMD).
[Zayas 87] E. Zayas. Attacking the Process Migration Bottleneck. Proceedings of the 11th ACM Symposium on Operating Systems Principles, November.
[Zhu, et al. 90] W. Zhu, A. Goscinski and G.W. Gerrity. Process Migration in RHODOS. Technical Report CS90/9, UNSW, Australia, March.
[Zhu, et al. 95] W. Zhu, C. Steketee and B. Muilwijk. Load Balancing and Workstation Autonomy on Amoeba. Australian Computer Science Communications, Vol. 17, No. 1, February.
[Zhu 95] W. Zhu. Personal Communication, June.


More information

Module 4: Processes. Process Concept Process Scheduling Operation on Processes Cooperating Processes Interprocess Communication

Module 4: Processes. Process Concept Process Scheduling Operation on Processes Cooperating Processes Interprocess Communication Module 4: Processes Process Concept Process Scheduling Operation on Processes Cooperating Processes Interprocess Communication Operating System Concepts 4.1 Process Concept An operating system executes

More information

IT 540 Operating Systems ECE519 Advanced Operating Systems

IT 540 Operating Systems ECE519 Advanced Operating Systems IT 540 Operating Systems ECE519 Advanced Operating Systems Prof. Dr. Hasan Hüseyin BALIK (3 rd Week) (Advanced) Operating Systems 3. Process Description and Control 3. Outline What Is a Process? Process

More information

Module 4: Processes. Process Concept Process Scheduling Operation on Processes Cooperating Processes Interprocess Communication

Module 4: Processes. Process Concept Process Scheduling Operation on Processes Cooperating Processes Interprocess Communication Module 4: Processes Process Concept Process Scheduling Operation on Processes Cooperating Processes Interprocess Communication 4.1 Process Concept An operating system executes a variety of programs: Batch

More information

Lightweight Remote Procedure Call

Lightweight Remote Procedure Call Lightweight Remote Procedure Call Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, Henry M. Levy ACM Transactions Vol. 8, No. 1, February 1990, pp. 37-55 presented by Ian Dees for PSU CS533, Jonathan

More information

Previous Work in Distributed Operating Systems. NOW Retreat. Kim Keeton, Steve Rodrigues, and Drew Roselli. April 3, 1995

Previous Work in Distributed Operating Systems. NOW Retreat. Kim Keeton, Steve Rodrigues, and Drew Roselli. April 3, 1995 Previous Work in Distributed Operating Systems NOW Retreat Kim Keeton, Steve Rodrigues, and Drew Roselli April 3, 1995 Remote Ex. Parallel Jobs Design Compatibility Fault Tol. RX TR Mi PJ JC GS DR IR DC

More information
