A test bed for distributed real-time scheduling experimentation based on the CHORUS micro-kernel

Joelle DELACROIX**, Christian SANTELLANI*

** CNAM, Département Informatique, 292 rue Saint Martin, 75141 Paris Cedex 03
Tel : 40 27 28 81
E-Mail : delacroi@cnam.cnam.fr
* CNAM, Département d'informatique

KEYWORDS : real-time distributed scheduling, micro-kernel, Chorus system, Earliest Deadline algorithm, load sharing, overload, stability

1. Introduction

Real-time systems are playing an increasingly vital role in today's society. Such systems include, for example, manufacturing, aerospace, robotics and military systems. In these real-time applications, the computing system must control a controlled system and keep it in a safe state, so its operations have to be performed within a given maximum time. Serious damage can result from a missed deadline, so exceeding a given delay (the critical delay) is often classed as a system breakdown. When scheduling real-time tasks, the critical delay has to be taken into account first; the classical scheduling performance criteria such as CPU utilization, throughput, fair allocation or response time are only examined second.

Time constraints have repercussions on the computing systems, which have to be robust and predictable. The operating system must therefore include services dedicated to time management, task management and interrupt management, which are handled by a real-time kernel. The basic properties of these services, which are listed by standards like POSIX or SCEPTRE [1], are fast execution and predictability, so that the execution times of the primitives can be bounded. Real-time kernel performance is therefore often given in terms of context switching time or interrupt latency. However, how tasks are scheduled is the most important criterion for ensuring that time constraints are met.

Instead of the classical time-sharing scheduling algorithm of multiprogramming systems, real-time kernels provide a preemptive priority scheduling algorithm which ensures that, at time t, the processor is allocated to the task which has the highest priority. The priority of a task is a simple arbitrary integer value, and it has to reflect both the task deadline and the task importance. However, task urgency and task importance can of course mismatch. The deadline describes only the urgency of a task, not its importance, and the two criteria unfortunately do not always agree. Some tasks of low importance may have early deadlines and some tasks of high importance later ones. So using only an arbitrary integer priority to schedule the tasks of a real-time application is quite simple, but it is not suitable for ensuring that the time constraints are respected.

Real-time kernels therefore have to be modified and improved so that the time constraints of the tasks are met. To do that, we decided to add to the real-time kernel three fundamental concepts which were put forward by researchers long ago :
 - a deadline criterion and an importance criterion, kept separate from each other ;
 - the load sharing concept.

Adding the deadline concept to the micro-kernel scheduler should ensure that time constraints are better met, because the deadline of a task really shows its urgency. Adding the importance criterion and the load sharing concept to the micro-kernel scheduler should ensure better handling of overload situations. When a node experiences a transient overload, local task execution cancellations made according to the importance criterion, together with task migrations among the nodes, should help the overloaded node to return to a normal load situation. These concepts are added inside the real-time micro-kernel of the Chorus system.
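The mismatch between a single integer priority and the deadline can be illustrated with a minimal sketch. It is not taken from the Chorus scheduler: the `Task` fields, the two selection functions and the convention that a larger integer means a higher priority are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int    # assumed convention: larger value = higher priority
    deadline: float  # absolute deadline, in arbitrary time units


def pick_by_priority(ready):
    """Fixed-priority choice: the ready task with the highest priority."""
    return max(ready, key=lambda t: t.priority)


def pick_by_deadline(ready):
    """Earliest Deadline choice: the ready task with the nearest deadline."""
    return min(ready, key=lambda t: t.deadline)


# Hypothetical task set: the important task is not the urgent one.
ready = [
    Task("logging", priority=1, deadline=5.0),
    Task("control", priority=9, deadline=20.0),
]
by_priority = pick_by_priority(ready).name
by_deadline = pick_by_deadline(ready).name
```

With this task set, the priority-based choice runs "control" first while the deadline-based choice runs "logging" first, which is why the two criteria are kept separate in the rest of the paper.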
Since the 70s, a lot of research has been done on deadline scheduling and on the load sharing concept. Algorithms such as the Rate Monotonic scheduling algorithm, the Earliest Deadline scheduling algorithm [2] and the flexible algorithm [3] have been defined. More recently, an importance criterion has been defined so that, on a node, overload situations can be resolved by cancelling task executions. This importance criterion can be either a simple integer managed by the Régisseur entity [7][8][9] or a time-dependent function [5][6]. However, much of this research is only theoretical and only a little of it has been implemented : the experiments made with the MACH kernel [10] and with the SPRING system [11] can be mentioned.

2. Earliest Deadline scheduling algorithm integration

The deadline concept [16][17][18] has been added by integrating the Earliest Deadline scheduling algorithm inside the Chorus micro-kernel. Although it is not the most popular scheduling algorithm, it has been chosen instead of the Rate Monotonic algorithm for the following four reasons :
 - Its scheduling criterion is based on the critical delay, a temporal parameter which defines the urgency of a task well.
 - The critical delay is defined for periodic tasks and for aperiodic tasks too, so these two kinds of tasks can be scheduled together. With the Rate Monotonic scheduling algorithm, by contrast, aperiodic tasks must either be served by a dedicated server task [13] or be directly incorporated in the set of periodic tasks, by assigning the aperiodic task a period equal to its probable minimum request inter-arrival time [12].
 - It has a low cost.
 - It is optimal when no overload occurs.

3. Importance criterion integration and load sharing integration

The integration [19] of the importance concept relies on work already done on the Régisseur entity. This integration, together with the integration of the load sharing concept, completes the addition of the Earliest Deadline algorithm to the Chorus micro-kernel. It provides a usable mechanism for managing overloads. Indeed, although the Earliest Deadline algorithm is optimal when there is no overload, its behaviour is unspecified when an overload occurs (i.e. when a task runs longer than expected or when, for instance, a lot of aperiodic tasks are released). As a consequence, some tasks that are very important for the system may miss their deadlines.

The mechanism added to the Chorus micro-kernel resolves an overload on a node as follows : each time a task is released, a guarantee function checks the schedulability of the set of ready tasks according to the Earliest Deadline scheduling algorithm. The schedulability test simply computes the laxity L of each ready request, and an overload situation occurs when at least one of the ready requests has a negative laxity value.
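The guarantee test can be sketched as follows. The paper does not give the laxity formula, so this sketch assumes a common definition : with requests served in Earliest Deadline order, the laxity of a request is its absolute deadline minus the time at which it would complete. The `Request` fields and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    deadline: float   # absolute deadline
    remaining: float  # remaining execution time


def laxities(ready, now):
    """Laxity of each ready request under Earliest Deadline order.

    Assumed definition: laxity = deadline - completion time, where the
    completion time accumulates the remaining execution times of all
    requests with an earlier or equal deadline.
    """
    result = {}
    finish = now
    for req in sorted(ready, key=lambda r: r.deadline):
        finish += req.remaining
        result[req.name] = req.deadline - finish
    return result


def is_overloaded(ready, now):
    """Overload: at least one ready request has a negative laxity."""
    return any(lax < 0 for lax in laxities(ready, now).values())


# Hypothetical ready set at time 0.
ready = [
    Request("A", deadline=4.0, remaining=2.0),
    Request("B", deadline=5.0, remaining=2.0),
    Request("C", deadline=6.0, remaining=3.0),
]
lax = laxities(ready, now=0.0)
overload = is_overloaded(ready, now=0.0)
```

In this example A completes at time 2 and B at time 4, both before their deadlines, but C would complete at time 7 against a deadline of 6, so its laxity is -1 and the node is considered overloaded.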
The Régisseur entity then resolves an overload situation in two complementary ways :
 - the Régisseur cancels requests of the ready task set. These requests are either periodic ones or requests which have already begun their execution. Requests are selected one by one, in strictly increasing order of importance.
 - the Régisseur migrates requests. These requests are aperiodic ones which have not started yet. The load sharing algorithm which was implemented is Stankovic's flexible algorithm [3]. This algorithm was chosen because it is a fully distributed and symmetrical algorithm and because it has all the features of a distributed scheduler.
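The two complementary actions can be sketched as a single loop. This is only an illustration under stated assumptions, not the Régisseur implementation : it assumes that requests are removed in increasing order of importance until no ready request has a negative laxity, and that each removed request is migrated if it is aperiodic and not yet started, and cancelled otherwise. The `Request` fields, the laxity definition and the function names are hypothetical, and the actual migration (the flexible algorithm) is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    deadline: float   # absolute deadline
    remaining: float  # remaining execution time
    importance: int   # larger value = more important (assumed convention)
    periodic: bool
    started: bool


def overloaded(ready, now):
    """True if some request completes after its deadline in deadline order."""
    finish = now
    for req in sorted(ready, key=lambda r: r.deadline):
        finish += req.remaining
        if req.deadline - finish < 0:
            return True
    return False


def resolve_overload(ready, now):
    """Remove the least important requests until the overload disappears.

    Returns (kept, cancelled, migrated). Aperiodic requests that have not
    started are handed to migration; all other victims are cancelled.
    """
    kept = list(ready)
    cancelled, migrated = [], []
    for victim in sorted(ready, key=lambda r: r.importance):
        if not overloaded(kept, now):
            break
        kept.remove(victim)
        if victim.periodic or victim.started:
            cancelled.append(victim)
        else:
            migrated.append(victim)
    return kept, cancelled, migrated


# Hypothetical overloaded ready set at time 0.
ready = [
    Request("A", 4.0, 2.0, importance=5, periodic=True, started=True),
    Request("B", 5.0, 2.0, importance=1, periodic=False, started=False),
    Request("C", 6.0, 3.0, importance=3, periodic=True, started=False),
]
kept, cancelled, migrated = resolve_overload(ready, now=0.0)
```

Here B is the least important request; removing it is enough to give A and C non-negative laxities, and since B is aperiodic and has not started, it is selected for migration and nothing is cancelled.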

We now describe our test bed for distributed real-time scheduling experimentation based on the Chorus micro-kernel. Its architecture is divided into three layers : the distributed layer, the scheduling layer and the application layer.

4. General architecture of the test bed

4.1. The distributed layer

This layer is at the lowest level of the architecture. It gathers several mechanisms :
 - a real-time communication protocol which guarantees transmission delays and allows us to build a global clock ;
 - a task migration mechanism which allows us to have a truly distributed system ;
 - a mechanism for routing messages between the different tasks : since tasks can be migrated, they communicate only via messages, which have to be routed properly in the distributed system.

4.2. The scheduling layer

This layer is above the distributed layer. It implements the scheduling algorithm which, briefly, is divided into three mechanisms :
 - the local scheduler : it is based on the Earliest Deadline algorithm and it integrates the different features which allow periodic and aperiodic task management ;
 - the Régisseur : it checks the schedulability of the ready tasks and resolves overloads either by cancelling tasks or by migrating tasks. Overload resolution is carried out according to task importance values ;
 - the function which computes the load of a node : when a task is to be migrated, these load values are used by the Régisseur to choose the node to which the task will be sent. The load of a node is computed by periodically evaluating the idle time of the processor.

4.3. The application layer

This layer is above the scheduling layer. It will implement an application with several levels of time constraints and significant resource allocation problems.
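Returning to the load function of the scheduling layer (section 4.2), the idea of deriving a load value from the processor idle time can be sketched as below. The formula (the busy fraction of an observation window), the node names and the "least loaded remote node" rule are assumptions made for illustration; in particular, the destination choice shown here is a simplification and not Stankovic's flexible algorithm.

```python
def node_load(idle_time, window):
    """Load of a node over one observation window.

    Assumed definition: the fraction of the window during which the
    processor was busy, i.e. 1 - idle_time / window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return 1.0 - idle_time / window


def choose_destination(loads, local_node):
    """Pick the least loaded node other than the local one.

    `loads` maps node names to their last known load values.
    Returns None when no remote node is known.
    """
    candidates = {n: l for n, l in loads.items() if n != local_node}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical measurements: idle time observed over a window of 100 units.
loads = {
    "node1": node_load(idle_time=5.0, window=100.0),
    "node2": node_load(idle_time=60.0, window=100.0),
    "node3": node_load(idle_time=30.0, window=100.0),
}
target = choose_destination(loads, local_node="node1")
```

With these values node2 was idle for the largest share of the window, so it is the node selected to receive a request migrated from node1.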

5. Conclusion

Computing systems must control a controlled system and keep it in a safe state, so their operations have to be performed within a given maximum time. Serious damage can result from a missed deadline, so exceeding a given delay (the critical delay) is often classed as a system breakdown. When scheduling real-time tasks, the critical delay has to be taken into account first. However, real-time micro-kernels only provide preemptive static priority schedulers, which are not suitable for meeting task time constraints. We therefore decided to add to the Chorus micro-kernel three basic concepts which were put forward by researchers long ago : the deadline criterion, the importance criterion and the load sharing concept. The deadline criterion has been added by integrating the Earliest Deadline scheduling algorithm in the Chorus micro-kernel. The importance criterion and the load sharing concept should help an overloaded node to return to a normal load, either by task cancellations or by task migrations. Task cancellations and task migrations are managed by the Régisseur entity.

REFERENCES

[1] "SCEPTRE : proposition de noyau normalisé pour les exécutifs temps réel", TSI, Vol 3-1, 1984.
[2] C.L. LIU, J.W. LAYLAND, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment", Journal of ACM, Vol 20, no. 1, January 1973, p 46-61.
[3] J.A. STANKOVIC, K. RAMAMRITHAM, S. CHENG, "Evaluation of a flexible task scheduling algorithm for distributed hard real-time systems", IEEE Transactions on Computers, Vol C-34, no. 12, December 1985, p 1130-1143.
[4] K.G. SHIN, Y.H. CHANG, "Load Sharing in distributed real-time systems with state-change broadcasts", IEEE Transactions on Computers, Vol 38, no. 8, August 1989, p 1124-1142.
[5] G. KOREN, D. SHASHA, "D-OVER : An optimal on-line scheduling algorithm for overloaded real-time systems", INRIA Rocquencourt technical report, no. 138, February 1992, 45 pages.
[6] E.D. JENSEN, C.D. LOCKE, H. TOKUDA, "A Time-Driven Scheduling Model for Real-Time Operating Systems", Proceedings of 1985 IEEE Real-Time Systems Symposium, p 112-122.
[7] J. DELACROIX, "Un contrôleur d'ordonnancement temps réel pour la stabilité de Earliest Deadline en surcharge : le Régisseur", doctoral thesis, Université Pierre et Marie Curie, computer science, January 1994, 217 pages.
[8] J. DELACROIX, "Un contrôleur d'ordonnancement temps réel : le Régisseur", RTS 94 conference proceedings, 11-14 January 1994, Paris, p 85-98.
[9] J. DELACROIX, "Stabilité et Régisseur d'ordonnancement en temps réel", TSI, Vol 13, no. 2, 1994.
[10] H. TOKUDA et al, "Real-Time Mach : Towards a predictable Real-Time System", Proceedings of USENIX Mach Workshop, October 1990.
[11] J.A. STANKOVIC, K. RAMAMRITHAM, "The SPRING Kernel : a new paradigm for real-time operating systems", Operating Systems Review, ACM, Vol 29-3, July 1989, p 54-71.
[12] E. NASSOR et al, "Hard Real-Time Sporadic Task Scheduling for Fixed Priority Schedulers", International Workshop on Responsive Computer Systems (Office of Naval Research / INRIA), Golfe Juan, France, 1991.
[13] B. SPRUNT et al, "Aperiodic Task Scheduling for Hard-Real-Time Systems", Journal of Real-Time Systems, Vol 1, 1989, p 27-60.
[16] O. METAIS, "Implantation d'un ordonnanceur à échéance au sein du micro-noyau Chorus", Cnam engineering dissertation in computer science, March 1994.
[17] O. GAULTIER, "Le système réparti Chorus. Implantation d'un ordonnanceur à échéance au sein du micro-noyau Chorus", Cnam engineering dissertation in computer science, March 1994.
[18] J. DELACROIX, O. GAULTIER, O. METAIS, "Un ordonnanceur à échéance au sein du micro-noyau Chorus", submitted to TSI.
[19] C. SANTELLANI, "Synthèse sur l'ordonnancement temps réel réparti et implantation d'un ordonnanceur temps réel réparti au sein du micro-noyau Chorus", DEA internship report, Systèmes Informatiques, Université Pierre et Marie Curie, June 1994.