A novel task scheduling algorithm based on dynamic critical path and effective duplication for pervasive computing environment


WIRELESS COMMUNICATIONS AND MOBILE COMPUTING
Wirel. Commun. Mob. Comput. 2010; 10: Published online 12 December 2008 in Wiley Online Library (wileyonlinelibrary.com).717

Junzhou Luo, Fang Dong, Jiuxin Cao and Aibo Song
School of Computer Science and Engineering, Southeast University, Nanjing, P.R. China

Summary

To utilize massive heterogeneous resources effectively and provide transparent computing capability to upper-layer applications, task scheduling becomes a key issue in pervasive computing systems. Previously proposed priority- and duplication-based task scheduling algorithms that can be applied in pervasive computing environments usually have the following limitations: the critical path cannot be calculated accurately because the effect of resource availability is neglected during scheduling; and, in the duplication-based resource allocation stage, unrestricted duplication can have negative effects on the final schedule length (SL). To solve these problems, this paper presents a novel task scheduling algorithm based on dynamic critical path (DCP) and effective duplication, called DCPED. DCPED introduces a more accurate DCP calculation method that takes resource availability into account. In addition, an effective task duplication strategy is proposed that eliminates ineffective duplications and optimizes the schedule by using a space compression technique and a dynamic critical path length (DCPL) based evaluation technique, respectively. Finally, simulation results show that DCPED significantly outperforms previous algorithms in the NSL and speedup rate metrics. It is especially effective at utilizing computing resources and scheduling fine-grain, large-scale workflow applications in pervasive computing systems. Copyright 2008 John Wiley & Sons, Ltd.
KEY WORDS: pervasive computing; task scheduling; dynamic critical path; effective task duplication

1. Introduction

With the spread of the Internet, many critical resources and computing devices, such as handheld and wearable computers, wireless LANs, and devices that sense and control appliances, are increasingly shared over networks. Meanwhile, as computing applications penetrate everyday life, people expect to use information services anytime and anywhere without impediment. Pervasive computing has emerged accordingly. The concept of pervasive computing is to provide transparent computing ability to users wherever and whenever they wish, thus improving human experience and quality of life without explicit awareness of the underlying communications and computing technologies [1].

Correspondence to: Fang Dong, School of Computer Science and Engineering, Southeast University (Si Pai Lou 2#), Nanjing, P.R. China. fdong@seu.edu.cn

Several pervasive computing projects have emerged at major universities, including project Aura [2] at Carnegie Mellon University, Endeavour [3] at the University of California at Berkeley (UC Berkeley) and Oxygen [4] at the Massachusetts Institute of Technology (MIT). Although each of these projects addresses a different mixture of issues in pervasive computing, their common objective is to provide transparent computing capability so that different people can execute complex tasks without concern for the underlying technologies.

The architecture of pervasive computing can be divided mainly into two layers: the interaction layer and the computing layer. The interaction layer realizes the interaction between users and the pervasive environment, obtains relevant contexts and transmits users' requests from pervasive devices (PDA, sensor, etc.) to the lower computing platform. The computing layer provides transparent computing capability to the upper layer in order to execute users' jobs, or to respond and adjust the pervasive system according to the changing context. It is therefore very important to organize the massive computing resources located in a large-scale heterogeneous environment so as to provide uniform, transparent computing ability to users, and the task scheduling strategy is accordingly of great importance in a pervasive computing system.

In most cases, a pervasive computing application (or job) consists of several atomic computation sub-tasks constrained by certain processes (e.g. sequential order or parallel order). For example, in a pervasive learning system, a recommendation mechanism is necessary to provide appropriate education resources to users from massive education resources [5], and the main process is described in Figure 1.
From Figure 1 we can deduce that pervasive applications (or jobs) can generally be denoted as a workflow (or task-flow) represented by a directed acyclic graph (DAG), in which nodes represent application tasks and edges represent inter-task data dependencies. The objective of task scheduling is to map these tasks onto different heterogeneous resources and order their executions so that precedence relations between tasks are satisfied and a minimum overall completion time is obtained. However, unlike in traditional parallel or distributed computing environments, computing resources in a pervasive environment are located in a wide area network, and the transmission media and communication modes between them vary; thus not only the computing capabilities but also the bandwidths of these computing resources are heterogeneous. Most traditional task scheduling strategies assume a homogeneous environment in which every node has the same computing capability, and communication bandwidths between nodes are neglected or treated homogeneously. Thus, at present, there is no sophisticated technique that achieves the goal of task scheduling well in a large-scale pervasive computing environment. How to achieve effective task scheduling therefore becomes a key challenge in pervasive computing environments, and is closely linked to system performance.

In a broad sense, the scheduling problem exists in two forms in pervasive computing systems: compile-time scheduling and run-time scheduling. In compile-time scheduling, task profiling and analytical benchmarking tools play an important role in providing the right estimation of costs on tasks and links prior to scheduling [7]. Sometimes the performance estimate cannot be obtained exactly because of the dynamic character of resources in a pervasive computing environment.
By contrast, run-time scheduling is more likely to make realistic decisions by considering the most up-to-date schedule information. On the other hand, this approach neglects the application graph's structure and always chooses the most appropriate solution for a single task, and it is usually undermined by the unavoidable overheads of iterative real-time information retrieval and multi-step task scheduling and assignment, so the whole application may not be scheduled optimally. Therefore, compile-time scheduling is considered in our work, as it not only avoids the run-time scheduling overheads but also exploits the overall structure information of applications effectively. Hence it can afford to use sophisticated scheduling heuristics to generate much better scheduling results. In addition, to address inexact estimation, the compile-time scheduling technique can be used to generate a good initial schedule, and rescheduling techniques can then be used to improve it at run time to adapt to the dynamic character of pervasive computing systems. In this paper, we do not focus on the rescheduling technique.

Fig. 1. The main process of resource recommendation.

Because of its importance, the compile-time scheduling problem has been extensively studied and many heuristic algorithms have been proposed in the literature [6–22]. These approaches are mainly priority-based scheduling [6,8–10] and duplication-based scheduling [7,11–19]. Priority-based scheduling is a two-step approach. In the first step, priorities are assigned to tasks and the task with the highest priority is chosen for scheduling. In the second step, the most suitable resource, usually the one giving the earliest finish time, is selected to execute this task. Duplication-based techniques, on the other hand, are found to be quite effective for message-passing systems [18].
The main idea of task duplication is to duplicate the more critical tasks redundantly on some resources in order to minimize communication costs among dependent tasks and further reduce the schedule length (SL). In most cases, the combination of task duplication and priority-based scheduling clearly outperforms other approaches [7,11–19]; the duplication phase can be used as an additional part of the second stage of a priority-based algorithm.

In recent years, although several priority- and duplication-based heterogeneous scheduling algorithms have been proposed (such as HLD [7], LDBS [11] and HCPFD [14]), several problems remain. In the task selection stage, to address the problem that the critical path may change dynamically during scheduling, Kwok et al. [10] used the dynamic critical path (DCP) mechanism to select tasks for scheduling in an environment with an unbounded number of homogeneous resources. But in a pervasive computing environment, the resources are heterogeneous and their number is bounded. Thus, the critical path may also be affected by the availability of resources. Unfortunately, previous research has paid little attention to this issue, so if these algorithms are applied directly to pervasive computing systems, the critical path cannot be obtained accurately during scheduling.

In the duplication-based resource allocation stage, unrestricted duplication can have negative effects on the final SL. Previous duplication-based algorithms neglected the trade-off between the benefit of reducing inter-task communication costs and the cost of occupying the resource's available time, which produces many ineffective duplications. Therefore, in order to reduce the overall SL, some duplications should be restricted.
In Reference [17], by restricting the duplication conditions, the SD algorithm avoids ineffective task duplications to some extent. However, it is only applicable to homogeneous environments, and its simple strategy may not only hinder further optimization but also have negative effects on the scheduling of subsequent tasks.

In this paper, to overcome these problems, a novel task scheduling algorithm based on the dynamic critical path and effective duplication (DCPED) is proposed. In the task selection stage of this algorithm, a more accurate critical path calculation method is introduced, in which the traditional tlevel and blevel [6] (the sum of mean computation and communication costs along the longest directed path from the task concerned to the entry task and to the exit task, respectively) are extended to account for the effect of resource availability. A task selection strategy that recursively traverses the unscheduled predecessors is also presented. In the duplication-based resource allocation stage, all ineffective duplications are divided into two types: forbidden duplication and redundant duplication. A new duplication strategy is then proposed to eliminate these ineffective duplications by using a space compression technique and a dynamic critical path length (DCPL) based evaluation technique, respectively; meanwhile, as many effective duplications as possible are made in order to minimize the final schedule length.

The main contributions of our work are:
1. Introduce a novel DCP calculation method that takes the effect of resource available time into account.

2. Present a task selection strategy that recursively traverses the unscheduled predecessors.
3. Divide all ineffective duplications into two types, forbidden duplication and redundant duplication, and propose a new duplication strategy that eliminates them and thereby leads to many more effective duplications.

The remainder of this paper is organized as follows: in the next section, we provide a taxonomy of task scheduling algorithms and the related work. In Section 3, we describe in detail the schedule model and the related terminology. Section 4 elaborates and analyses our novel heterogeneous task scheduling algorithm, DCPED. Section 5 presents the simulation results and performance analysis. The summary of the research and the direction of future work are given in Section 6.

2. Related Work

The best-known work on scheduling precedence-constrained task graphs is based on simple yet effective priority- and duplication-based heuristics such as heterogeneous earliest finish time (HEFT) [6], the dynamic critical path algorithm (DCP) [10], heterogeneous critical parents with fast duplicator (HCPFD) [14], selective duplication (SD) [17] and levelized duplication based scheduling (LDBS) [11]. Of these, the first two are purely priority-based heuristics and the remaining three combine priority and duplication.

The main difference between priority-based heuristics lies in their strategies for selecting the candidate task for scheduling. Some algorithms select the candidate task with the largest blevel, while others attempt to select the candidate task by following the critical path. More specifically, the HEFT algorithm ($O(pt^2)$, where $p$ denotes the number of resources and $t$ the number of tasks in the DAG) assigns priorities to tasks on the basis of their blevel values, and the task with the highest priority is scheduled on the processor that completes its execution earliest.
But this approach does not consider the critical path of the DAG and therefore cannot obtain the proper priority. DCP ($O(t^3)$), a dynamic priority scheduling heuristic for homogeneous systems, selects the candidate task by recalculating the current DCP at each scheduling step, so a more accurate priority can be obtained for each task. But, as mentioned above, this approach neglects resource availability and therefore still cannot obtain accurate task priorities.

Among the heuristics that combine priority and duplication, the main difference lies in their strategies for duplicating tasks. To reduce the start time of tasks, some algorithms duplicate only the critical immediate predecessor, while others attempt to duplicate all immediate predecessors. More specifically, the HCPFD algorithm ($O(pt^2)$) assigns priorities to tasks on the basis of the static critical path, and only the task's critical immediate predecessor is considered for duplication. Therefore, although it has relatively low complexity, it cannot make effective duplications. The SD algorithm ($O(pt^2 d_{max} + e)$, where $d_{max}$ is the maximum number of immediate predecessors of a task) also assigns priorities on the basis of the static critical path. In its duplication-based resource allocation stage, a duplication condition is proposed to eliminate ineffective duplication: a task is duplicated only if $\mathrm{ECT}(T_{can})$ can be improved. However, this simple approach cannot eliminate ineffective duplications completely, and multi-level predecessors are not considered for duplication. In addition, another algorithm, HLD [7], is an extension of SD for heterogeneous environments. LDBS ($O(p^3 e t^3)$) is a dynamic priority- and duplication-based scheduling heuristic in which tasks are scheduled level by level starting from the top. In the task selection stage, the candidate task with the earliest start time is selected.
In the duplication-based resource allocation stage, all immediate predecessors are considered for duplication on each resource in order to minimize the candidate task's completion time. However, because of the levelized approach adopted by both of the LDBS algorithms, the priority is localized to a particular precedence level, which may not reflect the true priority for scheduling a task. Moreover, its duplication strategy is not very effective yet is more complex. The brief characteristics of these well-known scheduling algorithms are summarized in Figure 2.

Fig. 2. Some priority and duplication-based scheduling algorithms and their characteristics.

From this analysis we conclude that, in the task selection process, the correct priority of tasks cannot be obtained because the availability of heterogeneous resources is neglected when calculating the DAG's critical path; and, in the duplication process, only immediate predecessors are considered for duplication while the elimination of ineffective duplications is ignored. Consequently, there is a strong need for an algorithm that can accurately calculate the DAG's critical path to select the proper candidate task, and that exploits the benefits of duplication while eliminating ineffective duplications.

3. Scheduling Problem Formulation

In general, the scheduling system model in a pervasive computing environment consists of a large number of application programs submitted by users and a target computing resource environment. Many important pervasive computing applications fall into the category of workflow (or task-flow) applications, for example Gauss elimination [6] and mean value analysis [10]. Instead of being a single large component that does all the work, a workflow application consists of several interacting tasks that need to be executed in a certain partial order for the application as a whole to execute successfully [22].

In most cases, an application can be represented by a DAG, as a triple $G = (T, R, C)$, in which vertices represent tasks and edges represent inter-task data dependencies. Here $T$ is the set of $t$ partitioned tasks; $R$ represents a partial order on the task set $T$ such that if $T_i \, R \, T_j$, then task $T_i$ must complete its execution before $T_j$ can start; and $C$ is a $t \times t$ matrix of communication data, where $C_{i,j}$ is the amount of data that must be transmitted from task $T_i$ to $T_j$ if these two tasks are assigned to different resources. Without loss of generality, we assume that a DAG has only one entry task (without any parents) and one exit task (without any successors). Figure 3 gives the task graph structures of several simple workflow applications.

The target computing resource environment consists of a set $P$ of $p$ heterogeneous resources connected in a fully connected topology in which all inter-resource communications are assumed to be performed without contention.
$B$ is a $p \times p$ matrix of bandwidths, where $B_{m,n}$ is the bandwidth between $P_m$ and $P_n$. Communication links between resources are assumed to be contention free and to allow concurrent computation and communication. Further, the communication overhead between two tasks scheduled on the same resource can be ignored. The objective of scheduling is to allocate each task in the task graph to a resource and assign it a start time so that the SL, or makespan, is minimized. It can be defined as Equation (6). In addition, the necessary parameters used in the algorithm are described mathematically in Figure 4.

Fig. 3. The task graphs structure of different workflow applications.

Fig. 4. Mathematical description of parameters.

4. Dynamic Critical Path and Effective Duplication-based Task Scheduling Algorithm

Like every priority- and duplication-based scheduling algorithm, DCPED consists of two stages: the task selection stage and the task duplication-based resource allocation stage. In the first stage of DCPED, a novel and much more accurate DCP calculation method is proposed. It takes the available time of resources into account and extends the traditional tlevel and blevel. At each scheduling step, the current DCPL is calculated and the candidate task ($T_{can}$) is identified by a recursion-based selection strategy. In the second stage, all ineffective duplications are divided into two kinds: forbidden duplication and redundant duplication. These two kinds of ineffective duplication are eliminated, while as many effective duplications as possible are made, by using the space compression and DCPL evaluation techniques. Then $T_{can}$ is assigned to the resource with the minimum DCPL in order to obtain a better final SL.

4.1. Task Selection

In the DAG scheduling problem, the critical path length [10] potentially determines the SL. Therefore, in order to minimize the final SL, tasks located on the critical path should be assigned higher priority. But during scheduling, the task graph's critical path may change dynamically. In a homogeneous system, the main reason is that the communication cost between two tasks can be neglected when they are scheduled onto the same resource. To solve this problem, Kwok et al. [10] used the DCP mechanism to select tasks for scheduling in an environment with an unbounded number of homogeneous resources. In Reference [10], after each task has been scheduled, the current DCP is recalculated and the candidate task is then selected under a certain strategy. But in a pervasive computing system, an environment with a bounded number of heterogeneous resources, the critical path may change dynamically for two more reasons:
1. As scheduling proceeds, the mean values of unscheduled tasks' communication and execution costs, which were used to calculate the former critical path, are gradually replaced by determinate values.
2. The earliest start time of certain tasks might be restricted by the availability of resources, and as scheduling proceeds, the workload assigned to a given resource may gradually increase.
Thus, in order to minimize the final SL, the DCP should be calculated by a new method that captures the dynamic changes during the scheduling process. The earliest unscheduled task located on the current DCP should then be selected as the preference task ($T_{pre}$) to be considered for scheduling at the current step. In order to calculate the DCP and identify $T_{pre}$, the length of the DCP via $T_i$ is first defined as follows:

Definition 1. The length of the DCP via $T_i$, denoted $\mathrm{DCPL}(T_i)$, is the sum of $T_i$'s anticipated start time (AS) and anticipated remnant execution time (AR), where AR denotes the anticipated execution time from this task to $T_{exit}$.

According to the scheduling attributes, the current DCP passes through either the ready task set ($R$) or the partial ready task set ($PR$). Thus, based on Definition 1, the current global DCPL can be defined as

$$\mathrm{DCPL} = \max_{T_i \in R \cup PR} \{\mathrm{DCPL}(T_i)\}.$$

Furthermore, $T_{pre}$ can be identified as the task $T_i$ that belongs to either $R$ or $PR$ and has the maximum $\mathrm{DCPL}(T_i)$. In previous algorithms, AS and AR are usually represented by $\mathrm{Tlevel}(T_i)$ and $\mathrm{Blevel}(T_i)$, respectively. However, the calculation of Tlevel neglects the current resource available time.
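Definition 1 and the identification of the preference task can be illustrated with a minimal Python sketch. The function name `preference_task` and the dictionaries of AS and AR values are hypothetical and not part of the paper; the AS and AR values are simply taken as inputs here, and the numbers are illustrative only.

```python
def preference_task(anticipated_start, anticipated_remnant, ready, partial_ready):
    """Return (T_pre, global DCPL) over the ready and partial ready task sets.

    DCPL(T_i) = AS(T_i) + AR(T_i), following Definition 1. The AS and AR
    values are assumed to be supplied by the caller.
    """
    # Sorted so that ties are broken deterministically (an assumption of this sketch).
    candidates = sorted(set(ready) | set(partial_ready))
    if not candidates:
        raise ValueError("no ready or partial ready task")
    dcpl = {t: anticipated_start[t] + anticipated_remnant[t] for t in candidates}
    t_pre = max(candidates, key=dcpl.__getitem__)
    return t_pre, dcpl[t_pre]


# Illustrative values only.
AS = {"B": 4.0, "C": 6.0, "D": 9.0}
AR = {"B": 10.0, "C": 11.0, "D": 5.0}
t_pre, dcpl = preference_task(AS, AR, ready={"B", "C"}, partial_ready={"D"})
```

With these values the path through C is the longest (6 + 11), so C becomes the preference task even though D has the latest anticipated start time.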

Fig. 5. Deviation between $\mathrm{Tlevel}(T_i)$ and $\mathrm{EST}(T_i)$.

Fig. 6. The calculation of HD-Blevel.

As shown in Figure 5, since there are several workloads on a given resource, the actual start time of $T_i$ might be delayed from a to b. Hence, $\mathrm{AST}(T_i)$ will be much later than the anticipated data arrival time of $T_i$'s CP, denoted $\mathrm{Tlevel}(T_i)$. To take resource availability into consideration, for each ready task $T_i$ the AS is denoted by the mean value of $\mathrm{EST}(T_i)$ over all resources, $\mathrm{MEST}(T_i)$, defined as follows:

$$\mathrm{MEST}(T_i) = \frac{1}{|P|}\bigl(\mathrm{EST}(T_i, P_1) + \mathrm{EST}(T_i, P_2) + \cdots + \mathrm{EST}(T_i, P_{|P|})\bigr) = \frac{1}{|P|}\sum_{m=1}^{|P|} \mathrm{EST}(T_i, P_m) \qquad (7)$$

Extending this to the partial ready task set, the dynamic anticipated start time of each ready and partial ready task in the heterogeneous environment, called $\text{HD-Tlevel}(T_i)$, is defined.

Definition 2. $\text{HD-Tlevel}(T_i)$ is the extension of Tlevel that takes resource available time into account. It can be calculated recursively by traversing the task graph upward from $T_i$. The recursion equation is as follows:

$$\text{HD-Tlevel}(T_i) = \begin{cases} \max\Bigl\{ \max\limits_{T_s \in pred(T_i)} \{\mathrm{AFT}(T_s) + C_{s,i}\},\ \max\limits_{T_x \in pred(T_i)} \{\text{HD-Tlevel}(T_x) + \mathrm{ETC}(T_x) + C_{x,i}\} \Bigr\}, & T_x \in US \ \&\ T_s \in S \\ \mathrm{MEST}(T_i), & T_i \in R \end{cases} \qquad (8)$$

where $US$ denotes the unscheduled task set, $S$ the scheduled task set and $R$ the ready task set.

On the other hand, the resource available time may also affect the anticipated remnant (AR) execution time. In previous algorithms, the AR of $T_i$ is usually denoted by $\mathrm{Blevel}(T_i)$, and the blevel value is assumed not to change until $T_i$ is scheduled. However, because the number of resources is bounded and a task may be assigned to a schedule hole for execution, the AR may increase. As shown in Figure 6(1), when $T_i$ is scheduled on $P_m$, the AR of $T_i$ may not increase. But in Figure 6(2), when $T_i$ is scheduled on $P_m$, the resource available time after $T_i$ is occupied by $T_1$ and $T_2$, so the AR value may increase. Thus, this paper introduces the dynamic AR execution time of $T_i$ on $P_m$ for heterogeneous systems, called $\text{HD-Blevel}(T_i, P_m)$.

Definition 3. $\text{HD-Blevel}(T_i, P_m)$ denotes the dynamic AR execution time for heterogeneous systems. It is an extension of blevel that takes the effect of resource available time into consideration.

As it is hard to estimate the exact remnant execution time of $T_i$, only an approximate method is proposed in this paper. The calculation rule for a ready task's HD-Blevel is defined as follows:

Rule 1 (the calculation of HD-Blevel). In the situation of Figure 6(2), assume that the initial value of $\text{HD-Blevel}(T_i, P_m)$ is $\mathrm{Blevel}(T_i)$. When $\mathrm{AST}(T_1) - \mathrm{AST}(T_i) \geq \mathrm{Blevel}(T_i)$, the AR of $T_i$ will not change and $\text{HD-Blevel}(T_i, P_m)$ is equal to $\mathrm{Blevel}(T_i)$. When $\mathrm{AST}(T_1) - \mathrm{AST}(T_i) < \mathrm{Blevel}(T_i)$, $T_1$ should be hypothetically migrated upwards as shown in Figure 6(3), and the HD-Blevel should be updated as $\text{HD-Blevel}(T_i, P_m) = \text{HD-Blevel}(T_i, P_m) + \mathrm{ETC}(T_1)$. The new available time slot, denoted $\mathrm{AST}(T_2) - (\mathrm{AFT}(T_1) - \mathrm{AST}(T_2) + \mathrm{AST}(T_i))$, is then checked to see whether it is larger than $\mathrm{Blevel}(T_i)$. This process is repeated until the available time of the new schedule hole is larger than $\mathrm{Blevel}(T_i)$ or the last task on this resource has been hypothetically migrated. The HD-Blevel of ready tasks is then obtained, and the HD-Blevel of partial-ready tasks can be obtained in the same way.

As each task can be assigned to a different resource, the current dynamic AR execution time of $T_i$ is denoted as follows:

$$\text{MHD-Blevel}(T_i) = \frac{1}{|P|}\sum_{P_m \in P} \text{HD-Blevel}(T_i, P_m) \qquad (9)$$

Therefore, the current overall DCPL of the task graph can be denoted as

$$\mathrm{DCPL}_{hete} = \max_{T_i \in R \cup PR} \{\text{HD-Tlevel}(T_i) + \text{MHD-Blevel}(T_i)\} \qquad (10)$$

The current $T_{pre}$ is identified as the task with the maximum DCPL value. However, as $T_{pre}$ may have some unscheduled predecessors, it cannot be selected as the candidate task for scheduling in that situation. Previous algorithms usually select the unscheduled predecessor of $T_{pre}$ with the maximum blevel value as $T_{can}$. But this approach cannot guarantee that $T_{pre}$ is scheduled as early as possible. In this paper, a task selection strategy based on recursively traversing the unscheduled predecessors is presented. The details are described in Figure 7.

An example of the task selection strategy is shown in Figure 8(1). Assume that the current global DCP is denoted by the bold line. G is the current $T_{pre}$, and D, E and F are the unscheduled predecessors of $T_{pre}$.
First, calculate the DCPL of $G_{sub(1)}$, with A and G as the entry task and exit task respectively, as in Figure 8(2). The new critical path of $G_{sub(1)}$ is {A → C → D → G}, so D is the current $T_{pre}$, denoted $T_{pre(1)}$. But D also has an unscheduled predecessor, so the recursive process is executed. As shown in Figure 8(3), the new critical path of $G_{sub(2)}$, with A and D as the entry task and exit task respectively, is {A → E → D}, and E is denoted $T_{pre(2)}$. As E has no unscheduled predecessor, it is identified as $T_{can}$ at the current scheduling step. The overall recursive process is shown by the dashed line in Figure 8(1).

Fig. 7. The procedure of task selection algorithm.

Fig. 8. The illustration of the task selection algorithm.
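The recursive descent in the example above can be sketched in Python as follows. This is not the procedure of Figure 7, which is not reproduced here; it is a simplified illustration under stated assumptions. The functions `hd_tlevel` and `select_candidate`, the dictionary-based graph encoding and all numeric values are hypothetical. `hd_tlevel` follows the two cases of Equation (8), and the sketch assumes that the preference task of each sub-graph is the unscheduled immediate predecessor lying on the longest path into the sub-graph's exit task.

```python
def hd_tlevel(task, preds, scheduled, aft, mest, etc, comm):
    """HD-Tlevel of an unscheduled task, following the two cases of Eq. (8)."""
    unscheduled = [p for p in preds.get(task, ()) if p not in scheduled]
    if not unscheduled:
        # Ready task (T_i in R): use the mean earliest start time.
        return mest[task]
    from_scheduled = [aft[s] + comm[(s, task)]
                      for s in preds[task] if s in scheduled]
    from_unscheduled = [hd_tlevel(x, preds, scheduled, aft, mest, etc, comm)
                        + etc[x] + comm[(x, task)] for x in unscheduled]
    return max(from_scheduled + from_unscheduled)


def select_candidate(t_pre, preds, scheduled, aft, mest, etc, comm):
    """Descend from T_pre until a task with no unscheduled predecessor is
    found; that task is returned as T_can."""
    unscheduled = [p for p in preds.get(t_pre, ()) if p not in scheduled]
    if not unscheduled:
        return t_pre

    # Assumption of this sketch: the next preference task is the unscheduled
    # immediate predecessor on the longest path into t_pre.
    def path_length(x):
        return (hd_tlevel(x, preds, scheduled, aft, mest, etc, comm)
                + etc[x] + comm[(x, t_pre)])

    next_pre = max(sorted(unscheduled), key=path_length)
    return select_candidate(next_pre, preds, scheduled, aft, mest, etc, comm)


# Illustrative graph loosely modelled on the example: A and C are scheduled.
preds = {"C": ["A"], "E": ["A"], "F": ["A"], "D": ["C", "E"], "G": ["D", "F"]}
scheduled = {"A", "C"}
aft = {"A": 2.0, "C": 6.0}                           # actual finish times
mest = {"E": 3.0, "F": 3.0}                          # MEST of the ready tasks
etc = {"D": 3.0, "E": 4.0, "F": 2.0, "G": 1.0}       # mean execution costs
comm = {("C", "D"): 1.0, ("E", "D"): 1.0, ("D", "G"): 1.0, ("F", "G"): 1.0}

candidate = select_candidate("G", preds, scheduled, aft, mest, etc, comm)
```

Starting from G, the sketch first moves to D (the longer of the two paths into G), then to E, which has no unscheduled predecessor and is therefore returned, mirroring the G, D, E sequence of the example.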

Theorem 1. Assume that, in the task scheduling process, using the DCP-based task scheduling method leads to the optimal schedule. Then selecting one of $T_{pre}$'s unscheduled predecessors as the current $T_{can}$ by our task selection strategy allows $T_{pre}$ to be scheduled as early as possible.

Proof. According to the task selection strategy, when all recursive calls are done (assume the recursion is called $s$ times), the final $T_{can}$ is equal to $T_{pre(s)}$, the preference task in the $s$-th sub-graph $G_{sub(s)}$. By the assumption, selecting $T_{pre(s)}$ as $T_{can}$ allows the exit task of $G_{sub(s)}$ (denoted $T_{pre(s-1)}$) to be scheduled at the earliest time. For the same reason, it also allows the exit task of $G_{sub(s-1)}$ (denoted $T_{pre(s-2)}$) to be scheduled at the earliest time. By analogy, it allows the exit task of $G_{sub(1)}$ (denoted $T_{pre(0)}$) to be scheduled at the earliest time. As $T_{pre(0)}$ is equal to $T_{pre}$, the proof is complete.

4.2. Task Duplication-based Resource Allocation

At this stage, the resource allocation process is based on the task duplication technique. Unlike in traditional distributed computing environments, the inter-resource communication overhead cannot be omitted in a pervasive computing environment and is the major hurdle to effective execution of parallel applications. This overhead can cause a serious penalty, especially in pervasive computing systems where network bandwidths are considerably lower than in traditional settings. Since the communication cost between tasks assigned to the same resource is negligible, task duplication is a relatively effective approach that uses idle time slots to reduce inter-task communication.
Therefore, the main procedure of the duplication-based resource allocation stage can be defined as assigning the candidate task ($T_{can}$) to the most suitable resource by using the task duplication strategy to obtain the best performance (generally the earliest completion time). In the task duplication process, $\mathrm{EST}(T_{can})$ is restricted by the arrival time of data from its immediate predecessors. So, in order to obtain the optimal schedule by reducing inter-task communication cost, all predecessors of the candidate task must be considered for duplication in descending order of data arrival time [17]. Although duplication can reduce inter-task communication cost, it also uses up the available time of a resource. Thus, duplicating without any restriction may occupy resource time that will be needed for executing subsequent tasks, and may lead to a bad schedule. Consequently, in order to determine which tasks should be duplicated and which should not, the effectiveness of duplication should be taken into consideration. Unfortunately, previous work has paid very little attention to this issue.

In this paper, we divide the ineffective duplications that may have a negative effect on the final SL into two kinds: forbidden duplication, which cannot reduce $T_{can}$'s completion time, and redundant duplication, which may delay the final SL even if the local completion time is reduced. Consequently, to obtain better schedule performance, tasks need to be duplicated effectively as much as possible, on the premise that these two kinds of ineffective duplication are eliminated. Thus, we propose a two-stage effective task duplication strategy to eliminate both forbidden duplication and redundant duplication. The first stage is local optimization effective duplication.
Based on eliminating forbidden duplication, the local task duplication performance can be improved further by using the space compression technique. The second stage is global optimization effective duplication. On the basis of the local optimization stage, redundant duplication can be eliminated by using a DCP-based effective duplication checking standard, and the final SL can be minimized to the largest extent.

Local optimization-based effective duplication

In order to eliminate forbidden duplication, the SD algorithm [17] adopted a width-first duplication strategy and stated the basic effective duplication condition: only a duplication that improves EST(T_can) is performed. This approach can eliminate forbidden duplication to some extent. However, the performance of SD cannot be guaranteed because its strategy is overly simplified. The relevant drawbacks are illustrated in Figure 9. The schedule result without duplication is depicted in Figure 9(2); T_1 is the critical predecessor (CP) of T_can, and we assume that DAT(T_1, P_m) > DAT(T_2, P_m) > DAT(T_3, P_m). The width-first duplication result is shown in Figure 9(3).

Fig. 9. (1) The topology of task graph, (2) schedule without duplication, (3) using width first strategy to duplicate, (4) consider effective multi-level predecessors duplication, (5) using space compression strategy to duplicate.

After duplicating T_1 onto P_m, EST(T_can, P_m) is restricted by the completion time of T_1 (because ACT(T_1, P_m) > DAT(T_2, P_m)). Under this condition, duplicating T_2 cannot reduce EST(T_can, P_m). Therefore, according to the previous strategy, the duplication process ends. However, Figure 9(4) indicates that if we duplicate T_4, the critical predecessor of T_1, after duplicating T_1 onto P_m, ACT(T_1, P_m) can be reduced to g. Then there is a large enough slot to duplicate T_2, and EST(T_can, P_m) can be further reduced to c. Consequently, multi-level predecessor duplication should be taken into consideration on the premise of satisfying the basic effective duplication condition. The corresponding rule is defined as follows:

Rule 2 (effective multi-level predecessors duplication). Assume that after an immediate predecessor of T_can (denoted as T_s) has been duplicated on P_m, EST(T_can, P_m) is determined by the finish time of T_s on P_m, namely EST(T_can, P_m) = AFT(T_s, P_m). At this time, the CP of T_s should be considered for duplication. This process is repeated until the k-th CP cannot satisfy the duplication conditions or the entry task has been duplicated.

However, Figure 9(4) reveals another problem: as the available time of P_m has been partitioned into several segments, there might not be enough contiguous available time to duplicate the next candidate predecessor (here T_3). Therefore, a space compression-based strategy is proposed to combine several small available time slots into a bigger one.
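The slot-splicing idea can be sketched as a cascade of shifts on the timeline of one resource. The sketch below is our own simplification, not the authors' implementation: `compress_for_duplicate` and its parameters are hypothetical names, busy intervals stand for already duplicated predecessors on P_m, only intervals that start later than the EST of the new duplicate are moved, each is moved no further than needed, and the attempt fails if the last interval would finish after EST(T_can, P_m).

```python
def compress_for_duplicate(busy, est_d, etc_d, est_can):
    """Try to open a slot of length `etc_d` for a new duplicate T_d.

    busy:    sorted, non-overlapping (start, finish) intervals of the
             predecessors already duplicated on the resource.
    est_d:   earliest time T_d could start on the resource.
    etc_d:   execution time of T_d on the resource.
    est_can: current EST of the candidate task; nothing may finish later.

    Returns (slot_start, new_busy) on success, or None if no slot of the
    required size can be formed without passing `est_can`.
    """
    fixed = [iv for iv in busy if iv[0] <= est_d]     # never moved
    movable = [iv for iv in busy if iv[0] > est_d]    # may shift later

    # T_d starts once its data is ready and the fixed tasks are done.
    slot_start = max([est_d] + [finish for _, finish in fixed])
    cursor = slot_start + etc_d                       # end of the new slot

    shifted = []
    for start, finish in movable:
        if start < cursor:                            # overlap: push later
            length = finish - start
            start, finish = cursor, cursor + length
        cursor = finish
        shifted.append((start, finish))

    if cursor > est_can:                              # deadline violated
        return None
    return slot_start, fixed + shifted


# Assumed timeline: three duplicates with idle gaps of 1, 1 and 2 units.
timeline = [(0, 2), (3, 5), (6, 8)]
result = compress_for_duplicate(timeline, est_d=2, etc_d=3, est_can=10)
```

In the assumed timeline no single idle gap can hold a task of length 3, but shifting the second and third intervals later by 2 and 1 units respectively frees the slot from 2 to 5 while everything still finishes before time 10.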
The space compression concept means that the duplicated predecessors can be migrated downwards, which does not violate any data arrival time constraint, so that they become adjacent and several small scattered available time slots are spliced into a bigger one that satisfies the next duplication requirement. The rule of effective downward migration of duplicated predecessors is defined as follows:

Rule 3 (effective downward migration). Assume that the current candidate duplication task is T_d. Only the duplicated predecessors whose EST is later than that of T_d should be migrated downwards, one by one from top to bottom. Meanwhile, the completion time of the last duplicated predecessor T_k must never exceed EST(T_can, P_m), that is, AFT(T_k, P_m) ≤ EST(T_can, P_m). The downward migration terminates when a big enough available time slot has been formed or when AFT(T_k, P_m) = EST(T_can, P_m).

Theorem 2. Rule 3 leads to the most effective downward migration of the duplicated predecessors. Furthermore, if only the immediate predecessors of T_can are considered for duplication, this rule leads to the optimal local schedule result.

Proof. Without loss of generality, assume that T_2 is the first task in the duplicated predecessor sequence on a certain resource that starts after the EST of T_d; T_1 and T_3 are two duplicated predecessors located above and below T_2 respectively, as shown in Figure 10(1). Following Rule 3, in order to duplicate T_d, only T_2 and later duplicated predecessors need to be migrated downwards, as shown in Figure 10(2). Here the distance of the downward migration of T_2 is e - d, and that of T_3 is 0.

Fig. 10. The different result of downward migration.

1. Assume that downward migration starting from T_1 generates a big enough available time slot to duplicate T_d. Figure 10(3) indicates that the actual distance of the downward migration of T_1 is c - a, and c - a - (b - a) = ETC(T_d, P_m). According to the assumption, b > a, so c - a > ETC(T_d, P_m). Correspondingly, the distance of the downward migration of T_2 is c + ETC(T_1, P_m) - d. Therefore,

c + ETC(T_1, P_m) - d - (e - d) = c + ETC(T_1, P_m) - (a + ETC(T_1, P_m) + ETC(T_d, P_m)) = c - a - ETC(T_d, P_m) > 0.

Hence T_2 undergoes a needless extra downward migration. Although the EST of T_d is smaller than in Figure 10(2), two situations exist for the successive duplication, denoted as T_d-next. If EST(T_d-next) is earlier than a or later than h, the downward migration of T_1 has no effect. But if it lies between a and h, the next duplication needs to migrate more duplicated tasks downwards, so the migration of T_1 may lead to a worse result. So the downward migration of T_1 is ineffective. For the same reason, the downward migrations of all the upper duplicated tasks are also ineffective.

2. Assume that downward migration starting from T_3 generates a big enough available time slot to duplicate T_d. Figure 10(4) shows that the actual distance of the downward migration of T_3 is g - f > 0. But in Figure 10(2), T_3 need not be migrated downwards. Moreover, here EST(T_d, P_m) is d + ETC(T_2, P_m) > a + ETC(T_1, P_m). Therefore, not only does T_3 need to be migrated downwards over a large distance, but EST(T_d, P_m) is also delayed. Thus, downward migration starting from T_3 cannot obtain good performance. As in case 1, the successive duplication may also be badly affected by a migration starting from T_3. Thus, the downward migrations of all the lower duplicated predecessors are also ineffective.

In short, in the downward migration-based duplication process, the order of duplications is based on the data arrival times of the immediate predecessors, so the optimal result can be obtained by following Rule 3. It also obtains a better result than the one considered optimal in the previous width-first strategies.

According to this strategy, the duplication result for the scenario example is shown in Figure 9(5), and EST(T_can) is reduced further. In summary, the locally optimized task duplication strategy can completely eliminate forbidden duplication, and the local schedule performance is optimized by effectively using the available time of the resource to duplicate more predecessors.

Global optimization-based effective duplication

The previous duplication strategies mainly duplicate certain predecessors with the evaluation criterion that T_can should finish its execution as early as possible. Therefore, they can only obtain a locally optimal schedule. However, the duplicated tasks occupy some available time on a certain resource, so there is a trade-off between reducing the inter-task communication cost and using the available time of that resource. Thus, although EST(T_can) can be reduced further by duplicating a certain predecessor, the duplication may delay the completion time of the following unscheduled tasks and even the final SL. This is the so-called redundant duplication. Unfortunately, previous algorithms always neglected how to eliminate this kind of ineffective duplication.

Consequently, in order to detect and eliminate redundant duplication effectively, the influence of task duplication on the final SL should be taken into consideration. In our work, the DCPL calculation method described above is adopted to estimate the final SL. After duplicating each predecessor following the local effective duplication strategy, the current DCPL is calculated to detect and eliminate redundant duplication, which yields the globally optimized duplication scheme with the minimum final SL. However, during the effective duplication process the DCPL may not decrease monotonically. Therefore, in order to capture the whole picture of the duplication process, every DCPL obtained after each local effective duplication must be examined.
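Because the DCPL need not decrease monotonically, stopping at the first increase could miss a later, lower value. A minimal sketch of the exhaustive check follows; the function name and the assumption that duplications are applied cumulatively (so that keeping the "first k" is meaningful) are ours, and the DCPL values in the example are invented for illustration.

```python
def best_duplication_prefix(dcpl_no_dup, dcpl_after):
    """Pick how many local effective duplications to keep.

    dcpl_no_dup: DCPL when the candidate is placed with no duplication.
    dcpl_after:  dcpl_after[k-1] is the DCPL measured after the first k
                 duplications have been applied in order.

    Returns (k, dcpl) where k = 0 means "no duplication". Ties keep the
    smaller k, i.e. fewer duplicates occupying the resource.
    """
    best_k, best = 0, dcpl_no_dup
    for k, value in enumerate(dcpl_after, start=1):
        if value < best:
            best_k, best = k, value
    return best_k, best


# Assumed trace: the DCPL rises after the 2nd duplication, then falls.
choice = best_duplication_prefix(30, [28, 31, 26, 27])
```

In this assumed trace a rule that stops at the first increase would keep one duplication (DCPL 28), whereas examining every value keeps three (DCPL 26).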
Then, according to the minimum value of DCPL, the globally optimized duplication scheme is obtained and redundant duplication is eliminated effectively. The corresponding equation can be defined as

$$\min\{\min(DCPL_{dup}(T_d, P_n),\ DCPL_{no\text{-}dup}(P_n))\} \quad \text{s.t. } P_n \in P \ \&\ T_d \in D \tag{11}$$

where D denotes the duplicated-predecessors set. The details of the duplication-based resource allocation strategy are described in Figure 11.

Fig. 11. The procedure of task duplication based resource allocation strategy.

Details of DCPED Algorithm and Scheduling Examples Analysis

The DCPED scheduling algorithm is formalized in Figure 12. In the DCPED algorithm, the number of tasks belonging to the R and PR sets is much less than t. In the DCPL calculation, most tasks need only a small change based on the previous calculation. Therefore, the complexity of the DCPL calculation is about O(pt). In the task selection strategy, the recursion is executed t times in the worst case, so its complexity is about O(pt^2). Moreover, in the duplication-based resource allocation stage, we can assume that the number of duplications is about d_max, the maximum in/out degree of a task in the DAG. In the DCPL calculation after each duplication, no new ready or partially ready task joins the R and PR sets. Therefore, the complexity of this phase is about O((t + t) * d_max * p). Thus, the overall worst-case complexity of the DCPED algorithm comes out to be O(t * (pt + pt^2 + 2 d_max pt)), which is O(pt^3). In general, the DCPL calculation in the task duplication-based resource allocation phase can usually determine T_pre for the next schedule step, so the actual complexity of DCPED is less than O(pt^3). In comparison with the LDBS algorithms (O(p^3 e t^3)), the complexity of the DCPED algorithm is lower by at least an order.

In this paper, a scenario example of a complex workflow application is shown in Figure 13 to illustrate all improvements. Its structure has several common attributes, so various complex workflow applications can be viewed as derivations or simplifications of this structure.
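The step from the per-iteration costs to the overall bound stated above can be written out term by term. The derivation below is a sketch under one added assumption that the paper does not state explicitly: the maximum degree satisfies d_max < t, which holds for any DAG with t tasks.

```latex
\begin{aligned}
t\,\bigl(pt + pt^{2} + 2\,d_{\max}\,pt\bigr)
  &= pt^{2} + pt^{3} + 2\,d_{\max}\,pt^{2} \\
  &< pt^{2} + pt^{3} + 2\,pt^{3}
     && \text{(assuming } d_{\max} < t\text{)} \\
  &= pt^{2} + 3\,pt^{3} \\
  &= O(pt^{3}).
\end{aligned}
```

The cubic term comes from the task selection recursion, O(pt^2) per scheduling step over t steps; the duplication phase contributes at most the same order.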
The running trace and schedule result generated by DCPED are given in Figures 14 and 15(a), in comparison with the HEFT [6], E-DCP [10], HCPFD [14], LDBS [11] and HLD [7] algorithms (Figure 15(b-f)) mentioned in the related works, where E-DCP is the extension of the DCP algorithm to heterogeneous environments. In the running process of DCPED, scheduling T_5 onto P_1 with duplication of T_1 and T_2 obtains the minimum completion time of T_5. But according to the global optimization strategy, scheduling T_5 onto P_0 without any duplication yields the minimum DCPL. Therefore, the better schedule is obtained. While scheduling T_8, following the space compression-based effective duplication strategy, DCPED migrates T_4 downwards to gain a big enough available time slot to duplicate T_5, so EST(T_8) is clearly reduced. In general, DCPED can accurately select the candidate task and at the same time achieve a more effective duplication by using the local and global optimization strategies. Thus, a much better performance is obtained; the corresponding final SL is 23. Compared with DCPED, the other strategies cannot calculate the critical path exactly and usually ignore ineffective duplication, so their schedule performance is not as good. Their final SLs are as follows: HEFT (29), E-DCP (29), HCPFD (25), LDBS (25) and HLD (25).

Fig. 12. The main procedure of DCPED algorithm.

Fig. 13. The task graph of a scenario complex-workflow application and execution time matrix.

Fig. 14. Running trace of DCPED algorithm.

Fig. 15. Schedule results by different strategy.

This scenario example reflects all the novel improvements mentioned in Section 4. Although not every complex workflow application covers all these characteristics, any one of them can improve the scheduling result independently; in the worst case, when none of these improvements applies, DCPED degenerates to the previous algorithms.

5. Simulation and Discussion

In this section, the performance of the DCPED algorithm is presented in comparison with the five recently proposed scheduling algorithms mentioned in Section 2: HLD, LDBS, HCPFD, E-DCP and HEFT. For this purpose, we consider two sets of application graphs as the workload for testing these algorithms: irregular graphs and regular graphs. The

irregular graphs are generated randomly according to the relevant parameters, and the regular graphs, which have fixed structures, follow real-world numerical workflow applications such as Gauss elimination [6], fork join [18] and mean value analysis [9].

Performance Metrics and Simulation Graph Generation

The comparisons of these scheduling algorithms are based on the following four metrics [6,19]:

Normalized schedule length (NSL):

$$NSL = \frac{SL}{\sum_{T_i \in CP_{static}} \min_{R_j \in R} ETC(T_i, R_j)} \tag{12}$$

The denominator is the sum of the minimum computation costs of the tasks located on the static CP. The NSL of a graph is never less than one. The algorithm that gives the lower NSL for a graph is the better algorithm.

Speedup:

$$Speedup = \frac{\min_{R_j \in R} \sum_{T_i \in T} ETC(T_i, R_j)}{SL} \tag{13}$$

The numerator is the minimum total execution time obtained by assigning all DAG tasks to one single resource.

Number of occurrences of better quality of schedules: the number of times that each algorithm gives a better, worse or equal schedule result compared with the other algorithms is counted and evaluated in this paper.

Algorithm running time: the running time of each algorithm with respect to the DAG size.

Simulation Graph Generation

For the generation of experimental DAGs, the relevant parameters are defined as follows:

1. DAG size n: the number of tasks in the DAG.
2. Shape: the shape factor of the DAG. Using this parameter, the width of each level of the task graph can be obtained, and the range of this value is (0, n shape];
3. D+: the maximum outdegree in the DAG;
4. CCR: the ratio of communication to computation cost.
5. Heterogeneity factor chf: it reflects the extent of the heterogeneity of the DAG.
6. CM: the mean value of the computation cost. The mean execution time of each task, ETC(T_i), is generated from a uniform distribution with expectation CM. Furthermore, the range of the ETC(T_i, R_j) value is

$$\left[ETC(T_i)\left(1 - \frac{chf}{2}\right),\ ETC(T_i)\left(1 + \frac{chf}{2}\right)\right] \tag{14}$$

In each simulation, these parameters are assigned values selected from the sets given in the following section to generate the required graphs.

Performance Comparison of Irregular Graphs

The irregular workflow graphs are considered first in our simulation. In order to generate the random graphs, the DAG generation parameters mentioned above are assigned values from the sets shown in Figure 16. These combinations give 1440 different DAG types. Since 10 random DAGs were generated for each DAG type, the total number of DAGs used in the irregular graph simulation was around 14.4 k.

Fig. 16. The setting of the randomly graphs generation parameters.

The first simulation compares the average NSL and speedup of the six algorithms with respect to the DAG size while CCR is 2.0 and the resource number is 8, as shown in Figure 17(a) and (b). As the DAG size increases, the NSL and speedup of each algorithm increase, and DCPED always produces a better performance than any other algorithm. This is because

when the DAG size is small, the difference between the schedule results of the algorithms is small. As the DAG size increases, the advantage of critical path and duplication-based algorithms becomes obvious. In the duplication stage, because it duplicates only a single immediate predecessor, HCPFD cannot obtain a very good schedule result. By contrast, LDBS and HLD can duplicate more immediate predecessors than HCPFD, so they outperform HCPFD, especially when the DAG size is large. Moreover, the HLD algorithm uses the critical path technique to assign task priorities, so it obtains a better schedule result than LDBS. Of E-DCP and HEFT, neither of which considers duplication, E-DCP with its DCP-based strategy is superior to HEFT. The DCPED algorithm, because it uses the DCPED-based strategy, outperforms all other algorithms (12 per cent better than HLD, 15 per cent better than LDBS, 23 per cent better than HCPFD, 31 per cent better than E-DCP and 34.5 per cent better than HEFT).

Fig. 17. The performance of randomly generated graphs with respect to DAG size, CCR and resource number.

The next simulation compares the average NSL and speedup of the six algorithms with respect to the CCR value where the DAG size is 160. The details are shown in Figure 17(c) and (d). As the CCR value increases, the NSL of each algorithm increases, whereas the speedup decreases. The reason is that, as CCR increases, the communication cost becomes the dominant part of DAG execution. In the non-duplication-based algorithms, in order to eliminate the communication cost, tasks with data dependencies are scheduled on the same resource, which may lead to a large NSL and a load imbalance among resources. In the duplication-based algorithms, the predecessors of T_can can be duplicated to decrease the communication cost. On the other hand, as CCR increases, the total execution time of the tasks stays fixed, so the speedup decreases drastically. As in the first simulation, LDBS and HLD outperform HCPFD. In this situation, as DCPED makes good use of resource idle time to eliminate communication cost, it produces a better schedule result (NSL) than HLD by 10 per cent, LDBS by 15 per cent, HCPFD by 30 per cent, E-DCP by 39 per cent and HEFT by 49 per cent.

The third simulation compares the average NSL and speedup of the six algorithms with respect to the resource number where the DAG size is 200 and CCR is 2.0. The results are shown in Figure 17(e) and (f). As the resource number increases, the NSL of each algorithm decreases, whereas the speedup increases. But when the resource number exceeds a certain value, the NSL and speedup of each algorithm no longer change.
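The two metrics compared in these simulations, NSL (Equation (12)) and speedup (Equation (13)), can be computed directly from an execution time matrix. The sketch below assumes the matrix is a nested dictionary `etc[task][resource]` and that the static critical path is given as a list of task names; these representations and the numbers in the example are our own, not taken from the paper.

```python
def nsl(schedule_length, etc, static_cp):
    """Normalized schedule length: SL divided by the sum, over tasks on
    the static critical path, of each task's minimum execution time."""
    lower_bound = sum(min(etc[task].values()) for task in static_cp)
    return schedule_length / lower_bound


def speedup(schedule_length, etc):
    """Best single-resource sequential time divided by SL."""
    resources = next(iter(etc.values())).keys()
    best_sequential = min(
        sum(etc[task][r] for task in etc) for r in resources
    )
    return best_sequential / schedule_length


# Assumed 3-task, 2-resource example.
etc_matrix = {
    "T1": {"R1": 4, "R2": 6},
    "T2": {"R1": 5, "R2": 4},
    "T3": {"R1": 2, "R2": 2},
}
example_nsl = nsl(8, etc_matrix, static_cp=["T1", "T3"])
example_speedup = speedup(8, etc_matrix)
```

In this assumed example the critical-path lower bound is 4 + 2 = 6 and the best sequential time is 11 (all tasks on R1), so a schedule of length 8 has an NSL of 8/6 and a speedup of 11/8.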
In this situation, duplication-based algorithms usually outperform the non-duplication-based algorithms. In detail, DCPED produces a better schedule result than any other algorithm (11 per cent better than HLD, 15 per cent better than LDBS, 20 per cent better than HCPFD, 26.5 per cent better than E-DCP and 30 per cent better than HEFT).

We also measured the running times of all algorithms on a DELL 5150 PC. These times are plotted in Figure 18. Most of the time, HEFT and HCPFD are faster than the rest of the algorithms. The running time of the DCPED algorithm is slightly larger than that of the E-DCP and HLD algorithms. However, since the main objective of pervasive computing is to obtain a better user-centric performance, the minimum schedule length is the much more important aspect; a slightly longer running time in exchange for a much better schedule result is acceptable.

Fig. 18. The running time of algorithms with respect to DAG size.

Finally, in this part of the simulation, the number of times each algorithm produced a better, worse or equal result compared with every other algorithm is counted over about 14.4 k DAGs in Figure 19. A white rectangle in this figure represents the comparison between one algorithm on the left side and another algorithm on the top side. In each rectangle, three values give the number of times the algorithm on the left side produced a better, worse and equal schedule result compared with the algorithm on the top side.

Performance Comparison of Regular Graphs

In addition to the irregular task graphs, we also considered regular graphs of three real-world problems: Gauss elimination [6], Lobe fork join [19] and mean value analysis graph [10]. The fixed structures of these kinds of graphs are shown in Figure 20.

Gauss elimination graph

The structure of the application graph is defined in Reference [6] and shown in Figure 20. The number of tasks n and the number of graph levels l depend on the matrix size m.
Therefore, only the execution and communication cost generation parameters were selected randomly. The total number of tasks n in a Gauss elimination graph is equal to (m^2 + m - 2)/2. Figure 21(a) and (b) give the average NSL and speedup of each algorithm at matrix sizes from 5 to 20, in increments of 5. The smallest graph in this simulation has 14 tasks and the largest one has 209 tasks. As the structure is fixed, the DAG shape and in/out degree parameters need not be set; the rest of the parameters are set as in Figure 16. In a Gauss graph, there is a path that has the largest number of tasks. Therefore, critical path-based algorithms produce a better performance. Meanwhile, as each task has only two immediate predecessors, the performance difference between duplication-based algorithms is small. Additionally, in this situation, the HLD algorithm degenerates to HCPFD. Since the DCPED algorithm considers both the critical path and duplication, it produces the most effective schedule result.

Fig. 19. Average comparison of these algorithms in terms of better, worse and equal performance.

Fig. 20. Elementary task graph structures for some regular parallel numerical workflow applications.

Fork-join graph

The structure of the application graph is defined in Reference [19] and shown in Figure 20. The characteristic of this kind of graph is that the number of DAG levels is odd, where each odd level has only one task and each even level has more than one task. In the generation process, the number of tasks on each even level is generated, and the rest of the parameters are set as in Figure 16. Figure 21(c) and (d) give the average NSL and speedup of each algorithm with respect to DAG sizes from 40 to 200. In a fork-join graph, as each odd level has only one task, the difference between critical path-based and non-critical path-based algorithms is small. However, there may be many tasks on each even level, each with the same single immediate predecessor, so the multi-predecessor duplication-based algorithms are more effective.
Thus, in this situation, HLD and LDBS obtain close schedule results and both outperform HCPFD. On the other hand, as the DCPED algorithm not only considers multi-predecessor duplication like LDBS and HLD but also uses the space compression technique to gain more chances to duplicate, it leads to the best schedule performance.

Mean value graph

The structure of the application graph is defined in Reference [10] and shown in Figure 20. The number of levels l of a mean value graph is odd. The total number of tasks n in a mean value analysis graph is equal to (⌈l/2⌉)^2. Figure 21(e) and (f) give the average NSL and speedup of each algorithm at ⌈l/2⌉ values from 5 to 13, in increments of 2. In the generation process, the parameters are set as for the Gauss graphs. In a mean value graph, there may be several parallel paths, which indicates that critical path-based algorithms should obtain the better performance. On the other hand, as in the Gauss graph, each task has only two immediate predecessors, so the difference between duplication-based algorithms is small. By considering both the dynamic critical path and effective duplication, the DCPED algorithm is still superior to all other algorithms. Additionally, as in the scheduling of Gauss elimination graphs, HLD degenerates to the HCPFD algorithm.

Fig. 21. The performance of regular graphs with respect to the matrix size. (a), (b) About Gauss elimination graphs, (c), (d) about the fork join graphs and (e), (f) about the mean value graphs.
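The task counts of the two fixed-size regular graphs follow directly from their size parameters. The sketch below uses the Gauss elimination formula, which agrees with the 14 and 209 tasks reported for matrix sizes 5 and 20; for the mean value graph it assumes that the bracket in the task-count formula is a ceiling, which is our reading of the garbled source. The function names are hypothetical.

```python
import math


def gauss_task_count(m):
    """Tasks in a Gauss elimination graph for matrix size m:
    (m^2 + m - 2) / 2. The numerator is always even."""
    return (m * m + m - 2) // 2


def mean_value_task_count(levels):
    """Tasks in a mean value analysis graph with an odd number of
    levels, assuming n = ceil(levels / 2) ** 2."""
    half = math.ceil(levels / 2)
    return half * half


gauss_sizes = [gauss_task_count(m) for m in (5, 10, 15, 20)]
mean_value_sizes = [mean_value_task_count(l) for l in (9, 13, 17, 21, 25)]
```

The level counts 9 to 25 in the usage lines correspond to the ⌈l/2⌉ values 5 to 13 used in the simulation, under the same ceiling assumption.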

Conclusion and Future Works

In this paper, for scheduling complex workflow applications in a pervasive computing environment, a novel task graph scheduling algorithm, DCPED, is presented to solve two problems of previous scheduling algorithms in such environments: inaccurate calculation of the critical path and ineffective duplication. In DCPED, the candidate task is obtained by using a DCP strategy that takes resource availability into consideration. In the task duplication-based resource allocation stage, an effective task duplication algorithm is proposed to eliminate ineffective duplications, which are divided into forbidden duplication and redundant duplication; meanwhile, more predecessors can be duplicated effectively to obtain a better schedule result by using the space compression and DCPL evaluation techniques. Each of these strategies can improve the scheduling result independently, and in the worst case, when no improvement applies, the result should be no worse than that of the previous algorithms.

The experimental study used a large set (around 14.4 k) of randomly generated graphs with various generation parameters, several real-world workflow application graphs (Gauss elimination, fork join and mean value analysis) and a simulated heterogeneous network environment. In it, the DCPED algorithm significantly outperformed all of the previous algorithms in terms of performance metrics including average normalized SL and speedup, with acceptable running times. Therefore, DCPED is a much more effective scheduling algorithm in the pervasive computing environment, especially for large-scale and fine-grain workflow applications.

As mentioned in Section 1, in order to solve the inexact estimation problem of compile-time scheduling, combined compile-time and run-time scheduling techniques will be considered in our future works.
At the initial stage, a pre-scheduling scheme will be obtained by using a sophisticated compile-time scheduling approach such as DCPED. Then, at the run-time stage, a rescheduling technique will adjust the former schedule by using real-time schedule information to adapt to the dynamic computing environment. In the rescheduling process, there are overheads for retrieving the real-time information, running the reschedule algorithm and assigning the task to the proper resource according to the new scheme. To hide these overheads, a task should not be rescheduled until its immediate predecessors are starting to run. Thus, the retrieval of real-time schedule information and the execution of the task reschedule can be overlapped with the execution of the immediate predecessors. However, as this approach cannot use the real-time execution information of the immediate predecessors, some estimation error still exists.

Meanwhile, with the development of estimation techniques, better performance knowledge estimation can be obtained. But most current scheduling algorithms only consider a snapshot value of resource performance when they make a pre-scheduling estimate. Therefore, a suitable approach that can exploit dynamic estimation information needs to be explored to obtain a good balance in the trade-off mentioned above. On the other hand, in the previous algorithms, the rescheduling process of a certain task is called instantly when the reschedule conditions mentioned above are satisfied; this neglects the task graph structure, and the whole application may not be scheduled optimally. Thus, an intermediate solution that considers both the task graph structure and the dynamic behaviour of the environment needs to be explored. Consequently, these new ideas may lead to a robust scheduling approach that adapts to the dynamism of the pervasive computing environment, and they may be the main direction of our future works.
Acknowledgements

This work is supported by National Natural Science Foundation of China under Grants No and , Jiangsu Provincial Natural Science Foundation of China under Grants No. BK , Jiangsu Provincial Key Laboratory of Network and Information Security under Grants No. BM and Key Laboratory of Computer Network and Information Integration, Ministry of Education of China, under Grants No. 93K-9.

References

1. Weiser M. The computer of the 21st century. Scientific American 1991; 265(3): aura/
5. Luo J, Dong F, Cao J. Multicontext-aware resource recommendation mechanism for service-oriented ubiquitous learning environment. In Proceeding of the Third International Conference on Pervasive Computing and Applications (ICPCA 08),
6. Topcuoglu H, Hariri S. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems 2002; 13(3):
7. Bansal S, Kumar P, Singh K. Dealing with heterogeneity through limited duplication for scheduling precedence constrained task graphs. Journal of Parallel and Distributed Computing 2005; 65:
8. Baskiyar S, SaiRanga P. Scheduling directed A-cyclic task graphs on heterogeneous network of workstations to minimize schedule length. In Proceeding of the 2003 International Conference on Parallel Processing Workshops (ICPPW 03),
9. Daoud MI, Kharma N. Efficient compile-time task scheduling for heterogeneous distributed computing systems. In Proceedings of the 12th International Conference on Parallel and Distributed Systems (ICPADS 06),
10. Kwok YK, Ahmad I. Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors. IEEE Transactions on Parallel and Distributed Systems 1996; 7(5):
11. Dogan A, Ozguner F. LDBS: a duplication based scheduling algorithm for heterogeneous computing systems. In Proceeding of the International Conference on Parallel Processing (ICPP 02),
12. Liu CH, Li CF. A dynamic Critical Path Duplication Task Scheduling Algorithm for Distributed Heterogeneous Computing Systems. In Proceedings of the 12th International Conference on Parallel and Distributed Systems (ICPADS 06),
13. Maheswaran M, Siegel HJ. A dynamic matching and scheduling algorithm for heterogeneous computing systems. In Proceedings of the Seventh Heterogeneous Computing Workshop,
14. Hagras T, Janecek J. A high performance, low complexity algorithm for compile-time task scheduling in heterogeneous systems. In Proceeding of the 18th International Parallel and Distributed Processing Symposium (IPDPS 04),
15. Li G, Zhang Y. Scalable duplication strategy with bounded availability of processors. In Proceedings of the 10th International Conference on Parallel and Distributed Systems (ICPADS 04),
16. Wieczorek M, Prodan R, Fahringer T. Scheduling of scientific workflows in the ASKALON grid environment. SIGMOD Record 2005; 34(3):
17. Bansal S, Kumar P. An improved duplication strategy for scheduling precedence constrained graphs in multi-processor systems. IEEE Transactions on Parallel and Distributed Systems 2003; 14(6):
18. Park GL, Shirazi B. DFRN: a new approach for duplication based scheduling for distributed memory multiprocessor systems. 11th International Parallel Processing Symposium (IPDPS 97),
19. Ahmad I, Kwok YK. On exploiting task duplication in parallel program scheduling. IEEE Transactions on Parallel and Distributed Systems 1998; 9(9):
20. Park CI, Choe TY. An optimal scheduling algorithm based on task duplication. IEEE Transactions on Parallel and Distributed Systems 2002; 51(4):
21. Bajaj R, Agrawal DP. Improving scheduling of tasks in a heterogeneous environment. IEEE Transactions on Parallel and Distributed Systems 2004; 15(2):
22. Mandal A, Kennedy K, Koelbel C. Scheduling strategies for mapping application workflows onto the grid. In Proceedings of 14th IEEE International Symposium on High performance distributed computing, 2005, HPDC-14, July 2005;

Authors' Biographies

Junzhou Luo is currently a professor and the dean in School of Computer Science and Engineering, Southeast University, China. He received his M.S. and Ph.D. degrees in Computer Science from the Southeast University, China in 1992 and 2000, respectively. His current research interests include pervasive computing, network security, service computing and protocol engineering.

Fang Dong is currently a Ph.D. student in School of Computer Science and Engineering, Southeast University, China. He received his B.S. and M.S. degrees in Computer Science from Nanjing University of Science & Technology, China in 2004 and 2006, respectively. His current research interests include pervasive computing, service computing and task scheduling.

Jiuxin Cao is currently an associate professor in School of Computer Science and Engineering, Southeast University, China. He received his M.S. degree in Computer Application from Henan University of Science and Technology, China in 1999 and his Ph.D. degree in Computer Science from Xi'an Jiaotong University, China. His current research interests include pervasive computing, service computing and e-learning.

Aibo Song is currently an associate professor in School of Computer Science and Engineering, Southeast University, China. He received his M.S. and Ph.D. degrees in Computer Science from the Southeast University, Nanjing, China in 1996 and 2003, respectively. His current research interests include pervasive computing and grid computing.


Branch-and-Bound Algorithms for Constrained Paths and Path Pairs and Their Application to Transparent WDM Networks Branch-and-Bound Algorithms for Constrained Paths and Path Pairs and Their Application to Transparent WDM Networks Franz Rambach Student of the TUM Telephone: 0049 89 12308564 Email: rambach@in.tum.de

More information

Chapter 2: Number Systems

Chapter 2: Number Systems Chapter 2: Number Systems Logic circuits are used to generate and transmit 1s and 0s to compute and convey information. This two-valued number system is called binary. As presented earlier, there are many

More information

Improving Connectivity via Relays Deployment in Wireless Sensor Networks

Improving Connectivity via Relays Deployment in Wireless Sensor Networks Improving Connectivity via Relays Deployment in Wireless Sensor Networks Ahmed S. Ibrahim, Karim G. Seddik, and K. J. Ray Liu Department of Electrical and Computer Engineering, and Institute for Systems

More information

CHAPTER 5 PROPAGATION DELAY

CHAPTER 5 PROPAGATION DELAY 98 CHAPTER 5 PROPAGATION DELAY Underwater wireless sensor networks deployed of sensor nodes with sensing, forwarding and processing abilities that operate in underwater. In this environment brought challenges,

More information

Guaranteeing Heterogeneous Bandwidth Demand in Multitenant Data Center Networks

Guaranteeing Heterogeneous Bandwidth Demand in Multitenant Data Center Networks 1648 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 23, NO. 5, OCTOBER 2015 Guaranteeing Heterogeneous Bandwidth Demand in Multitenant Data Center Networks Dan Li, Member, IEEE, Jing Zhu, Jianping Wu, Fellow,

More information

Figure : Example Precedence Graph

Figure : Example Precedence Graph CS787: Advanced Algorithms Topic: Scheduling with Precedence Constraints Presenter(s): James Jolly, Pratima Kolan 17.5.1 Motivation 17.5.1.1 Objective Consider the problem of scheduling a collection of

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms Given an NP-hard problem, what should be done? Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one of three desired features. Solve problem to optimality.

More information

CSE 421 Applications of DFS(?) Topological sort

CSE 421 Applications of DFS(?) Topological sort CSE 421 Applications of DFS(?) Topological sort Yin Tat Lee 1 Precedence Constraints In a directed graph, an edge (i, j) means task i must occur before task j. Applications Course prerequisite: course

More information

09/28/2015. Problem Rearrange the elements in an array so that they appear in reverse order.

09/28/2015. Problem Rearrange the elements in an array so that they appear in reverse order. Unit 4 The array is a powerful that is widely used in computing. Arrays provide a special way of sorting or organizing data in a computer s memory. The power of the array is largely derived from the fact

More information

IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL., NO., MONTH YEAR 1

IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL., NO., MONTH YEAR 1 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL., NO., MONTH YEAR 1 An Efficient Approach to Non-dominated Sorting for Evolutionary Multi-objective Optimization Xingyi Zhang, Ye Tian, Ran Cheng, and

More information

Structural Advantages for Ant Colony Optimisation Inherent in Permutation Scheduling Problems

Structural Advantages for Ant Colony Optimisation Inherent in Permutation Scheduling Problems Structural Advantages for Ant Colony Optimisation Inherent in Permutation Scheduling Problems James Montgomery No Institute Given Abstract. When using a constructive search algorithm, solutions to scheduling

More information