An Approach to Mapping Scientific Workflow in Cloud Computing data centers to Minimize Costs of Workflow Execution

A. Zareie (Computer Engineering Department, Islamic Azad University, Arak Branch, Arak, Iran; ahmadzr8@gmail.com), M.M. Pedram (Computer Engineering Department, Tarbiat Moallem University, Karaj/Tehran, Iran; pedram@tmu.ac.ir), M. Kelarestaghi (Computer Engineering Department, Tarbiat Moallem University, Karaj/Tehran, Iran; kelarestaghi@tmu.ac.ir), A. Kosari (Computer Engineering Department, Islamic Azad University, Science and Research Branch, Tehran, Iran; a_kosari1983@srbiau.ac.ir)

Abstract: Executing a workflow application usually requires powerful computing and storage resources. With the expansion of distributed systems, and of cloud computing systems in particular, an appropriate context is now available for executing such applications. The interconnection between tasks, and the data they share and transfer to complete the computation, has become a major challenge. An effective execution of a workflow application therefore requires a schedule that defines how data and tasks are distributed across the system. In this paper, we present an approach for mapping the data and tasks of a workflow onto the data centers of a cloud system so that the total time spent on task execution and data movement is minimized; in other words, the approach aims at a trade-off between these two goals. Simulations demonstrate that the approach fulfills the stated goals effectively.

Keywords: cloud computing, scientific workflow, scheduling, data management, parallel processing

1. Introduction

A large amount of data is produced during the execution of a workflow application. On the one hand, executing the tasks serially takes a long time; on the other hand, because of the large amount of data in this type of

applications, they are executed in parallel on a distributed system. In this setting, all the data a task needs must be gathered on one system before the task can start, so executing a workflow application involves some data movement. Communication cost is a serious concern in distributed systems, especially for scientific and commercial workflow applications that require large amounts of data to be stored in, and transferred between, distributed data centers [1].

Cloud computing is one of the newer distributed systems, in which services are delivered to external users on demand via the Internet [2]. In effect, a cloud computing system provides fast access to a large number of complex, distributed computers throughout the world. Users can access their data and applications at any time and place, in different economic, commercial and scientific fields, and reduce or eliminate the storage and computation costs of their private sites. A cloud computing system is composed of several geographically distributed data centers, each containing computing and storage servers and other physical devices that are provisioned dynamically according to need [3]. Given its scalability and high processing power, a cloud computing system is an appropriate environment for executing scientific workflow applications. Scientific workflow applications can benefit widely from the advantages of cloud computing, but doing so raises some challenges:

1. the movement of large amounts of data, and congestion on the communication paths between data centers;
2. the need for a task distribution mechanism that decreases the total completion time of the workflow.

We have tried to provide an approach that decreases data movement and the total completion time of the workflow by using the critical path technique. Task scheduling is an important challenge in distributed systems, and different approaches have been presented for it.
In 2007, Kaya et al. presented an approach for scheduling independent tasks in a distributed system in which the only relationship between tasks is the need to access common files, with no other dependence between tasks [4]. In [8], a genetic algorithm was presented for decreasing the execution time of a task graph and balancing load in the system. In [9], a task scheduling approach was presented for meeting task completion deadlines while optimizing execution cost. For scheduling workflow tasks, the approach in [10] was suggested to decrease the schedule length and improve the reliability of application execution in a heterogeneous distributed environment. In [11], a method was provided for scheduling workflow tasks on processors with different capabilities. Among other mapping approaches, [12, 13] can be mentioned, in which independent tasks are allocated in a heterogeneous distributed environment so as to boost reliability. Different placement strategies have been presented for data placement in distributed systems. In [5], a scheduler was presented for grid systems in which data placement activities can be queued, scheduled

and managed. Also, in [14, 15], strategies for data mapping in distributed environments were presented. Kosar [7] presented a method for data placement in widely distributed systems. We intend to present an approach for mapping a scientific workflow onto a cloud computing environment so as to minimize the data movement and the schedule length of the workflow.

2. Problem Analysis

A workflow application is an example of a task-execution precedence graph, and contains many tasks. A directed acyclic graph (DAG) is used to indicate the priorities between tasks and the amount of data to be transferred between them. The workflow graph is defined as follows:

- T = {t1, t2, ..., tn} is the set of tasks (activities), shown as nodes; the execution time of task ti is denoted et(ti).
- R is the set of edges, representing the dependences and priorities between the tasks. An edge (ti, tj) ∈ R means that data produced by task ti is required during the execution of tj; therefore ti is prior to tj. The weight of the edge, denoted c(i,j), represents the amount of this data, and the time taken for tj to receive the data produced by ti is denoted tt(i,j) = c(i,j) × b, where b is the network bandwidth delay factor. If (ti, tj) ∈ R and ti and tj are mapped onto the same data center, there is no need to move this data between centers; tt(i,j) is then taken to be zero, and consequently the delay between the end of ti and the start of tj becomes zero.

A task node can start executing only when the execution of its parent nodes has been completed and all the required data has been gathered in one data center. In this work, a node with no parent is called an input node, denoted fn, and a node with no child is called an exit node, denoted ln. Fig. 1 shows a sample workflow graph.
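As a hedged illustration of this model (the names `et`, `c`, `transfer_time` and the numbers are ours, not the paper's), a small Python sketch of a workflow DAG with task execution times, edge data amounts, and a transfer time that is zero inside a single data center:

```python
# Illustrative sketch of the workflow model: tasks with execution times,
# edges weighted by the amount of data passed between tasks, and a
# transfer time that vanishes when both endpoints share a data center.

# et[task] = execution time of the task
et = {"t1": 4, "t2": 3, "t3": 2, "t4": 5}

# c[(ti, tj)] = amount of data produced by ti and consumed by tj
c = {("t1", "t2"): 10, ("t1", "t3"): 6, ("t2", "t4"): 8, ("t3", "t4"): 4}

B = 0.5  # network bandwidth delay factor (time units per data unit)

def transfer_time(edge, mapping):
    """Time to move the data of `edge` given a task -> data-center mapping.
    Zero when producer and consumer are in the same data center."""
    ti, tj = edge
    if mapping[ti] == mapping[tj]:
        return 0.0
    return c[edge] * B

mapping = {"t1": "dc1", "t2": "dc1", "t3": "dc2", "t4": "dc2"}
print(transfer_time(("t1", "t2"), mapping))  # same center -> 0.0
print(transfer_time(("t1", "t3"), mapping))  # 6 * 0.5 = 3.0
```

The mapping dictionary plays the role of the `map` relation of the paper: changing which tasks share a center directly changes the total transfer cost.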

Figure 1 - A sample workflow application graph.

As mentioned, a cloud computing system consists of several distributed data centers. The k-th data center is denoted d_k and its storage capacity cap_k; if task ti is selected for mapping onto center d_k, this is expressed as map(ti, d_k). Fig. 2 shows the two-layer cloud computing topology used in our study ([Cheng et al., 2009], [Halsall, 1995] and [Pei et al., 1999]):

(1) The nodes in Layer-A receive the requests from users of different types of applications; they therefore have higher computational capability than the nodes in Layer-B, since they must process enormous amounts of data. In addition, the nodes of Layer-A can communicate directly with the other nodes in the same group.

(2) The nodes in Layer-B are the processing nodes.

Figure 2 - An example cloud computing topology.

Workflow applications are received by the nodes in Layer-A, which map them onto the data centers in Layer-B. The goal is to map the tasks so that:

(1) the amount of data transferred between data centers is minimized, i.e. the sum of the edge weights c(i,j) over all edges (ti, tj) ∈ R whose endpoints are mapped onto different data centers is minimal;

(2) the total execution time of the workflow is minimized: if the start time of task ti is denoted st(ti) and its finish time ft(ti), the second objective is to minimize the maximum of ft(ti) over all ti ∈ T.

The next section describes our mapping approach.

3. The Mapping Approach

Before introducing the presented algorithm, some terms must be defined:

Earliest Start Time: The earliest start time of a task ti, denoted es(ti), is the earliest possible time at which ti can begin. It is calculated over Pt, the set of all paths between the input node fn and ti (pt-num denotes the number of paths in this set, and the nodes of each path form a set whose size is likewise denoted), as:

es(ti) = max over p ∈ Pt of [ sum of et(tk) over the nodes tk of p before ti + sum of tt over the edges of p ]   (3)

Path: If tj can be reached from ti through a sequence of one or several edges, or vice versa, then ti and tj are said to be on the same path. The path with the largest total weight of nodes and edges is called the critical path.

The phases of the algorithm are discussed below.

4. The Presented Algorithm

In the presented algorithm, the tasks are divided into groups, and within each group the execution order of the tasks is determined. Then the data movement times and the storage space needed to execute each group's tasks are determined. According to the maximum space a group requires, it is mapped onto one of the available data centers. If the maximum space required by a group is so large that it cannot be mapped onto any of the data centers, some stored data must be moved from this group to another. The phases of the algorithm are:

1. The earliest start time is calculated for all tasks of the graph.
2. The critical path between the fn node and the ln node is found, and the nodes on this path are placed in the current group.
3. es is recomputed for all tasks, now that a group has been formed.
4. If there is a gap between the end of one task of the group and the start of the next, an optimum node is selected for placement in the gap from the successors and predecessors of the group's tasks, among those whose execution time is equal to or less than the gap length (relation (6)).
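Phases 1 and 3 repeatedly compute earliest start times over the DAG. A minimal sketch of such a computation in topological order (our code, not the authors'; the toy task data is illustrative):

```python
# Sketch: earliest start times of tasks in a DAG (assumed acyclic).
# et[t] = execution time of task t; tt[(u, v)] = transfer time on edge (u, v).
et = {"t1": 4, "t2": 3, "t3": 2, "t4": 5}
tt = {("t1", "t2"): 5, ("t1", "t3"): 3, ("t2", "t4"): 4, ("t3", "t4"): 2}

def earliest_start_times(et, tt):
    """es[t] = max over parents p of (es[p] + et[p] + tt[(p, t)]); 0 for inputs."""
    parents = {t: [] for t in et}
    for (u, v) in tt:
        parents[v].append(u)
    es = {}
    while len(es) < len(et):            # Kahn-style passes over ready tasks
        for t in et:
            if t not in es and all(p in es for p in parents[t]):
                es[t] = max((es[p] + et[p] + tt[(p, t)] for p in parents[t]),
                            default=0)
    return es

es = earliest_start_times(et, tt)
print(es)                               # earliest start of each task
print(max(es[t] + et[t] for t in et))   # finish time of the whole DAG: 21
```

On this toy graph the heaviest chain is t1 → t2 → t4, which is exactly the critical path the grouping phase would extract first.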

According to this relation, the optimum node is selected and placed in the empty space (the gap).

5. A sub-graph is created for each set of remaining connected, ungrouped nodes (if a sub-graph contains several input and exit nodes, the pair of nodes between which the most critical path stretches is taken as fn and ln). Steps 1 to 5 of the placement algorithm are then performed on each sub-graph, producing the next group. This cycle continues until all nodes of the workflow graph have been grouped.

6. Next, the transfer times of the data moved between the created groups are determined (relation (7)). The group-formation procedure gives each group a heavier load than the next one, since at every level the graph shrinks to a smaller sub-graph. We therefore arrange the transfers so that data moving to a group with a higher index is transferred as early as possible, while data moving to a group with a lower index is delayed as much as possible; this balances the storage load to some extent. If tasks ti and tj are in the same group, their intermediate data does not need to be transferred, which is marked by a transfer time of -1.

7. In this phase, the storage space required during the execution of each of the group's tasks is determined. The storage load of a group during the execution of task ti is computed by summing several components (relation (8)), each referring to data that must be held during the execution of ti:

If a task tj of the group is placed before task ti, this is written tj < ti; if tj is placed after ti, it is written tj > ti. The first component is the intermediate data produced by the group's tasks before ti that is still required by the group's tasks after ti. The second is the intermediate data produced by ti itself, which must be stored for the following tasks. The third is the intermediate data produced by other groups' tasks and needed by the group's tasks after ti, which has been transferred to the group before ti executes. The last is the data produced by the group's tasks before ti that will be transferred to another group after ti executes.

8. According to the maximum load of each group, if a data center exists that can supply this storage space, the group is mapped onto it. If no such data center exists, some of the data exceeding the data center's available capacity is transferred to the group with the lowest storage load and returned when it is needed. The data to transfer is selected as follows, assuming ti has the highest storage load in its group:

8-1. First, data satisfying inequation (9) is selected for transfer to the group with the lowest load, with repeated return to its own group. Inequation (9) is chosen so that this data can be moved away and brought back repeatedly without delaying the start time of any task; the max in the inequation is used because such a transfer decreases not only the load at ti but also the loads of the other tasks between the producer and the consumer.

8-2. If the task load still does not drop below the available capacity of the data center, the constraint of inequation (9) is dropped and other data is selected for transfer.

The pseudocode of this phase of the algorithm is as follows:

Placement phase of the mapping algorithm
1      insert all tasks of the DAG into the ungrouped set; k = 0
2      while there is an ungrouped task do
2-1        calculate the earliest start time and float time of all ungrouped tasks
2-2        find the critical path between the fn and ln nodes
2-3        place the tasks of the critical path in group k
2-4        recalculate the earliest start time and float time of all ungrouped tasks
2-5        for r = 1 to (group k).count - 1 do
2-5-1          if there is a gap between task r and task r+1 of group k
2-5-1-1            find the optimum node h according to relation (6)
2-5-1-2            place h in group k between tasks r and r+1
2-6        for the connected tasks that are not yet grouped, create sub-graph(s) and repeat steps 2-1 to 2-6 with k = k+1
3      for each edge (ti, tj) in R with ti in group k and tj in group l do
3-1        if l > k
3-1-1          transfer_time = as early as possible (relation (7))
3-2        else if l < k
3-2-1          transfer_time = as late as possible (relation (7))
3-3        else
3-3-1          transfer_time = -1
4      for each task in the DAG calculate s_load
5      for each group find max(s_load)
6      for each group do
6-1        b = true
6-2        if a data center with sufficient capacity exists
6-2-1          map(group, best data center)
6-3        else
6-3-1          while the load exceeds the capacity and data satisfying (9) remains do
6-3-1-1            find and transfer the largest such data to the group with the minimum load
6-3-1-2            if no data satisfying (9) remains
6-3-1-2-1              b = false

6-3-2      if (!b)
6-3-2-1        while the load still exceeds the capacity do
6-3-2-1-1          find and transfer the largest remaining data to the group with the minimum load
6-3-3      map(group, best data center)

Figure 3 - Pseudocode of the placement phase of the mapping algorithm.

In the lines of part 2 of this phase, the graph tasks are placed in groups and the execution order within each group is determined. In the lines of part 3, the movement times of the data to be transferred between groups are determined. In line 5, the task with the highest storage load during execution is found for each group. In the lines of part 6, each group is mapped onto an available data center; here, "best data center" means the data center with the smallest storage space that still satisfies the group's maximum need (i.e. max(s_load)). Only a percentage of a data center's storage space should be considered for workflow execution, since results that must be stored permanently, or other applications, may also need that data center.

5. Experiments and Results

5.1. Simulation strategy

The presented algorithm was evaluated using the C# programming language. Since no standard test instances are available for the presented approach, a random DAG generator was developed to generate application DAGs of different scales and structures; the algorithms were executed on these graphs and the results were plotted with Matlab. To evaluate the algorithm's efficiency and the amount of data movement, it was compared with the strategy that schedules each task to the data center holding most of the data it requires and stores the produced data in the local data center (i.e. where it was generated). This strategy represents the traditional data placement strategy of older distributed computing systems, i.e. clusters and early grid systems [6].
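The paper does not list the generator itself; as a sketch under our own assumptions (the function name and parameters are illustrative), a random DAG with a controllable branch factor can be produced by only adding edges from lower- to higher-numbered tasks, which guarantees acyclicity:

```python
import random

def random_dag(n_tasks, branch, max_et=10, max_data=20, seed=None):
    """Generate a random task DAG: each task gets a random execution time,
    and up to `branch` edges to later tasks carry random data amounts.
    Edges only go from lower- to higher-numbered tasks, so no cycles arise."""
    rng = random.Random(seed)
    et = {f"t{i}": rng.randint(1, max_et) for i in range(n_tasks)}
    c = {}
    for i in range(n_tasks - 1):
        # connect each task to a few random successors
        succs = rng.sample(range(i + 1, n_tasks),
                           k=min(branch, n_tasks - 1 - i))
        for j in succs:
            c[(f"t{i}", f"t{j}")] = rng.randint(1, max_data)
    return et, c

et, c = random_dag(20, branch=2, seed=42)
print(len(et))                                        # 20 tasks
print(all(int(u[1:]) < int(v[1:]) for (u, v) in c))   # True: acyclic by construction
```

Varying `branch` reproduces the branch-factor sweep of the first simulation, and `n_tasks` the task-count sweep of the second.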
The normalized schedule length (NSL), used for comparing schedule lengths [13], is defined as:

NSL = makespan / (sum of et(ti) over the critical path nodes)

where the makespan is the total execution time of the workflow and et(ti) is the execution (completion) time of task ti. Since the denominator is the total execution time of the critical path nodes, which is a lower bound on the schedule length, NSL is always greater than or equal to 1.

The normalized data movement (NDM), used for comparing movement amounts, is defined as:

NDM = data_movement / total_data

where data_movement is the amount of data moved by the algorithm's execution and total_data is the total amount of intermediate data in the workflow application, i.e. the maximum possible movement. Therefore 0 ≤ NDM ≤ 1.

In the results shown, every number is the average over 10 executions of the algorithm on different graphs with the same parameters.

5.2. Simulation results

In the first simulation, the algorithms were executed on graphs with the same parameters, and the results of the two algorithms were plotted against the branch factor between tasks and data. Fig. 4 shows the NDM and NSL values for the two algorithms.
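The two normalized metrics defined above are straightforward to compute once a schedule is known; a short sketch (our code, with illustrative inputs):

```python
def nsl(makespan, critical_path_et):
    """Normalized schedule length: makespan over the sum of the execution
    times of the critical-path nodes (a lower bound), so NSL >= 1."""
    return makespan / sum(critical_path_et)

def ndm(moved_data, total_intermediate_data):
    """Normalized data movement: moved data over the total intermediate
    data (the maximum possible movement), so 0 <= NDM <= 1."""
    return moved_data / total_intermediate_data

print(nsl(30, [4, 3, 5, 8]))   # 30 / 20 = 1.5
print(ndm(12, 48))             # 12 / 48 = 0.25
```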

Figure 4 - Comparison of the two approaches as the branch factor between the workflow's tasks and data changes.

Based on the results, the presented algorithm considerably decreases both NSL and NDM. The presented algorithm works on the critical path, which is affected by an increase in the branch factor, so NSL changes only slightly. When the branch factor is 2, the traditional data placement strategy has a lower NDM; but as the branch factor grows, this strategy's NDM becomes higher, since the amount of intermediate data increases while the data centers' storage space is limited.

In the next simulation, the algorithms were executed on graphs with the same parameters but different numbers of tasks. The NDM and NSL produced by the two algorithms are shown in Fig. 5.

Figure 5 - Comparison of the two approaches as the number of workflow tasks changes.

As can be deduced from the diagrams in Fig. 5, compared with the traditional data placement strategy, our algorithm yields better NDM and NSL as the number of tasks increases. As the number of workflow tasks grows, the number of tasks waiting for a receiving server in the traditional strategy becomes higher, since each task is scheduled to the data center that holds most of the data it requires.

6. Conclusion

Task scheduling and data placement are important issues in executing applications consisting of a sequence of tasks in a distributed system. Given the pay-per-use pricing of computation, storage and networking, and the geographical distribution and Internet dependence of cloud computing systems, algorithms for mapping the tasks and data of a scientific workflow application should avoid data movement as far as possible. Doing so provides several advantages: decreased network traffic, shorter completion and turnaround times, lower network connection costs for users, and improved service quality and reliability (considering the possibility of link failures or data loss on these lines) for cloud users. In addition, task execution should be scheduled so that, as far as possible, the total completion time of the application is minimal; this decreases the use of computation and storage services, which leads to user and client satisfaction. We therefore presented an approach that fulfills both purposes, and the experimental results indicate that it is successful. In this work, we did not use a data replication technique, because of the increase in the cost of using the cloud system it would cause.

References

[1] E. Deelman, A. Chervenak, Data management challenges of data-intensive scientific workflows, in: IEEE International Symposium on Cluster Computing and the Grid, 2008, pp.
687-692.
[2] R. Buyya, C. Yeo, S. Venugopal, J. Broberg, I. Brandic, Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility, Future Generation Computer Systems 25(6) (2009), pp. 599-616.

[3] R. Maggiani, Cloud computing is changing how we communicate, in: IEEE International Professional Communication Conference, 2009.
[4] K. Kaya, B. Uçar, C. Aykanat, Heuristics for scheduling file-sharing tasks on heterogeneous systems with distributed repositories, J. Parallel Distrib. Comput. 67 (2007), pp. 271-285.
[5] T. Kosar, M. Livny, Stork: Making data placement a first class citizen in the grid, in: Proceedings of the 24th International Conference on Distributed Computing Systems, ICDCS, 2004, pp. 342-349.
[6] D. Yuan, Y. Yang, X. Liu, J. Chen, A data placement strategy in cloud scientific workflows, Future Generation Computer Systems 26(8) (2010), pp. 1200-1214.
[7] T. Kosar, S. Son, G. Kola, M. Livny, Data placement in widely distributed environments, Advances in Parallel Computing 14 (2005), pp. 105-128.
[8] F.A. Omara, M.M. Arafa, Genetic algorithms for task scheduling problem, Journal of Parallel and Distributed Computing 70(1) (2010), pp. 13-22.
[9] Y. Yuan, X. Li, Q. Wang, X. Zhu, Deadline division-based heuristic for cost optimization in workflow scheduling, Information Sciences 179(15) (2009), pp. 2562-2575.
[10] X. Tang, K. Li, R. Li, B. Veeravalli, Reliability-aware scheduling strategy for heterogeneous distributed computing systems, Journal of Parallel and Distributed Computing 70(9) (2010), pp. 941-952.
[11] Z. Shi, J.J. Dongarra, Scheduling workflow applications on processors with different capabilities, Future Generation Computer Systems 22(6) (2006), pp. 665-675.
[12] Q. Kang, H. He, H. Song, R. Deng, Task allocation for maximizing reliability of distributed computing systems using honeybee mating optimization, Journal of Systems and Software 83(11) (2010), pp. 2165-2174.
[13] C. Chiu, Y. Yeh, J. Chou, A fast algorithm for reliability-oriented task assignment in a distributed system, Computer Communications 25(17) (2002), pp. 1622-1630.
[14] A. Chervenak, E. Deelman, M. Livny, M.-H. Su, R. Schuler, S. Bharathi, G. Mehta, K.
Vahi, Data placement for scientific applications in distributed environments, in: 8th Grid Computing Conference, 2007, pp. 267-274.
[15] G. Singh, K. Vahi, A. Ramakrishnan, G. Mehta, E. Deelman, H. Zhao, R. Sakellariou, K. Blackburn, D. Brown, S. Fairhurst, D. Meyers, G.B. Berriman, J. Good, D.S. Katz, Optimizing workflow data footprint, Scientific Programming 15 (2007), pp. 249-268.