CHAPTER 5 PARALLEL GENETIC ALGORITHM AND COUPLED APPLICATION USING COST OPTIMIZATION


5.1 INTRODUCTION

Cloud computing provides on-demand access to resources over the network. The main characteristics of the virtualization technologies employed in a cloud environment are the consolidation and proficient management of resources. The current work employs an optimized scheduling algorithm that concentrates on efficient utilization of resources for cloud scheduling problems. A Parallel Genetic Algorithm with the Dynamic Deme model is used for scheduling the resources dynamically. The investigation shows that the scheduling procedure improves both the utilization rate of the system resources and the pace of resource allotment. Users can access computing resources as general utilities, which can be acquired and released at any time. Easy access to cloud resources enables the simultaneous use of many clouds. The system analyzes the viability, from the viewpoint of scalability, performance and cost, of deploying large virtual cluster infrastructures distributed over different cloud providers for solving loosely coupled Many Task Computing (MTC) applications. The performance of different cluster configurations is evaluated using cluster throughput as the performance metric.

5.2 PARALLEL GENETIC ALGORITHM FOR RESOURCE SCHEDULING IN CLOUD

Resource scheduling is a crucial process in cloud offerings such as IaaS. The existing approach used a Parallel Genetic Algorithm (PGA) for resource allocation and utilization of system resources. The proposed model therefore addresses a novel approach, PGA with Dynamic Demes, which schedules resources in the cloud environment dynamically and efficiently. The most important advantage of PGAs is that in many cases they provide better performance than single-population algorithms, even when the parallelism is simulated on conventional machines. The reason is that multiple populations permit speciation, a process by which different populations evolve in different directions. For these reasons PGAs are not only an extension of the traditional sequential GA model; they represent a new class of algorithms that searches the space of solutions differently. This work focuses on analyzing the performance of the Dynamic Demes algorithm for efficient cloud resource scheduling. The investigation shows that the scheduling procedure improves the utilization rate of the system resources as well as the pace of resource allotment.

Architecture Diagram

Figure 5.1 shows the architecture of the proposed Parallel Genetic Algorithm (PGA), which is scalable to the large systems commonly found in clouds. The initial step begins with initiating the process using the simulation kit. From that, the input resources and instance requests are derived, which form the input to the PGA with the Dynamic Deme model. The next block consists of the PGA scheduler with the coarse-grained and Dynamic Deme models, which performs the optimization of resource allocation; the allocated resources are recorded in the allocation sequence block. The performance parameters of these allocated resources are gathered and sent to the performance analysis block, where the actual performance evaluation is carried out. The next block, the performance report block, is meant for logging the performance. After the performance report is produced, the allocation of resources is terminated, which is represented by the termination block. The blocks are: Initiate the Process using the Simulation Tool, Input Resource & Instance Request, PGA Scheduler with Coarse-Grained & Dynamic Deme Model, Allocation Sequence, Performance Analysis, Performance Report, and Termination.

Figure 5.1 Architecture Diagram of Parallel Genetic Algorithm

Methodology

The management of a cloud leaves providers with the difficult task of dynamically provisioning a large-scale system to meet customers' demands. Traditional optimization techniques cannot properly handle the scale of leading cloud environments. This research examines stochastic optimization strategies using a Parallel Genetic Algorithm, which are scalable to the large systems commonly found in clouds, to optimize the utilization of available servers and improve the timely service of customer requests. The basic idea behind most parallel programs is to divide a task into chunks and to solve the chunks simultaneously using multiple processors. This divide-and-conquer approach can be applied to GAs in many different ways, and the literature contains many examples of successful parallel implementations. Some parallelization methods use a single population, while others divide the population into several relatively isolated subpopulations. Some methods massively exploit parallel computer architectures, while others are better suited to multicomputers with fewer but more powerful processing elements. A novel attempt at implementing a Parallel Genetic Algorithm with the Dynamic Deme model has been adopted in the current work for scheduling the resources. This method principally aims at allocating the resources in a more competent way by utilizing the available resources in the cloud environment (IaaS). Allocation of a resource is based on the instance request provided by the user. The PGA Scheduler uses the Dynamic Deme model of the Parallel Genetic Algorithm for scheduling the resources. The project is implemented in the Java language with the help of the JCreator Integrated Development Environment (IDE).

Genetic Algorithm

Genetic algorithms are inspired by Darwin's theory of evolution. Genetic Algorithms (GAs) are efficient search methods based on the principles of natural selection and genetics. GAs are generally able to find good solutions in a reasonable amount of time, but as they are applied to harder and bigger problems, the time required to find adequate solutions increases. As a consequence, there have been multiple efforts to make GAs faster, and one of the most promising choices is to use parallel implementations.

Components of a Genetic Algorithm:
Encoding technique
Initialization procedure
Evaluation function
Selection of parents
Genetic operators

A typical algorithm consists of the following (a minimal sketch is given after this list):
A number of randomly chosen guesses of the solution to the problem - the Initial Population.
A means of calculating how good or bad each guess is within the population - a Population Fitness Function.
A method for mixing fragments of the better solutions to form new and, on average, even better solutions - Crossover.
An operator to avoid permanent loss of diversity within the solutions (and to introduce new diversity) - Mutation.
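The steps above can be illustrated with a minimal generational GA in Java. The chromosome encoding (an integer array mapping instance requests to VM ids) anticipates the representation used later in this chapter; the fitness scoring, operator rates and class names are illustrative assumptions rather than the thesis implementation.

```java
// Minimal generational GA loop for an allocation problem: a chromosome is an
// int array mapping each instance request (index) to a VM id (value).
// All names and the fitness/operator details are illustrative, not the thesis code.
import java.util.Random;

public class SimpleGA {
    static final Random RNG = new Random();

    // Fitness placeholder: higher is better (dummy scoring for illustration only).
    static double fitness(int[] chromosome) {
        double f = 0;
        for (int vmId : chromosome) f += 1.0 / (1 + vmId);
        return f;
    }

    static int[] randomChromosome(int requests, int vmCount) {
        int[] c = new int[requests];
        for (int i = 0; i < requests; i++) c[i] = RNG.nextInt(vmCount);
        return c;
    }

    // Tournament selection of size 2.
    static int[] select(int[][] pop) {
        int[] a = pop[RNG.nextInt(pop.length)], b = pop[RNG.nextInt(pop.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    // Single-point crossover, as used later in the chapter.
    static int[] crossover(int[] p1, int[] p2) {
        int point = RNG.nextInt(p1.length);
        int[] child = p1.clone();
        for (int i = point; i < p1.length; i++) child[i] = p2[i];
        return child;
    }

    static void mutate(int[] c, int vmCount, double rate) {
        for (int i = 0; i < c.length; i++)
            if (RNG.nextDouble() < rate) c[i] = RNG.nextInt(vmCount);
    }

    public static void main(String[] args) {
        int popSize = 20, requests = 8, vmCount = 5, generations = 50;
        int[][] pop = new int[popSize][];
        for (int i = 0; i < popSize; i++) pop[i] = randomChromosome(requests, vmCount);

        for (int g = 0; g < generations; g++) {
            int[][] next = new int[popSize][];
            for (int i = 0; i < popSize; i++) {
                int[] child = crossover(select(pop), select(pop));
                mutate(child, vmCount, 0.05);
                next[i] = child;
            }
            pop = next;
        }
    }
}
```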

Parallel Genetic Algorithm

For some kinds of problems, the population needs to be very large and the memory required to store each individual may be considerable. In some cases this makes it impossible to run an application efficiently on a single machine, so some parallel form of GA is necessary. Fitness evaluation is usually time-consuming and the only practical way to provide the required CPU power is to use parallel processing. The most important advantage of Parallel Genetic Algorithms (PGAs) is that in many cases they provide better performance than single-population algorithms, even when the parallelism is simulated on conventional machines. The reason is that multiple populations permit speciation, a process by which different populations evolve in different directions. For these reasons parallel GAs are not only an extension of the traditional sequential GA model; they represent a new class of algorithms that searches the space of solutions differently.

Master-Slave Parallelisation: Master-slave parallelisation, also known as global parallelisation or distributed fitness evaluation, is one of the first successful applications of parallel GAs. The algorithm uses a single population, and the evaluation of individuals and the application of genetic operators are performed in parallel. Selection and mating are done globally, hence each individual may compete and mate with any other individual. The operation most commonly parallelised is the evaluation of the fitness function, because it normally requires only knowledge of the individual being evaluated (not the whole population), so there is no need to communicate during this phase. This is usually implemented using master-slave programs, where the master stores the population and the slaves evaluate the fitness, apply mutation, and sometimes exchange bits of the genome (as part of crossover). Parallelisation of fitness evaluation is done by assigning a fraction of the population to each of the available processors (in the ideal case, one individual per processing element). Communication occurs only as each slave receives the individual (or subset of individuals) to evaluate and when the slaves return the fitness values, sometimes after mutation has been applied with the given probability. The algorithm is said to be synchronous if the master stops and waits to receive the fitness values for the whole population before proceeding with the next generation. A synchronous master-slave GA has exactly the same properties as a simple GA, except for its speed; that is, this form of parallel GA carries out exactly the same search as a simple GA. An asynchronous version of the master-slave GA is also possible. In this case, the algorithm does not stop to wait for any slow processors. For this reason the asynchronous master-slave PGA does not work exactly like a simple GA, but is more similar to a parallel steady-state GA. The difference lies only in the selection operator: in an asynchronous master-slave algorithm, selection waits until a fraction of the population has been processed, while in a steady-state GA selection does not wait but operates on the existing population. A synchronous master-slave PGA is relatively easy to implement, and a significant speedup can be expected if the communication cost does not dominate the computation cost.
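A synchronous master-slave evaluation step can be sketched with a Java thread pool: a master submits each individual to slave workers and waits for every fitness value before selection proceeds. The class below is a hedged illustration of that model, reusing the hypothetical integer-array chromosome and fitness placeholder from the previous sketch; it is not the thesis code.

```java
// Synchronous master-slave fitness evaluation: the "master" thread submits each
// individual to a pool of "slave" workers and blocks until every fitness value
// is returned, mirroring the distributed-fitness-evaluation model described above.
// Hypothetical sketch; names and structure are not taken from the thesis code.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MasterSlaveEvaluation {

    // Placeholder fitness function (stands in for Equation (5.1) later in the chapter).
    static double fitness(int[] chromosome) {
        double f = 0;
        for (int vmId : chromosome) f += 1.0 / (1 + vmId);
        return f;
    }

    static double[] evaluatePopulation(int[][] population, int slaveCount)
            throws InterruptedException, ExecutionException {
        ExecutorService slaves = Executors.newFixedThreadPool(slaveCount);
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (int[] individual : population) {
                // Each slave evaluates one individual; no communication is needed
                // during evaluation because fitness depends only on that individual.
                Callable<Double> job = () -> fitness(individual);
                pending.add(slaves.submit(job));
            }
            double[] result = new double[population.length];
            for (int i = 0; i < result.length; i++) {
                // Synchronous behaviour: the master waits for every slave, so the
                // next generation cannot start until the slowest evaluation finishes.
                result[i] = pending.get(i).get();
            }
            return result;
        } finally {
            slaves.shutdown();
        }
    }
}
```

Handing results back as they complete (for example via a completion service) instead of blocking on each Future in order would turn this into the asynchronous variant discussed above.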

Drawback: There is, however, a classical bottleneck effect: the whole process has to wait for the slowest processor to finish its fitness evaluations before the selection operator can be applied. The asynchronous master-slave PGA overcomes this, but as stated before, it changes the GA dynamics significantly, and as a result it is difficult to analyse.

Subpopulations with Migration: The important characteristics of the class of static-subpopulation parallel GAs with migration are the use of multiple demes and the presence of a migration operator. Multiple-deme GAs are the most popular parallelization method, and many concepts have been proposed describing details of their implementation. These algorithms are usually referred to as subpopulations with migration, static subpopulations, multiple-deme GAs, coarse-grained GAs, or even just parallel GAs. This parallelisation method requires the division of a population into a number of demes (subpopulations). Demes are separated from one another (geographic isolation), and individuals compete only within a deme. An additional operator called migration is introduced: from time to time, some individuals are moved (copied) from one deme to another. If individuals can migrate to any other deme, the model is called an island model. If individuals can migrate only to neighbouring demes, it is termed a stepping-stone model. Other migration models are also possible. The migration of individuals from one deme to another follows a topology that defines the connections between the subpopulations. Commonly used topologies include the hypercube, two-dimensional and three-dimensional meshes, the torus, and so on. The migration rate controls how many individuals migrate in a migration scheme; the scheme also controls which individuals from the source deme (best, worst, random) migrate to another deme and which individuals are replaced (worst, random, etc.). A migration interval determines the frequency of migrations. Coarse-grained algorithms are a general term for a subpopulation model with a relatively small number of demes, each with many individuals. These models are characterised by the relatively long time they require for processing a generation within each (sequential) deme and by their occasional communication for exchanging individuals. Coarse-grained parallel GAs are sometimes known as distributed GAs, because they are usually implemented on distributed-memory Multiple Instruction Multiple Data (MIMD) computers. This approach is also well suited to heterogeneous networks. Fine-grained algorithms function in the opposite way: they require a large number of processors, because the population is divided into a large number of small demes. Inter-deme communication is realised either by using a migration operator or by using overlapping demes. Recently, the term fine-grained GAs was redefined and is now used to indicate massively parallel GAs. A sketch of a migration step is given below.
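The migration operator can be illustrated with a simple ring (stepping-stone) topology: at each migration interval, every deme copies its best individuals to its neighbour, replacing the neighbour's worst individuals. The topology, migration rate and replacement policy below are illustrative choices, not taken from the thesis.

```java
// Migration step for a subpopulation (multiple-deme) GA: each deme copies its
// best individuals to the next deme in a ring and overwrites that deme's worst.
import java.util.Arrays;
import java.util.Comparator;

public class IslandMigration {

    // Placeholder fitness (same hypothetical scoring as the earlier sketches).
    static double fitness(int[] chromosome) {
        double f = 0;
        for (int vmId : chromosome) f += 1.0 / (1 + vmId);
        return f;
    }

    /**
     * demes[d] is the population of deme d. Copies the `rate` best individuals
     * of each deme into the next deme in a ring, overwriting its worst individuals.
     */
    static void migrate(int[][][] demes, int rate) {
        int n = demes.length;
        for (int d = 0; d < n; d++) {
            int target = (d + 1) % n;   // ring (stepping-stone) topology
            // Sort the source descending and the target ascending by fitness.
            Arrays.sort(demes[d], Comparator.<int[]>comparingDouble(IslandMigration::fitness).reversed());
            Arrays.sort(demes[target], Comparator.<int[]>comparingDouble(IslandMigration::fitness));
            for (int k = 0; k < rate; k++) {
                // Best of the source replaces the worst of the target.
                demes[target][k] = demes[d][k].clone();
            }
        }
    }
}
```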

Constraint: The multiple-deme model presents one problem: scalability. With only a few machines, it is efficient to use a coarse-grained model. However, with hundreds of machines available at a time, it is difficult to scale up the size and number of subpopulations to use the hardware platform efficiently. Despite this problem, the multiple-deme model is very popular. From the implementation point of view, multiple-deme GAs are simple extensions of the serial GA: it is enough to take a few conventional (serial) GAs, run each of them on a node of a parallel computer, and apply migration at some predetermined times.

Dynamic Demes: Dynamic Demes is a new parallelization method for GAs which allows the combination of global parallelism with a coarse-grained GA. In this model there is no migration operator as such, because the whole population is treated during evolution as a single collection of individuals, and information between individuals is exchanged via a dynamic reorganization of the demes during the processing cycles. From the parallel processing point of view, the Dynamic Demes approach fits the MIMD category (Flynn classification) as an asynchronous multiple master-slave algorithm. The main idea behind this approach is to cut down the waiting time for the last (slowest) individuals to arrive in the master-slave model by dynamically splitting the population into demes, which can then be processed without delay. This is efficient in terms of processing speed. In addition, the algorithm is fully scalable: starting from global parallelism with distributed fitness processing, one can scale up the algorithm to a fine-grained version with few individuals within each deme and a large number of demes. The algorithm can be used on shared- and distributed-memory parallel machines. Its scalability can prove vital in systems with a few Processing Elements, in massively parallel systems with a large number of Processing Elements, and everything in between. Dynamic Demes (DD) is a scalable and easily implemented method of GA parallelization.

Advantages: The main advantages of Dynamic Demes are:
High scalability and flexibility (DD can be used to implement a broad range of algorithms, from coarse-grained to highly fine-grained models)
Fault tolerance (some of the processors can crash, but the algorithm will correctly continue its operation)
Dynamic load balancing
Easy monitoring

Algorithm Description: Each individual is represented by a separate process, called a slave, which is capable of performing the following:
i) Fitness evaluation
ii) Applying mutation to itself (with a predefined mutation rate)
iii) Performing crossover with another individual (this is done by passing to each individual the process ID of another individual with which it should perform crossover)
All the individuals run concurrently; the ideal case is when a single processing element processes a single individual. There are additional processes, called masters, which are responsible for selection and mating. Masters handle a fixed fraction of the population and apply selection and mating on it. Therefore, each master represents a separate deme.

However, unlike other PGAs, in Dynamic Demes (DD) the individuals belonging to each deme change dynamically. The number of masters is a parameter of the algorithm; if there is only one master, DD functions as a classic distributed fitness evaluation algorithm. Each master process performs selection and mating concurrently with the other masters. Mating requires sending the appropriate slave ID to the individuals chosen for crossing over. When the slaves receive a partner ID they perform crossover and then proceed with fitness evaluation and mutation. In addition to masters and slaves, there is also a process (possibly more than one) responsible for load balancing, called the counter. After crossover, fitness evaluation and mutation, each individual is dynamically assigned to a deme (possibly different from the one it belonged to previously). This happens when the individual notifies the counter process. The counter process knows which master processes are currently idle, waiting for their subpopulation to be filled, and it sends the individual the process ID of one such master. The last process within the system is called the sorter. This process is informed by all of the individuals finishing their evaluation; it takes their genotype and fitness and saves them in appropriate log files. The sorter process is also responsible for stopping the search when a termination criterion is met.

Modules

The flow of the work consists of three different modules, which comprise resource specification, followed by the execution of the PGA, resulting in the allocation sequence. The virtual machine request is provided as the input parameter to the system, along with the number of iterations to be carried out for the execution of the genetic algorithm. Based on the virtual machine specification, cloudlets are created. The scheduling of the resource is carried out using the parallel genetic algorithm, and the final allocation sequence of the resources to the instance requests is obtained when the simulation ends.

Modules Description

i) Creation of Cloud Environment: The initial process creates the cloud environment for the execution of the algorithm with the help of the simulation tool.
ii) Resource Listing: The available resource list is updated when allocation or de-allocation of resources takes place. The requests from the clients are collected and updated in the VM request list whenever a new VM request arrives. Each request is identified by a separate VM id. The instance request is provided in terms of Cloudlet ids; the Cloudlet id is defined in the system based on the virtual machine request.
iii) Sequencing: A Parallel Genetic Algorithm is implemented using the Dynamic Deme model to calculate the fitness and to find the optimal allocation sequence among the available pool of resources. The genetic operations are performed using threads. A thread called the Slave performs the fitness evaluation, mutation and crossover operations. Another thread, called the Master, performs the selection operation and the mating process. A third thread performs the load balancing operation, and the last thread handles the termination process. Based on the instance request, the PGA finds the optimal resource from the available resources.
iv) Allocation: This module focuses on launching the optimal resource provided by the PGA for the corresponding instance requests. The VM id that is most optimal for the instance request is assigned; likewise, all VM ids are assigned to the corresponding instance requests based on the expected resource.

Figure 5.2 Modules of the Simulator

Procedure for PGA Implementation

The proposed concept is implemented as a simulation using the simulation tool CloudSim. In the simulation, a PGA-based simulator acts as a scheduler for the cloud. The goal of the scheduler is to find the allocation sequence for each computing node in the cloud, so that the instances run on the proper physical computers. The automated scheduling model is divided into three steps. First, the scheduler updates the available resource list whenever allocation or de-allocation happens, and updates the VM request list each time new VM requests arrive. Then, the scheduler uses the PGA to find a fit and economical allocation. Finally, the cloud launches the corresponding VMs on the physical resources for the VM requests. A sketch of this three-step loop is given below.
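The three-step loop can be outlined as follows. The sketch keeps the structure only; the types, method names and the PGA stub are hypothetical placeholders (they are not CloudSim classes), assuming a scheduler that maintains a resource list and a VM request list.

```java
// Skeleton of the three-step scheduling loop described above: update the resource
// and request lists, run the PGA to obtain an allocation sequence, then launch VMs.
// All types and method names are hypothetical placeholders, not CloudSim APIs
// or the thesis implementation.
import java.util.ArrayList;
import java.util.List;

public class PgaScheduler {

    static class VmRequest { int cloudletId; }
    static class Resource  { int vmId; boolean free = true; }

    private final List<Resource> availableResources = new ArrayList<>();
    private final List<VmRequest> vmRequestList = new ArrayList<>();

    // Step 1: keep the resource and request lists current.
    void onResourceEvent(Resource r, boolean allocated) {
        r.free = !allocated;
    }
    void onNewRequest(VmRequest req) {
        vmRequestList.add(req);
    }

    // Step 2: run the PGA (e.g., the Dynamic Deme variant) to get an allocation
    // sequence: allocation[i] is the VM id assigned to the i-th instance request.
    int[] runPga(int iterations) {
        // Placeholder: a real implementation evolves chromosomes as described in Section 5.2.
        return new int[vmRequestList.size()];
    }

    // Step 3: launch the corresponding VM for each request.
    void launch(int[] allocation) {
        for (int i = 0; i < allocation.length; i++) {
            System.out.println("Request " + i + " -> VM " + allocation[i]);
        }
    }

    void scheduleOnce(int iterations) {
        int[] allocation = runPga(iterations);
        launch(allocation);
    }
}
```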

To run a GA, the two most important factors are the chromosome representation and the fitness function evaluation.

Chromosome representation: Integer notation is used to represent the computing resources. The chromosome pattern maps the instance requests (IRs) to VMs; for example, (7 2 1) means the first request is assigned to VM id 7, the second to VM id 2, and so on.

Fitness Function Evaluation: The fitness function provides the mechanism for evaluating each chromosome in the problem domain. It is calculated as a summation and is used to select the best resource series. Upon selecting the best resource series, the basic genetic operations are performed.

$F = \sum_{j=1}^{m} \sum_{i=1}^{n} C_{ij} X_{ij}$, where $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$    (5.1)

where $F$ is the total fitness of an allocation scheme, $m$ represents the number of nodes and $n$ represents the number of instance requests. The value of $X_{ij}$ is either 0 (if the i-th IR is not assigned to the j-th node) or 1 (if the i-th IR is assigned to the j-th node).

$C_{ij} = \sum_{k=1}^{3} P_{k}$    (5.2)

where $P_{k} = a$ if $VM_{k} / node_{k} = 1$, $P_{k} = b$ if $VM_{k} / node_{k} < 1$, and $P_{k} = c$ if $VM_{k} / node_{k} > 1$. Here $k$ is a label: when $k$ equals 1 it represents the CPU, when $k$ equals 2 it represents the memory capacity, and when $k$ equals 3 it represents the disk capacity. $C_{ij}$ is the fitness of assigning the i-th IR to the j-th node, and $F$ is the total fitness of an allocation scheme.

Finally, the calculated fitness value is added to a large constant to obtain a positive value.

Genetic Operations: The basic operations include replication, crossover and mutation. Usually single-point crossover is performed. The above procedure is repeated concurrently until the optimal solution is obtained.

Steps for Dynamic Deme Algorithm Execution:
1. Input: initial population of individuals
2. Evaluate the fitness of all individuals
3. While the termination condition is not met do {
4.   The slave process performs i. fitness evaluation, ii. mutation, iii. crossover
5.   The master process performs i. selection, ii. mating
6.   The counter process performs the dynamic re-organization of the population
7.   The sorter process copies the fitness values and is also responsible for terminating the search
   } End while
where the master and slave processes run concurrently. A sketch of the fitness computation of Equations (5.1) and (5.2) is given below.
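A direct reading of Equations (5.1) and (5.2) can be coded as below. The chromosome fixes $X_{ij}$ implicitly (each request i is assigned to exactly one node j), and $P_k$ compares the requested and offered CPU, memory and disk capacities. The score constants a, b and c and the array layout are assumed for illustration; the thesis does not specify their values here.

```java
// Illustrative computation of the fitness in Equations (5.1) and (5.2): the
// chromosome maps each instance request i to a node j, so X_ij is implicit,
// and C_ij sums a score P_k over CPU (k=1), memory (k=2) and disk (k=3)
// depending on whether the requested capacity equals, is below, or exceeds
// the node's capacity. The constants and capacity arrays are assumed values.
public class AllocationFitness {

    // Score constants: reward an exact fit highest (assumed values).
    static final double A = 3.0;  // VM_k / node_k == 1
    static final double B = 2.0;  // VM_k / node_k  < 1
    static final double C = 1.0;  // VM_k / node_k  > 1

    /**
     * requested[i][k] = capacity of dimension k (0: CPU, 1: memory, 2: disk)
     *                   asked for by instance request i
     * offered[j][k]   = capacity of dimension k provided by node j
     * allocation[i]   = node j assigned to instance request i (the chromosome)
     */
    static double totalFitness(double[][] requested, double[][] offered, int[] allocation) {
        double f = 0;
        for (int i = 0; i < allocation.length; i++) {
            int j = allocation[i];                        // X_ij = 1 only for this j
            f += pairFitness(requested[i], offered[j]);   // C_ij
        }
        return f;                                         // F in Equation (5.1)
    }

    // C_ij = sum over the three capacity dimensions of P_k (Equation 5.2).
    static double pairFitness(double[] vm, double[] node) {
        double c = 0;
        for (int k = 0; k < 3; k++) {
            double ratio = vm[k] / node[k];
            if (ratio == 1.0)      c += A;
            else if (ratio < 1.0)  c += B;
            else                   c += C;
        }
        return c;
    }
}
```

With the chromosome (7 2 1) from the text, totalFitness sums pairFitness for request 0 on node 7, request 1 on node 2 and request 2 on node 1.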

Performance Evaluation

The performance of the PGA with the DD model was compared with that of the existing PGA algorithm using the evaluation time for resource allocation, measured over various resource requests, as the performance metric. Figure 5.3 below shows the cloud environment created for the proposed Dynamic Deme model using CloudSim, and Figure 5.4 shows the simulation of the DD algorithm for resource allocation using the PGA.

Figure 5.3 Cloud Environment for Dynamic Deme

Figure 5.4 Simulation of the Dynamic Deme Algorithm

Table 5.1 below presents the evaluation time required for the number of tests conducted for resource allocation using the existing PGA method and the proposed PGA with DD model. The results show that the evaluation time consumed for resource allocation using PGA with DD is better than that of the existing approach.

Table 5.1 Performance Evaluation for the Resource Allocation in PGA with DD
No. of Tests | PGA Evaluation Time (ms) | No. of Tests | PGA with DD Evaluation Time (ms)

Based on the obtained values of evaluation time for the number of tests conducted, the graph has been plotted as shown in Figure 5.5, with the number of tests on the X-axis and the evaluation time on the Y-axis. The performance curve clearly shows that the proposed method consumes much less time to allocate the resources in the cloud environment.

Figure 5.5 Evaluation Time for Number of Tests

5.3 COUPLED APPLICATIONS USING COST OPTIMIZATION

Cloud computing technologies can offer important benefits for IT organizations and data centers running MTC applications. The challenges and viability of deploying computing clusters for loosely coupled MTC applications are analyzed in the earlier system with the help of three different cloud networks: private, public and hybrid. The system analyzes the performance of different cluster configurations using the cluster throughput as the performance metric. The multi-cloud deployment involves several challenges. A performance and cost analysis is carried out for different configurations of a real implementation of a multi-cloud cluster infrastructure running a real workload. However, due to hardware limitations in the local infrastructure and the high cost of renting many cloud resources for long periods, the tested cluster configurations are limited to a reduced number of computing resources (up to 16 worker nodes in the cluster) running a reduced number of tasks (up to 128 tasks).

The upfront challenges are the constraints of the cloud interface standard, the distribution and management of the service master and images, and the interconnection of links between the service components. The clusters are deployed in a hybrid setup, which combines local physical nodes with virtual nodes deployed in another compute cloud. Comparing the different cluster configurations and proving the viability of the multi-cloud solution shows that it is cost effective.

Architecture Diagram

The hybrid cluster architecture routes the submitted jobs for processing through a clustering stage onto the private, public, or hybrid cluster.

Figure 5.6 Hybrid Cluster Architecture Diagram

A new approach for hybrid clusters called the Path Clustering Heuristic (PCH) algorithm is used for the initial schedule, to overcome the above-stated problem and to achieve cost optimization. Hybrid cloud systems are a novel research challenge that comes with the merging of private and public clouds. In this method, the different cluster configurations are considered with the PCH algorithm dynamically, and the cluster nodes can be provisioned with resources from different clouds to improve the cost effectiveness of the deployment or to implement high-availability strategies. A sketch of the PCH scheduling idea is given below.
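The general idea behind PCH-style scheduling in a hybrid cluster can be sketched as follows: compute a priority for each workflow task, cluster tasks along the highest-priority path onto a single resource so that intra-cluster communication costs vanish, and place clusters on private resources first, renting public resources only when the deadline would otherwise be missed. This is a hedged sketch based on that description; the priority formula, data structures and cost handling are assumptions, not the exact algorithm used in the thesis.

```java
// Hedged sketch of the scheduling idea described for PCH in a hybrid cloud:
// tasks on a workflow path are clustered onto one resource, the initial schedule
// uses only private resources, and public (paid) resources are considered only
// when the estimated finish time misses the deadline. Illustrative only.
import java.util.*;

public class PchHybridSketch {

    static class Task {
        int id;
        double computation;                              // estimated computation time
        Map<Task, Double> successors = new HashMap<>();  // successor -> communication time
        double priority;                                 // computation + max(comm + successor priority)
    }

    // Compute PCH-style priorities bottom-up (assumes tasks are topologically ordered).
    static void computePriorities(List<Task> topoOrder) {
        for (int i = topoOrder.size() - 1; i >= 0; i--) {
            Task t = topoOrder.get(i);
            double best = 0;
            for (Map.Entry<Task, Double> e : t.successors.entrySet())
                best = Math.max(best, e.getValue() + e.getKey().priority);
            t.priority = t.computation + best;
        }
    }

    // Cluster the unscheduled tasks along the highest-priority path; communication
    // inside a cluster is assumed to cost nothing because it stays on one resource.
    static List<Task> nextPathCluster(List<Task> topoOrder, Set<Task> scheduled) {
        List<Task> cluster = new ArrayList<>();
        Task current = topoOrder.stream()
                .filter(t -> !scheduled.contains(t))
                .max(Comparator.comparingDouble((Task t) -> t.priority)).orElse(null);
        while (current != null) {
            cluster.add(current);
            scheduled.add(current);
            Task c = current;
            current = c.successors.keySet().stream()
                    .filter(s -> !scheduled.contains(s))
                    .max(Comparator.comparingDouble((Task s) -> s.priority)).orElse(null);
        }
        return cluster;
    }

    // Initial schedule on the private cloud; fall back to a public resource
    // (which adds monetary cost) only if the deadline cannot be met.
    static String placeCluster(double clusterLength, double deadline, double privateLoad) {
        double privateFinish = privateLoad + clusterLength;
        return privateFinish <= deadline ? "private" : "public";
    }
}
```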

Methodology

The implementation of the PCH algorithm consists of different modules, which are as follows.

Modules

The modules are:
i) Creation of Cloud Environment
ii) Implementation of the Scheduling Process in the Private Cloud
iii) Implementation of the Scheduling Process in the Public Cloud
iv) Implementation of the Scheduling Process in the Hybrid Cloud using the PCH algorithm

Modules Description

The descriptions of the above modules are as follows:

i) Creation of Cloud Environment: Creating a cloud network model for the simulation of the new approach is the initial step. The cloud computing paradigm is widely used for the execution of many types of applications, including ones with data dependencies, which can be represented by workflows. To execute such workflow applications in a hybrid cloud, the scheduling algorithm must take both cost and execution time into consideration; cost and execution time play an important role in the cloud environment.

High availability and fault tolerance: The cluster worker nodes can be spread over different cloud sites. In the case of cloud downtime or failure, the cluster operation will not be disrupted. Furthermore, in this situation it is admissible to dynamically deploy new cluster nodes in a different cloud to avoid degradation of the cluster performance.

Infrastructure cost reduction: Different cloud providers can follow different pricing strategies, and even variable pricing models (based on the level of demand for a particular resource type, daytime versus night-time, weekdays versus weekends, spot prices, and so forth); the different cluster nodes can dynamically change their locations from one cloud provider to another in order to reduce the overall infrastructure cost. A flexible and generic cluster architecture that combines the use of virtual machines and cloud computing delivers dynamically in heterogeneous computational environments. Moreover, the introduction of a new virtualization layer between the computational environments and the physical infrastructure makes it possible to adjust the capacity allocated to each environment and to supplement it with resources from an external cloud provider.

ii) Implementation of the Scheduling Process in the Private Cloud: The scheduling process for the private cloud network model is implemented. A private cloud is infrastructure operated solely for a single organization, whether managed internally or by a third party, and hosted internally or externally. Here, resources can be accessed and used by individuals inside an organization, similar to data farms or private grids.

It also tries to balance the use of private resources with the ones available from the public cloud. A private cloud (also called an internal cloud or corporate cloud) is a marketing term for a proprietary computing architecture that provides hosted services to a limited number of people behind a firewall. The Condor job queue is monitored in real time, and virtual machines that belong to individual Virtual Organizations are provisioned and booted. Jobs belonging to each Virtual Organization are then run on the organization-specific virtual machines, which form a cluster dedicated to that organization. Once the queued jobs have been executed, the virtual machines are terminated, thereby allowing the physical resources to be reclaimed. Tests of this system, conducted using synthetic workloads, demonstrate that dynamic provisioning of virtual machines preserves system throughput for all but the shortest-running grid jobs without undue increase in scheduling latency; however, the deployment requires root privileges on remote resources, which makes dynamic deployment on those sites difficult.

iii) Implementation of the Scheduling Process in the Public Cloud: The implementation of the scheduling process for the public cloud network model is considered in this module. A public cloud is one based on the standard cloud computing model, in which a service provider makes resources, such as applications and storage, available to the general public over the internet. Public cloud services may be free or offered on a pay-per-usage model.

The public cloud describes cloud computing in the traditional mainstream sense, whereby resources are dynamically provisioned to the general public on a fine-grained, self-service basis over the internet, via web applications/web services, or from an off-site third-party provider who bills on a fine-grained utility computing basis. To users and applications, the process of borrowing nodes is transparent. A VM running as part of a VioCluster is practically indistinguishable from a physical machine running inside the same domain. Dynamic machine trading is activated between mutually isolated virtual domains. VioCluster creates software-based network components, which seamlessly connect physical and virtual machines to create isolated virtual domains. Machines can be traded dynamically through the on-demand creation, deletion, and configuration of VMs and network components.

Dynamic negotiation of machine trades: Each virtual domain includes a machine broker which interacts with other domains. Requests and offers are made through these brokers based on workload and configurable lending and borrowing policies. A prototype of the VioCluster system has demonstrated its effectiveness using two independent Portable Batch System (PBS) based job-execution clusters. The performance evaluation results show benefits to both clusters by increasing their resource utilization and decreasing their job execution times.

Physical Domain: An autonomous set of networked computers managed as a unit. Physical domains have a single administrator and support a user base performing specific computational activities.

For example, a physical domain belonging to a biology department may be optimally configured for cellular simulations, while a physical domain belonging to a network research group may be designed for shorter, network-intensive experiments.

Virtual Domain: An autonomous set composed of virtual and physical machines, managed as a unit. Machines in a virtual domain are connected through a virtual private network, to which both virtual and physical machines have access. Virtual domains are able to grow and shrink on demand, and to the administrator they appear identical to physical domains. A one-to-one mapping exists between physical and virtual domains; every virtual domain is hosted upon a physical domain.

Machine Broker: A software agent that represents a virtual domain when negotiating trade agreements with other virtual domains. A machine broker consists of a borrowing policy, which determines under which circumstances it will attempt to obtain more machines, and a lending policy, which governs when it is willing to let another virtual domain make use of machines within its physical domain. Both policies are defined by the domain's administrator.

iv) Implementation of the Scheduling Process in the Hybrid Cloud using the PCH Algorithm: A new approach for hybrid clusters, the Path Clustering Heuristic (PCH) algorithm, is introduced for the initial schedule. Hybrid cloud systems are a novel research challenge that comes with the merging of private and public clouds. The algorithm first checks whether the private resources already satisfy the deadline.

Deploying a hybrid cloud offers support for automatic service installation on the resources, which are dynamically provided by the grid or by the cloud, to execute the PCH algorithm. In the PCH algorithm, all the information necessary to compute these attributes is given by the programming model or by the infrastructure. A new cluster management architecture for shared mixed-use clusters is followed. The key feature of Cluster-on-Demand (COD) is its support for configurable dynamic virtual clusters, which associate variable shares of cluster resources with application service environments, e.g., batch schedulers and other grid services. The COD site manager assigns nodes to virtual clusters according to demand and site policies, based on dynamic negotiation with a pluggable service manager for each dynamic virtual cluster. Experimental results with the COD prototype and a service manager for the Sun Grid Engine (SGE) batch service demonstrate the potential of dynamic virtual clusters and resource negotiation as a basis for dynamic provisioning and other advanced resource management operations for future grid systems. The results prove that the key needs of grid resource management can be met directly by generic site management features which are independent of any specific application or middleware environment. A Well-Known Address (WKA) based membership discovery and management scheme can be used in environments where multicasting is not possible.

There are one or more members which are assigned well-known IP addresses, and all other members are aware of these well-known members. At least one well-known member should be started before any other member. It is also possible to assign a well-known address to the member which starts up first; an elastic IP address can be assigned to this first member. When other members boot up and try to contact one or more well-known members, they send a JOIN message. The well-known member adds the new member to its membership list, notifies all other members about the newly joined member by sending a MEMBERJOINED message to the group, and sends a MEMBERLIST message to the newly joined member. Now all group members are aware of the new member, and the new member learns about its group membership. Auto-scaling Axis2 Web service applications on Amazon EC2 is a very appealing idea from a business point of view: such an approach makes efficient use of resources in a cloud computing environment and achieves an optimal balance between performance, cost, and availability and scalability guarantees. A virtual homogeneous system is assumed, composed of an unbounded number of the best available processors connected by links with the highest available bandwidth. Each task is scheduled on a different processor of the virtual system, and then the algorithm computes the initial attribute values of each node. The decision is based on performance, cost, and the number of services to be scheduled in the hybrid cloud using the PCH algorithm.

Performance Evaluation

The new approach is evaluated against the earlier approaches to identify the utilization of resources. The system analyzes and compares the performance offered by different configurations of the computing cluster, and the comparison was performed by evaluating parameter metrics such as viability from the point of view of scalability, execution time, performance and cost. Based on the comparison and results, it is clear that the proposed approach works better than the earlier systems. Figure 5.7 below represents the creation of the hybrid cloud environment, which consists of Cloudlet and VM creation for performance evaluation.

Figure 5.7 Cloudlet and Virtual Machine Creation

Figure 5.8 gives the simulation results of the cost optimization technique in Hybrid PCH.

Figure 5.8 Simulation Result of the Cost Optimizing Technique

The performance evaluations of the various metrics stated above are given below, comparing Hybrid with PCH and the Hybrid Cloud. Table 5.2 below gives the cost for the number of tasks performed against the utilization of resources in Hybrid with PCH and the Hybrid Cloud. The cost for the proposed work is lower than that of the existing system in the cloud environment.

Table 5.2 Cost Optimization for the Number of Tasks
No. of Tasks | Hybrid with PCH Cost | No. of Tasks | Hybrid Cloud Cost

Figure 5.9 illustrates the graphical representation of cost against the number of tasks executed, for the utilization of resources in both Hybrid with PCH and the Hybrid Cloud; the X-axis denotes the number of tasks and the Y-axis denotes the cost. The cost of the proposed system is comparatively lower, as shown in the graph.

Figure 5.9 Cost Optimization for Different Tasks

Table 5.3 below gives the throughput for the utilization of resources for the number of tasks in both Hybrid with PCH and the Hybrid Cloud. Here, the throughput of the proposed system is comparatively higher than that of the existing system.

Table 5.3 Throughput Obtained for Various Tasks
No. of Tasks | Hybrid with PCH Throughput | No. of Tasks | Hybrid Cloud Throughput

Figure 5.10 represents the throughput for the number of tasks in both Hybrid with PCH and the Hybrid Cloud in the cloud environment. An increase in throughput leads to an improvement in the performance of the system. The graphical representation of the throughput is depicted with the number of tasks on the X-axis and the throughput on the Y-axis.

Figure 5.10 Throughput Obtained for Different Tasks

The scalability values of the systems for the utilization of resources are shown in Table 5.4 for the number of tasks. Here, the efficiency of the system increases with scalability.

Table 5.4 Scalability Obtained for Various Tasks
No. of Tasks | Hybrid with PCH Scalability | No. of Tasks | Hybrid Cloud Scalability

The scalability for the number of tasks obtained for both Hybrid with PCH and the Hybrid Cloud in the cloud environment is shown in Figure 5.11, with the X-axis representing the number of tasks and the Y-axis representing the scalability.

Figure 5.11 Scalability of the System for Different Tasks

The resource utilization values of both Hybrid with PCH and the Hybrid Cloud are shown in Table 5.5, representing the utility usage for the number of tasks.

Table 5.5 Resource Utilization for the Number of Tasks
No. of Tasks | Hybrid with PCH Utility | No. of Tasks | Hybrid Cloud Utility

Figure 5.12 shows the utilization of system resources for the number of tasks executed in both Hybrid with PCH and the Hybrid Cloud. From the graphical representation it is clear that the utility usage of the proposed system is lower than that of the existing system; the X-axis denotes the number of tasks and the Y-axis denotes the utility usage.

Figure 5.12 Utility of the System for Different Tasks

5.4 SUMMARY

The main characteristic of the virtualization technologies applied in the cloud environment is the consolidation of resources, which leads to efficient resource management. Here, two optimized scheduling methods are addressed. The first proposed method focuses on efficient utilization of resources by using the Parallel Genetic Algorithm with the Dynamic Deme model. This method investigates the scheduling procedure to improve the utilization rate of the system resources; as a result, the allotment and release of resources are done efficiently. The second method analyzes the viability, from the viewpoint of scalability, performance and cost, of deploying large virtual infrastructures distributed over different cloud providers for solving loosely coupled MTC applications. The performance of different cluster configurations is evaluated using the performance metrics (cost optimization, throughput, scalability and utility). Based on the evaluations, the proposed method performs resource scheduling very effectively.


More information

Overview. SMD149 - Operating Systems - Multiprocessing. Multiprocessing architecture. Introduction SISD. Flynn s taxonomy

Overview. SMD149 - Operating Systems - Multiprocessing. Multiprocessing architecture. Introduction SISD. Flynn s taxonomy Overview SMD149 - Operating Systems - Multiprocessing Roland Parviainen Multiprocessor systems Multiprocessor, operating system and memory organizations December 1, 2005 1/55 2/55 Multiprocessor system

More information

Study of Load Balancing Schemes over a Video on Demand System

Study of Load Balancing Schemes over a Video on Demand System Study of Load Balancing Schemes over a Video on Demand System Priyank Singhal Ashish Chhabria Nupur Bansal Nataasha Raul Research Scholar, Computer Department Abstract: Load balancing algorithms on Video

More information

Energy Efficient Genetic Algorithm Model for Wireless Sensor Networks

Energy Efficient Genetic Algorithm Model for Wireless Sensor Networks Energy Efficient Genetic Algorithm Model for Wireless Sensor Networks N. Thangadurai, Dr. R. Dhanasekaran, and R. Pradeep Abstract Wireless communication has enable to develop minimum energy consumption

More information

Identifying Workloads for the Cloud

Identifying Workloads for the Cloud Identifying Workloads for the Cloud 1 This brief is based on a webinar in RightScale s I m in the Cloud Now What? series. Browse our entire library for webinars on cloud computing management. Meet our

More information

SOA-14: Continuous Integration in SOA Projects Andreas Gies

SOA-14: Continuous Integration in SOA Projects Andreas Gies Service Mix 4 Topologies Principal Architect http://www.fusesource.com http://open-source-adventures.blogspot.com About the Author Principal Architect PROGRESS - Open Source Center of Competence Degree

More information

Part IV. Chapter 15 - Introduction to MIMD Architectures

Part IV. Chapter 15 - Introduction to MIMD Architectures D. Sima, T. J. Fountain, P. Kacsuk dvanced Computer rchitectures Part IV. Chapter 15 - Introduction to MIMD rchitectures Thread and process-level parallel architectures are typically realised by MIMD (Multiple

More information

WHITEPAPER AMAZON ELB: Your Master Key to a Secure, Cost-Efficient and Scalable Cloud.

WHITEPAPER AMAZON ELB: Your Master Key to a Secure, Cost-Efficient and Scalable Cloud. WHITEPAPER AMAZON ELB: Your Master Key to a Secure, Cost-Efficient and Scalable Cloud www.cloudcheckr.com TABLE OF CONTENTS Overview 3 What Is ELB? 3 How ELB Works 4 Classic Load Balancer 5 Application

More information

Chapter 5. Minimization of Average Completion Time and Waiting Time in Cloud Computing Environment

Chapter 5. Minimization of Average Completion Time and Waiting Time in Cloud Computing Environment Chapter 5 Minimization of Average Completion Time and Waiting Time in Cloud Computing Cloud computing is the use of the Internet for the tasks the users performing on their computer. Cloud computing, also

More information

AWS Solution Architecture Patterns

AWS Solution Architecture Patterns AWS Solution Architecture Patterns Objectives Key objectives of this chapter AWS reference architecture catalog Overview of some AWS solution architecture patterns 1.1 AWS Architecture Center The AWS Architecture

More information

UNIT I (Two Marks Questions & Answers)

UNIT I (Two Marks Questions & Answers) UNIT I (Two Marks Questions & Answers) Discuss the different ways how instruction set architecture can be classified? Stack Architecture,Accumulator Architecture, Register-Memory Architecture,Register-

More information

INFS 214: Introduction to Computing

INFS 214: Introduction to Computing INFS 214: Introduction to Computing Session 13 Cloud Computing Lecturer: Dr. Ebenezer Ankrah, Dept. of Information Studies Contact Information: eankrah@ug.edu.gh College of Education School of Continuing

More information

DISTRIBUTED HIGH-SPEED COMPUTING OF MULTIMEDIA DATA

DISTRIBUTED HIGH-SPEED COMPUTING OF MULTIMEDIA DATA DISTRIBUTED HIGH-SPEED COMPUTING OF MULTIMEDIA DATA M. GAUS, G. R. JOUBERT, O. KAO, S. RIEDEL AND S. STAPEL Technical University of Clausthal, Department of Computer Science Julius-Albert-Str. 4, 38678

More information

Magellan Project. Jeff Broughton NERSC Systems Department Head October 7, 2009

Magellan Project. Jeff Broughton NERSC Systems Department Head October 7, 2009 Magellan Project Jeff Broughton NERSC Systems Department Head October 7, 2009 1 Magellan Background National Energy Research Scientific Computing Center (NERSC) Argonne Leadership Computing Facility (ALCF)

More information

Distributed Scheduling for the Sombrero Single Address Space Distributed Operating System

Distributed Scheduling for the Sombrero Single Address Space Distributed Operating System Distributed Scheduling for the Sombrero Single Address Space Distributed Operating System Donald S. Miller Department of Computer Science and Engineering Arizona State University Tempe, AZ, USA Alan C.

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information

Parallel Genetic Algorithm to Solve Traveling Salesman Problem on MapReduce Framework using Hadoop Cluster

Parallel Genetic Algorithm to Solve Traveling Salesman Problem on MapReduce Framework using Hadoop Cluster Parallel Genetic Algorithm to Solve Traveling Salesman Problem on MapReduce Framework using Hadoop Cluster Abstract- Traveling Salesman Problem (TSP) is one of the most common studied problems in combinatorial

More information

Introduction to Distributed Systems. INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio)

Introduction to Distributed Systems. INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio) Introduction to Distributed Systems INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio) August 28, 2018 Outline Definition of a distributed system Goals of a distributed system Implications of distributed

More information

A Study on Load Balancing in Cloud Computing * Parveen Kumar,* Er.Mandeep Kaur Guru kashi University, Talwandi Sabo

A Study on Load Balancing in Cloud Computing * Parveen Kumar,* Er.Mandeep Kaur Guru kashi University, Talwandi Sabo A Study on Load Balancing in Cloud Computing * Parveen Kumar,* Er.Mandeep Kaur Guru kashi University, Talwandi Sabo Abstract: Load Balancing is a computer networking method to distribute workload across

More information

Nowadays data-intensive applications play a

Nowadays data-intensive applications play a Journal of Advances in Computer Engineering and Technology, 3(2) 2017 Data Replication-Based Scheduling in Cloud Computing Environment Bahareh Rahmati 1, Amir Masoud Rahmani 2 Received (2016-02-02) Accepted

More information

Metaheuristic Development Methodology. Fall 2009 Instructor: Dr. Masoud Yaghini

Metaheuristic Development Methodology. Fall 2009 Instructor: Dr. Masoud Yaghini Metaheuristic Development Methodology Fall 2009 Instructor: Dr. Masoud Yaghini Phases and Steps Phases and Steps Phase 1: Understanding Problem Step 1: State the Problem Step 2: Review of Existing Solution

More information

Multiprocessor scheduling

Multiprocessor scheduling Chapter 10 Multiprocessor scheduling When a computer system contains multiple processors, a few new issues arise. Multiprocessor systems can be categorized into the following: Loosely coupled or distributed.

More information

CHAPTER 5 ENERGY MANAGEMENT USING FUZZY GENETIC APPROACH IN WSN

CHAPTER 5 ENERGY MANAGEMENT USING FUZZY GENETIC APPROACH IN WSN 97 CHAPTER 5 ENERGY MANAGEMENT USING FUZZY GENETIC APPROACH IN WSN 5.1 INTRODUCTION Fuzzy systems have been applied to the area of routing in ad hoc networks, aiming to obtain more adaptive and flexible

More information

DISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing WHAT IS CLOUD COMPUTING? 2. Slide 3. Slide 1. Why is it called Cloud?

DISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing WHAT IS CLOUD COMPUTING? 2. Slide 3. Slide 1. Why is it called Cloud? DISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing Slide 1 Slide 3 ➀ What is Cloud Computing? ➁ X as a Service ➂ Key Challenges ➃ Developing for the Cloud Why is it called Cloud? services provided

More information

Dynamic control and Resource management for Mission Critical Multi-tier Applications in Cloud Data Center

Dynamic control and Resource management for Mission Critical Multi-tier Applications in Cloud Data Center Institute Institute of of Advanced Advanced Engineering Engineering and and Science Science International Journal of Electrical and Computer Engineering (IJECE) Vol. 6, No. 3, June 206, pp. 023 030 ISSN:

More information

Functional Requirements for Grid Oriented Optical Networks

Functional Requirements for Grid Oriented Optical Networks Functional Requirements for Grid Oriented Optical s Luca Valcarenghi Internal Workshop 4 on Photonic s and Technologies Scuola Superiore Sant Anna Pisa June 3-4, 2003 1 Motivations Grid networking connection

More information

A Parallel Evolutionary Algorithm for Discovery of Decision Rules

A Parallel Evolutionary Algorithm for Discovery of Decision Rules A Parallel Evolutionary Algorithm for Discovery of Decision Rules Wojciech Kwedlo Faculty of Computer Science Technical University of Bia lystok Wiejska 45a, 15-351 Bia lystok, Poland wkwedlo@ii.pb.bialystok.pl

More information

HCI: Hyper-Converged Infrastructure

HCI: Hyper-Converged Infrastructure Key Benefits: Innovative IT solution for high performance, simplicity and low cost Complete solution for IT workloads: compute, storage and networking in a single appliance High performance enabled by

More information

MONTE CARLO SIMULATION FOR RADIOTHERAPY IN A DISTRIBUTED COMPUTING ENVIRONMENT

MONTE CARLO SIMULATION FOR RADIOTHERAPY IN A DISTRIBUTED COMPUTING ENVIRONMENT The Monte Carlo Method: Versatility Unbounded in a Dynamic Computing World Chattanooga, Tennessee, April 17-21, 2005, on CD-ROM, American Nuclear Society, LaGrange Park, IL (2005) MONTE CARLO SIMULATION

More information

The Parallel Software Design Process. Parallel Software Design

The Parallel Software Design Process. Parallel Software Design Parallel Software Design The Parallel Software Design Process Deborah Stacey, Chair Dept. of Comp. & Info Sci., University of Guelph dastacey@uoguelph.ca Why Parallel? Why NOT Parallel? Why Talk about

More information

Transactum Business Process Manager with High-Performance Elastic Scaling. November 2011 Ivan Klianev

Transactum Business Process Manager with High-Performance Elastic Scaling. November 2011 Ivan Klianev Transactum Business Process Manager with High-Performance Elastic Scaling November 2011 Ivan Klianev Transactum BPM serves three primary objectives: To make it possible for developers unfamiliar with distributed

More information

Lecture 9: MIMD Architectures

Lecture 9: MIMD Architectures Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction A set of general purpose processors is connected together.

More information

Scheduling of Multiple Applications in Wireless Sensor Networks Using Knowledge of Applications and Network

Scheduling of Multiple Applications in Wireless Sensor Networks Using Knowledge of Applications and Network International Journal of Information and Computer Science (IJICS) Volume 5, 2016 doi: 10.14355/ijics.2016.05.002 www.iji-cs.org Scheduling of Multiple Applications in Wireless Sensor Networks Using Knowledge

More information

A comparison of UKCloud s platform against other public cloud providers

A comparison of UKCloud s platform against other public cloud providers Pure commitment. A comparison of UKCloud s platform against other public cloud providers version 1.0 (Based upon August 2017 data) The evolution of UKCloud UKCloud has long been known for its VMware powered

More information

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018 Cloud Computing and Hadoop Distributed File System UCSB CS70, Spring 08 Cluster Computing Motivations Large-scale data processing on clusters Scan 000 TB on node @ 00 MB/s = days Scan on 000-node cluster

More information

IBM InfoSphere Streams v4.0 Performance Best Practices

IBM InfoSphere Streams v4.0 Performance Best Practices Henry May IBM InfoSphere Streams v4.0 Performance Best Practices Abstract Streams v4.0 introduces powerful high availability features. Leveraging these requires careful consideration of performance related

More information

Introduction to Grid Computing

Introduction to Grid Computing Milestone 2 Include the names of the papers You only have a page be selective about what you include Be specific; summarize the authors contributions, not just what the paper is about. You might be able

More information

CHAPTER 5 ANT-FUZZY META HEURISTIC GENETIC SENSOR NETWORK SYSTEM FOR MULTI - SINK AGGREGATED DATA TRANSMISSION

CHAPTER 5 ANT-FUZZY META HEURISTIC GENETIC SENSOR NETWORK SYSTEM FOR MULTI - SINK AGGREGATED DATA TRANSMISSION CHAPTER 5 ANT-FUZZY META HEURISTIC GENETIC SENSOR NETWORK SYSTEM FOR MULTI - SINK AGGREGATED DATA TRANSMISSION 5.1 INTRODUCTION Generally, deployment of Wireless Sensor Network (WSN) is based on a many

More information

Database Architectures

Database Architectures B0B36DBS, BD6B36DBS: Database Systems h p://www.ksi.m.cuni.cz/~svoboda/courses/172-b0b36dbs/ Lecture 11 Database Architectures Authors: Tomáš Skopal, Irena Holubová Lecturer: Mar n Svoboda, mar n.svoboda@fel.cvut.cz

More information

Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack

Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack Robert Collazo Systems Engineer Rackspace Hosting The Rackspace Vision Agenda Truly a New Era of Computing 70 s 80 s Mainframe Era 90

More information

06-Dec-17. Credits:4. Notes by Pritee Parwekar,ANITS 06-Dec-17 1

06-Dec-17. Credits:4. Notes by Pritee Parwekar,ANITS 06-Dec-17 1 Credits:4 1 Understand the Distributed Systems and the challenges involved in Design of the Distributed Systems. Understand how communication is created and synchronized in Distributed systems Design and

More information

Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS

Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS Structure Page Nos. 2.0 Introduction 4 2. Objectives 5 2.2 Metrics for Performance Evaluation 5 2.2. Running Time 2.2.2 Speed Up 2.2.3 Efficiency 2.3 Factors

More information

Service Execution Platform WebOTX To Support Cloud Computing

Service Execution Platform WebOTX To Support Cloud Computing Service Execution Platform WebOTX To Support Cloud Computing KATOU Masayuki Abstract The trend toward reductions in IT investments due to the current economic climate has tended to focus our attention

More information

Migration and Building of Data Centers in IBM SoftLayer

Migration and Building of Data Centers in IBM SoftLayer Migration and Building of Data Centers in IBM SoftLayer Advantages of IBM SoftLayer and RackWare Together IBM SoftLayer offers customers the advantage of migrating and building complex environments into

More information

A Framework for Parallel Genetic Algorithms on PC Cluster

A Framework for Parallel Genetic Algorithms on PC Cluster A Framework for Parallel Genetic Algorithms on PC Cluster Guangzhong Sun, Guoliang Chen Department of Computer Science and Technology University of Science and Technology of China (USTC) Hefei, Anhui 230027,

More information

6.2 DATA DISTRIBUTION AND EXPERIMENT DETAILS

6.2 DATA DISTRIBUTION AND EXPERIMENT DETAILS Chapter 6 Indexing Results 6. INTRODUCTION The generation of inverted indexes for text databases is a computationally intensive process that requires the exclusive use of processing resources for long

More information

CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM

CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM 20 CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM 2.1 CLASSIFICATION OF CONVENTIONAL TECHNIQUES Classical optimization methods can be classified into two distinct groups:

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

IX: A Protected Dataplane Operating System for High Throughput and Low Latency IX: A Protected Dataplane Operating System for High Throughput and Low Latency Belay, A. et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Reviewed by Chun-Yu and Xinghao Li Summary In this

More information

LOAD BALANCING AND DEDUPLICATION

LOAD BALANCING AND DEDUPLICATION LOAD BALANCING AND DEDUPLICATION Mr.Chinmay Chikode Mr.Mehadi Badri Mr.Mohit Sarai Ms.Kshitija Ubhe ABSTRACT Load Balancing is a method of distributing workload across multiple computing resources such

More information

Lecture 9: Load Balancing & Resource Allocation

Lecture 9: Load Balancing & Resource Allocation Lecture 9: Load Balancing & Resource Allocation Introduction Moler s law, Sullivan s theorem give upper bounds on the speed-up that can be achieved using multiple processors. But to get these need to efficiently

More information