A Multi-Objective Approach for QoS-aware Service Composition Marcel Cremene, Mihai Suciu, Florin-Claudiu Pop, Denis Pallez and D. Dumitrescu Technical University of Cluj-Napoca, Romania Babes-Bolyai University of Cluj-Napoca, Romania Université de Nice Sophia-Antipolis, France Abstract Service composition is one of the main research areas in the Software Oriented Computing (SOC) domain. A certain functionality may be offered by different services having different Quality of Service (QoS) attributes. QoS optimization problem is multi-objective by its nature. Some multi-objective optimization algorithms have been used for solving this problem. Their advantages, compared to single-criteria algorithms, are: (i) the aggregation of criterion functions is avoided and (ii) the user has the possibility to select a posteriori one of the Pareto optimal solutions. There is a small number a such proposals and very few comparative tests. Comparative tests with the most known multi-criteria optimization algorithms are presented. Differential evolution based multi-objective algorithms seems to be a good trade-off between performance and complexity. The converge time of multi-criteria optimization algorithms is comparable to the single-criteria algorithms. Index Terms Service Oriented Computing, Multi-criteria Optimization, Differential Evolution, Quality of Service I. INTRODUCTION Service Oriented Architecture (SOA) implementations are more and more popular, diverse and widespread in enterprise distributed environments. This fact is due to their technical advantages over more traditional methods of distributing computing. These advantages include: delivering application functionality as services across several platforms, providing location independence, authentication and authorization support and dynamic search and connectivity to other services. The QoS optimization problem is multi-objective by its nature. The user might prefer to see several good solutions (Pareto optimal) and decide which is the best for himself/herself while the standard optimization algorithms offer only one solution. It is more natural to let the user decide if he/she wants to pay a specific known price than to ask him/her to specify a priori how important is the price for him/her and without knowing the precise price value. By using multicriteria optimization, we also avoid defining an aggregation function, which is not a trivial task. II. PROBLEM STATEMENT A composite service can be described as a process that involves the execution of several activities according to a workflow. An example workflow for a flight booking process is depicted in Fig. 1. The QoS of the composite service is obtained by aggregating the QoS of the component services. Examples of aggregation rules are given in [1], [2] and [3]. Given m abstract services and n concrete services for each abstract service, there are n m possibles combinations. Finding the QoS optimal solution is an non-deterministic polynomialtime hard (NP-hard) problem [1]. III. RELATED WORK The problem stated previously is well known in domains like Service Oriented Computing (SOC) and Search-based Software Engineering (SBSE) [4]. We found it discussed in [1], [5], [6], [7]. G. Canfora et al. [1] compare a linear integer programming [8] based algorithm with a genetic algorithm. As a case study, they considered a workflow containing 8 distinct abstract services. Their conclusion was that GA is able to deal with QoS attributes having non-linear aggregation functions. Also, GA can scale-up when the number of concrete services per abstract service increases. Multi-objective algorithms applied to service composition. The idea of using multi-criteria optimization algorithms in the service composition domain is not new. Taboada et al. [9] propose a custom genetic algorithm called Multi-objective Multi-State Genetic Algorithm (MOMS-GA) for general system design optimization. The system components have different performance levels, cost, weight and reliability. A multi-state approach means that the system and its components may have more than two states: completely working, partially working, partially failed and completely failed. Yao et al. [3] propose an approach based on Non-dominated Sorting Genetic Algorithm-II (NSGA2) [10] algorithm. A solution is encoded using an integer vector, similar to [1]. Li Li et al. [2] propose another multi-criteria oriented approach based on the Strength Pareto Evolutionary Algorithm (SPEA2) [11] algorithm. Three criteria are considered: response time, cost and availability. The same type of genome encoding, integer vector based, is used. The algorithm parameters are very similar to the previous example: mutation rate 0.01, population size of 100 individuals, single-point crossover with a 0.95 rate and binary tournament selection. Population size is 100 and the archive population (specific to this algorithm) size is 200 individuals. A scenario with 4 abstract services and from 90 to 500 concrete services par abstract service was considered. The tests
Fig. 1. A flight booking abstract process have shown that SPEA2 converges to the Pareto-optimal front in less than 100 generations. Remarks about the related work. It seems that evolutionary methods are preferred for single and multi-objective approaches. Despite the fact that the QoS optimization problem is multi-objective by nature, these algorithms are less used than single-objective ones. We did not found a comparison between different multi-objective algorithms and an analysis about their complexity. The scenarios presented in the selected literature are using a small number of abstract service (less than 5) and some of them also a small number of concrete services (less than 9). Thus, it is not obvious how scalable are these algorithms. These facts motivated us to perform a comparative analysis using several different multi-objective algorithms and test more complex scenarios. The next section introduces the theoretical support for multi-objective optimization. IV. MULTI-OBJECTIVE OPTIMIZATION ALGORITHMS Let us consider n objectives defined by the set {f i } i {1,...,m} of real valued functions f i : X R, X R n. F : X R m is the vector valued function F (x) = (f 1 (x),..., f(m(x)). A decision vector x X is said to Pareto-dominate y X, denoted as x y, if and only if and i 1,..., m, f i (x) f i (y), j 1,..., n such that f j (x) < f j (y). A solution x X is Pareto-optimal if and only if y X such that y x. The Pareto-Optimal Set (POS) is defined as the set of all Pareto-optimal solutions P OS = {x X y X, y x}. The Pareto-Optimal Front (POF) is defined as the set of all objective functions values corresponding to the solution in POS. P OF = {F (x) x is non dominated}. NSGA2. Non-dominated Sorting Genetic Algorithms 2 (NSGA2) as a multi-objective optimization algorithm has been proposed by [12]. NSGA2 is an elitist algorithm, based on a µ, λ selection (µ parents + λ childs), favouring good solutions through a ranking and sorting system based on Pareto-dominance. SPEA2. Strength Pareto Evolutionary Algorithm 2 (SPEA2) [11] is an elitist multi-objective optimization algorithm. This algorithm uses an archive for storing the best solutions found at each generation. GDE3. A multi-objective version GDE3 [13] has been proposed by Kukkonen and Lampinen in 2005. GDE3 modifies the selection operator in DE by introducing the concept of dominance. The trial vector is selected to replace the decision vector if Pareto dominates x. Condition j = rand j indicates that at least one component of mutated vector v i is selected. If the vectors are indifferent to each other (neither is better) then both vectors are kept. POSDE. POSDE, one of the first multiobjective DE approaches was introduced by [14]. POSDE uses a secondary population to retain the non-dominated solutions found at each generation in the evolutionary process. Diversity is achieved by a distance metric used to alter the fitness of each individual when it is compared with the elements of the archive. ϵ-myde. ϵ-myde is a multi-objective algorithm based on Differential Evolution introduced in [15]. A secondary population is adopted in order to retain the non-dominated solutions found during the search process. The concept of ϵ-dominance [16] is used to assure a good distribution of solutions in the secondary population. V. NUMERICAL EXPERIMENTS In this section we describe a set of comparative tests using the previously described algorithms applied to the service QoS optimization problem. Each concrete service is defined by the QoS properties vector Q i,j = (t, r, c), t-response time, r-rating, c-cost. These properties correspond to the optimization objectives. The MOP goal is to find the best combination of concrete services that maximizes the overall rating and minimizes the overall cost and time.
Fig. 3. Final MOP solution for m=40 and n=40 Fig. 2. Final MOP solution for m=10 and n=20 The genome used to encode the solution is an integer vector for NSGA2 and SPEA2. Each gene is an integer value representing the index of the concrete service used. DE algorithms use floating point encoding to resolve problems in a continuous domain so we need to use a discrete version. There are several approaches for solving a finite domain problem using DE. Our approach is based on [17] and uses a mapping between the discrete and continuous domain. The DE genome is a real value vector. A. Experiments with two criteria A first set of experiments was performed using a workflow with a pure sequential architecture and three algorithms: NSGA2, SPEA2 and GDE3. For these experiments we considered only two criteria: rating and time + cost. The results for 10 abstract services and 20 concrete services are depicted in Fig. 2. B. Experiments with three criteria A second series of experiments was performed using more complex workflows (including sequences, split, join, loops and switch blocks) and with three independent criteria: response time, rating and cost. For all the algorithms the populations size was limited to 150 individuals, evolved for 300 generations, for DE CR=0.4 and F=0.3 was used. For HV a reference point (100,100,100) was chosen. For ϵ-myde a selection pressure of 0.8 was chosen, The size of the archive for POSDE, SPEA2 is 150 in order to assure a fair comparison. In a case with with m=10 abstract services and n=10 concrete services per abstract service (low complexity) all algorithms behave similarly. A more complex scenario involving m=20 abstract services, each of them having n=40 concrete services was evaluated. The added complexity makes it harder to find a good PS. SPEA2 doesn t find a good set while the DE approaches and NSGA seems to converge to the same PF. In a case with m=40 abstract services and n=40 concrete services per abstract service. Only NSGA2 and GDE3 are able to find a Pareto front and assure good diversity of solutions. The other algorithms produce a PS in which most solutions are dominated. C. Performance evaluation metrics The simple visual inspection of the solution front, as depicted in Fig. 2, is not clear enough for deciding which algorithm is the best. A more objective evaluation is necessary. According to [18] when comparing different algorithms we cannot rely only on one indicator. The metrics [19] used for comparison are: hypervolume (HV), spread (S) and set coverage (C). Tables I illustrate the mean (µ) and standard deviation(σ) of HV and S metrics. A higher value for each metric is desirable. In all scenarios NSGA2 and DE approaches obtain a better metric value assuring a good spread of solutions. If X and Y are two approximations of the PF, the set coverage C(X,Y) is used to measures the percentage of solutions in Y dominated by at least one solution in X. Table II presents the Set Coverage indicators for NSGA2, SPEA2, GDE3 compared with all other algorithms tested. In order to have an idea about the algorithms speed we measured the time required to find the solution set. Figure 3 presents the time required by all algorithms to determine a PS after 300 generations for all combinations of m {10, 20, 30, 40, 50} abstract and n {10, 20, 30, 40, 50} concrete services per abstract service. As expected, the execution time increases with the number of abstract services. It may be observed that all DE approaches require less time while NSGA2 is the slowest algorithm. VI. CONCLUSIONS Several multi-objective optimization algorithms are tested for solving the services QoS optimization problem. The tests indicate that NSGSA2 and DE-based algorithms are the best in terms of solutions diversity and Pareto optimality. NSGA2
TABLE I STATISTICAL RESULTS OF PERFORMANCE METRICS: hypervolume AND spread - NSGA2 AND DE APPROACHES OBTAIN A BETTER METRIC VALUE ASSURING A GOOD SPREAD OF SOLUTIONS. NSGA2 SPEA2 GDE3 m/n HV S HV S HV S σ µ σ µ σ µ σ µ σ µ σ µ 10/10 3.9E+05 4.9E+02 0.87 0.03 3.5E+05 1.3E+04 0.48 0.11 3.9E+05 4.7E+02 0.64 0.04 10/20 4.7E+05 4.5E+02 0.84 0.05 4.2E+05 1.5E+04 0.52 0.10 4.7E+05 6.6E+02 0.61 0.04 10/40 5.4E+05 8.8E+02 0.92 0.06 4.1E+05 2.4E+04 0.49 0.08 5.4E+05 7.0E+02 0.60 0.04 20/20 2.4E+13 2.1E+09 1.14 0.06 2.5E+13 2.1E+11 0.84 0.08 2.4E+13 2.8E+09 1.04 0.05 20/50 9.8E+12 5.7E+09 0.73 0.05 1.3E+13 6.2E+12 0.53 0.09 9.8E+12 3.0E+09 0.57 0.03 30/20 4.0E+05 2.6E+03 0.85 0.06 3.3E+05 9.3E+03 0.53 0.09 3.9E+05 2.0E+03 0.63 0.05 30/50 4.9E+05 3.8E+03 0.77 0.05 3.7E+05 1.6E+04 0.48 0.06 4.8E+05 1.9E+03 0.59 0.04 40/30 4.2E+05 4.5E+03 0.79 0.05 2.9E+05 9.9E+03 0.53 0.09 4.1E+05 3.2E+03 0.63 0.05 40/50 4.6E+05 5.7E+03 0.78 0.04 3.2E+05 1.5E+04 0.52 0.08 4.5E+05 2.6E+03 0.57 0.03 TABLE II SET COVERAGE INDICATOR - NSGA2, SPEA2, GDE3 VERSUS ALL (NSGA2 - A, SPEA2 - B, GDE3 - C, POSDE - D, ϵ MYDE - E) - NSGA2 AND ALL DE APPROACHES OUTPERFORM SPEA2. m/n C(a,b) C(a,c) C(a,d) C(a,e) C(b,a) C(b,c) C(b,d) C(b,e) C(c,a) C(c,b) C(c,d) C(c,e) 10/10 0.83 0.58 0.51 0.26 0.40 0.21 0.04 0.08 0.77 0.00 0.60 0.32 10/20 0.83 0.49 0.55 0.32 0.09 0.05 0.02 0.18 0.75 0.78 0.54 0.48 10/40 1.00 0.49 0.62 0.25 0.18 0.02 0.02 0.09 0.89 0.81 0.80 0.43 20/20 0.74 0.61 0.43 0.34 0.44 0.22 0.04 0.15 0.73 0.48 0.26 0.13 20/50 0.63 0.33 0.40 0.34 0.36 0.08 0.12 0.28 0.85 0.44 0.73 0.54 30/20 1.00 0.48 0.67 0.93 0.17 0.01 0.00 0.07 0.80 0.54 0.89 0.13 30/50 0.87 0.51 0.74 0.66 0.17 0.02 0.15 0.13 0.75 0.20 0.95 0.81 40/30 0.92 0.34 0.47 0.64 0.05 0.02 0.02 0.11 0.76 0.60 0.89 0.48 40/50 0.94 0.42 0.92 0.85 0.09 0.00 0.04 0.12 0.79 0.56 0.99 0.76 seems to have the best performances demonstrated by the evaluation metrics. But NSGA2 proven to be also the slower algorithm. Therefore, DE approaches seem to be a better choice for addressing complex problems. The time tests demonstrate that multi-criteria algorithms are able to solve the QoS optimization problems in a reasonable time: few seconds for very complex scenarios, on a standard PC machine. According to our tests, the time required by a multi-objective algorithm is comparable to that required by solving a single-objective problem. ACKNOWLEDGMENT This work was supported by CNCSIS-UEFISCSU, national project number PN II-TE code 252, contract 33/2010. It was also partially supported by a grant from the John Templeton Foundation (the opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation). REFERENCES [1] G. Canfora, M. Di Penta, R. Esposito, and M. L. Villani, An approach for qos-aware service composition based on genetic algorithms, in Proceedings of the 2005 conference on Genetic and evolutionary computation, ser. GECCO 05. New York, NY, USA: ACM, 2005, pp. 1069 1075. [Online]. Available: http://doi.acm.org/10.1145/1068009.1068189 [2] L. Li, P. Cheng, L. Ou, and Z. Zhang, Applying multi-objective evolutionary algorithms to qos-aware web service composition, in Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II, ser. ADMA 10. Berlin, Heidelberg: Springer-Verlag, 2010, pp. 270 281. [Online]. Available: http://portal.acm.org/citation.cfm?id=1948448.1948477 [3] Y. Yao and H. Chen, Qos-aware service composition using nsga-ii1, in Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, Culture and Human, ser. ICIS 09. New York, NY, USA: ACM, 2009, pp. 358 363. [Online]. Available: http://doi.acm.org/10.1145/1655925.1655991 [4] M. Harman, The current state and future of search based software engineering, in 2007 Future of Software Engineering, ser. FOSE 07. Washington, DC, USA: IEEE Computer Society, 2007, pp. 342 357. [Online]. Available: http://dx.doi.org/10.1109/fose.2007.29 [5] Y. Vanrompay, P. Rigole, and Y. Berbers, Genetic algorithm-based optimization of service composition and deployment, in Proceedings of the 3rd international workshop on Services integration in pervasive environments, ser. SIPE 08. New York, NY, USA: ACM, 2008, pp. 13 18. [Online]. Available: http://doi.acm.org/10.1145/1387309.1387313 [6] X. Liu, Z. Xu, and L. Yang, Independent global constraints-aware web service composition optimization based on genetic algorithm, Intelligent Information Systems, IASTED International Conference on, vol. 0, pp. 52 55, 2009. [7] D. E. Comes, H. Baraki, R. Reichle, M. Zapf, and K. Geihs, Heuristic approaches for qos-based service selection. in ICSOC, ser. Lecture Notes in Computer Science, P. P. Maglio, M. Weske, J. Yang, and M. Fantinato, Eds., vol. 6470, 2010, pp. 441 455. [Online]. Available: http://dblp.uni-trier.de/db/conf/icsoc/icsoc2010.html#comesbrzg10 [8] L. Zeng, B. Benatallah, A. H.H. Ngu, M. Dumas, J. Kalagnanam, and H. Chang, Qos-aware middleware for web services composition, IEEE Trans. Softw. Eng., vol. 30, pp. 311 327, May 2004. [Online]. Available: http://portal.acm.org/citation.cfm?id=987527.987638 [9] H. Taboada, J. Espiritu, and D. Coit, Moms-ga: A multi-objective multi-state genetic algorithm for system reliability optimization design problems, Reliability, IEEE Transactions on, vol. 57, no. 1, pp. 182 191, march 2008. [10] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, A fast and elitist multiobjective genetic algorithm: Nsga-ii, IEEE Transactions on Evolutionary Computation, vol. 6, pp. 182 197, 2002. [11] E. Zitzler, M. Laumanns, and L. Thiele, SPEA2: Improving the strength pareto evolutionary algorithm, Swiss Federal Institute of Technology (ETH) Zurich, Gloriastrasse 35, CH-8092 Zurich, Switzerland, Tech. Rep. 103, 2001. [Online]. Available:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.2440 [12] K. D. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, A fast and elitist multiobjective genetic algorithm : NSGA-II, IEEE Transactions on Evolutionary Computation, vol. 6, pp. 182 197, 2002. [Online]. Available: http://dx.doi.org/10.1109/4235.996017 [13] S. Kukkonen and J. Lampinen, Gde3: the third evolution step of generalized differential evolution, in IEEE Congress on Evolutionary Computation, 2005, pp. 443 450. [14] C. Chang, D. Xu, and H. Quek, Pareto-optimal set based multiobjective tuning of fuzzy automatic train operation for mass transit system, IEE Proceedings - Electric Power Applications, vol. 146, no. 5, pp. 577 583, 1999. [Online]. Available: http://link.aip.org/link/?iep/146/577/1 [15] L. V. Santana-Quintero and C. A. C. Coello, An algorithm based on differential evolution for multi-objective problems, International Journal of Computational Intelligence Research, vol. 1, no. 2, pp. 151 169, 2005. [16] K. Deb, M. Mohan, and S. Mishra, Towards a quick computation of well-spread pareto-optimal solutions, in EMO, 2003, pp. 222 236. [17] J. Lampinen and I. Zelinka, Mechanical engineering design optimization by differential evolution. Maidenhead, UK, England: McGraw-Hill Ltd., UK, 1999, pp. 127 146. [Online]. Available: http://portal.acm.org/citation.cfm?id=329055.329094 [18] T. Okabe, A critical survey of performance indices for multi-objective optimisation, in Proc. of 2003 Congress on Evolutionary Computation. IEEE Press, 2003, pp. 878 885. [19] C. A. Coello Coello, G. B. Lamont, and D. A. Van Veldhuizen, Evolutionary Algorithms for Solving Multi-Objective Problems (Genetic and Evolutionary Computation), 2nd ed. Springer, Sep. 2007. [Online]. Available: http://www.worldcat.org/isbn/0387332545