QAR-CIP-NSGA-II: A New Multi-Objective Evolutionary Algorithm to Mine Quantitative Association Rules


D. Martín a, A. Rosete a, J. Alcalá-Fdez b, F. Herrera b

a Dept. Artificial Intelligence and Infrastructure of Informatic Systems, Higher Polytechnic Institute José Antonio Echeverría, Cujae, La Habana, Cuba
b Department of Computer Science and Artificial Intelligence, University of Granada, CITIC-UGR, Granada, Spain

Abstract

Some researchers have framed the extraction of association rules as a multi-objective problem, jointly optimizing several measures to obtain a set of more interesting and accurate rules. In this paper, we propose a new multi-objective evolutionary model which maximizes three objectives (comprehensibility, interestingness and performance) in order to mine a set of quantitative association rules with a good trade-off between interpretability and accuracy. To accomplish this, the model extends the well-known Multi-objective Evolutionary Algorithm Non-dominated Sorting Genetic Algorithm II to perform an evolutionary learning of the intervals of the attributes and a condition selection for each rule. Moreover, this proposal introduces an external population and a restarting process to the evolutionary model in order to store all the nondominated rules found and improve the diversity of the rule set obtained. The results obtained over real-world datasets demonstrate the effectiveness of the proposed approach.

Keywords: Data Mining, Quantitative Association Rules, Multi-Objective Evolutionary Algorithms, NSGA-II

1. Introduction

Association discovery is one of the most common Data Mining (DM) techniques used to extract interesting knowledge from large datasets [34]. Association rules identify dependencies between items in a dataset [65] and are defined as an expression of the type X → Y, where X and Y are sets of items and X ∩ Y = ∅ [1, 2].
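As a minimal illustration of this definition, an association rule can be represented as a pair of disjoint item sets; the representation and function name below are ours, not part of any algorithm in this paper.

```python
# A rule X -> Y as a pair of disjoint item sets (illustrative sketch).

def make_rule(antecedent, consequent):
    """Build a rule (X, Y) enforcing X ∩ Y = ∅."""
    X, Y = frozenset(antecedent), frozenset(consequent)
    if X & Y:
        raise ValueError("antecedent and consequent must be disjoint")
    return (X, Y)

rule = make_rule({"bread", "butter"}, {"milk"})
print(sorted(rule[0]), "->", sorted(rule[1]))
```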
Many previous studies for mining association rules focused on datasets with binary or discrete values; however, the data in real-world applications usually consist of quantitative values. Thus, designing DM algorithms able to deal with various types of data is a challenge in this field [6, 13, 36, 56, 61]. A commonly used method to handle continuous domains in the extraction of association rules is to partition the domains of the attributes into intervals. For instance, an association rule could be Income ∈ [1200, 2000] → MortgageExpenses ∈ [360, 600]. These kinds of rules are known as quantitative association rules (QARs) [56]. In recent years, Evolutionary Algorithms (EAs), particularly Genetic Algorithms (GAs) [23], have been used by many researchers to mine QARs from datasets with quantitative values [4, 8]. The main motivation for applying GAs to knowledge extraction tasks is that they are robust and adaptive search algorithms that perform a global search in the space of candidate solutions (for instance, rules or other forms of knowledge representation). Recently, some researchers have presented the extraction of association rules as a multi-objective problem (instead of a single-objective one), removing some of the limitations of the current approaches. Several objectives are considered in the process of extracting association rules, obtaining a set of more interesting and accurate rules [5, 33]. In this way, we can jointly optimize measures such as support, confidence, and so on, which can present different degrees of trade-off depending on the dataset used and the type of information that can be extracted from it. Since this approach

Corresponding author. E-mail addresses: dmartin@ceis.cujae.edu.cu (D. Martín), rosete@ceis.cujae.edu.cu (A. Rosete), jalcala@decsai.ugr.es (J. Alcalá-Fdez), herrera@decsai.ugr.es (F. Herrera)

Preprint submitted to Information Sciences, September 5, 2013

has a multi-objective nature, the use of Multi-Objective Evolutionary Algorithms (MOEAs) [14, 18] to obtain a set of solutions with different degrees of trade-off between the different measures could represent an interesting way of working (by considering these measures as objectives). In this paper, we propose a new multi-objective evolutionary model to mine a set of QARs with a good trade-off between interpretability and accuracy which maximizes three objectives: comprehensibility, interestingness and performance, where performance is understood as the product of the Certainty Factor (CF) [54] and support. To accomplish this, the model (called QAR-CIP) extends the well-known MOEA Non-dominated Sorting Genetic Algorithm II (NSGA-II) [19] to perform an evolutionary learning of the intervals of the attributes and a condition selection for each rule; the algorithm is therefore called QAR-CIP-NSGA-II. Moreover, this proposal introduces an external population (EP) and a restarting process to the evolutionary model in order to store all the nondominated rules found and promote diversity in the population. Notice that this proposal follows a dataset-independent approach which does not rely on the minimum support (minsup) and minimum confidence (minconf) thresholds, which are hard to determine for each dataset. In order to assess the performance of the proposed approach, we have presented an experimental study using 9 real-world datasets. We have carried out the following studies. First, we have compared our approach with the original evolutionary model of NSGA-II in order to analyze the performance of the new components introduced. Second, we have compared the performance of our approach with four mono-objective approaches and three MOEAs to mine QARs. Third, we have shown the results obtained from the comparison with two other classical approaches for mining association rules.
Furthermore, in these studies, we have made use of some nonparametric statistical tests for the pairwise and multiple comparison [21, 30, 29, 31] of the performance of these approaches over 22 real-world datasets. Finally, we have analyzed the scalability of the proposed approach. This paper is arranged as follows. Section 2 introduces a brief study of the existing MOEAs for general purposes [67], some basic definitions of QARs and some quality measures. Section 3 details the evolutionary learning components proposed to mine a set of high quality QARs. Section 4 shows and discusses the results that are obtained over 9 real-world datasets. Section 5 presents some concluding remarks. Finally, Appendix A shows the results obtained by the analyzed algorithms in the 22 real-world datasets considered for the statistical analysis.

2. Preliminaries

In this section, we first introduce the basic definitions of QARs and some quality measures. Then, we present a brief study of MOEAs.

2.1. Quantitative association rules

Association rules are used to represent and identify dependencies between items in a dataset [34, 65]. As we mentioned above, they are expressions of the type X → Y, where X and Y are sets of items, and X ∩ Y = ∅. There are many previous studies of mining association rules that are focused on datasets with binary or discrete values; however, data in real-world applications usually consist of quantitative values. When the domain is continuous, the association rules are known as QARs, in which each item is an attribute-interval pair. For instance, a QAR could be Age ∈ [30, 52] and Salary ∈ [3000, 3500] → NumCars ∈ [3, 4]. Support and Confidence are the most common measures to assess association rules.
These measures for a rule X → Y are defined as:

Support(X → Y) = SUP(XY) / |D|    (1)

Confidence(X → Y) = SUP(XY) / SUP(X)    (2)

where SUP(XY) is the number of patterns of the dataset covered by the antecedent and consequent of the rule, SUP(X) is the number of patterns of the dataset covered by the antecedent of the rule and |D| is the number of patterns in the dataset. The classic techniques for mining association rules attempt to discover rules whose support and confidence are greater than the user-defined thresholds minsup and minconf. However, several authors have pointed out some

drawbacks of this framework that lead it to find many more rules than it should [10, 12, 55]. For instance, confidence is unable to detect statistical independence or negative dependence between items because it does not take into account the support of the consequent. Moreover, itemsets with very high support are a source of misleading rules because they appear in most of the transactions, and hence any itemset (whatever its meaning) seems to be a good predictor of the presence of the high-support itemset. In recent years, several researchers have proposed other measures to select and rank patterns according to their potential interest to the user [3, 12, 32, 48, 50, 54]. We briefly describe some of them below.

The conviction [12] measure analyzes the dependency between X and ¬Y, where ¬Y means the absence of Y. Its domain is [0, ∞), where values less than 1 represent negative dependence, 1 represents independence and values higher than 1 represent positive dependence. However, it is not easy to compare the conviction of rules because its domain is not bounded, making it difficult to define a conviction threshold. Conviction for a rule X → Y is defined as:

Conviction(X → Y) = SUP(X) SUP(¬Y) / SUP(X¬Y)    (3)

The lift [50] measure represents the ratio between the confidence of the rule and the expected confidence of the rule. Its domain is [0, ∞), where values less than 1 imply negative dependence, 1 implies independence and values higher than 1 imply positive dependence. As with conviction, its range is not bounded, which makes it difficult to define a lift threshold. Lift for a rule X → Y is defined as:

Lift(X → Y) = Confidence(X → Y) / SUP(Y) = SUP(XY) / (SUP(X) SUP(Y))    (4)

CF [54] is interpreted as a measure of the variation of the probability that Y is in a transaction when we consider only those transactions where X is present. The use of this measure prevents the discovery of misleading rules that are not detected by confidence.
Its domain is [-1, 1], where values less than 0 represent negative dependence, 0 represents independence and values higher than 0 represent positive dependence. CF for a rule X → Y is defined in three ways, depending on whether Confidence(X → Y) is greater than, less than or equal to SUP(Y):

If Confidence(X → Y) > SUP(Y):

CF(X → Y) = (Confidence(X → Y) − SUP(Y)) / (1 − SUP(Y))    (5)

If Confidence(X → Y) < SUP(Y):

CF(X → Y) = (Confidence(X → Y) − SUP(Y)) / SUP(Y)    (6)

Otherwise:

CF(X → Y) = 0    (7)

The netconf [3] measure evaluates the interest of association rules based on the support of the rule and the supports of its antecedent and consequent. As with the CF measure, netconf can detect misleading rules that are not detected by confidence. This measure obtains values in [-1, 1], where positive values represent positive dependence, negative values represent negative dependence, and 0 represents independence. Netconf for a rule X → Y is defined as:

Netconf(X → Y) = (SUP(XY) − SUP(X) SUP(Y)) / (SUP(X) (1 − SUP(X)))    (8)

2.2. Multi-objective evolutionary algorithms for general purposes

EAs simultaneously deal with a set of possible solutions (the so-called population), which enables them to find several members of the Pareto optimal set in a single run of the algorithm. Additionally, they are not too susceptible to the shape or continuity of the Pareto front (e.g., they can easily deal with discontinuous and concave Pareto fronts) [58, 67].
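As an illustration of the quality measures defined in subsection 2.1 above, the following sketch computes lift, conviction, CF and netconf from relative supports in [0, 1]; the function names and signatures are our own conventions, not the authors' implementation.

```python
# Interest measures from relative supports sup_x = SUP(X), sup_y = SUP(Y),
# sup_xy = SUP(XY), all in [0, 1] (illustrative sketch).

def lift(sup_x, sup_y, sup_xy):
    # Eq. 4: confidence of the rule over the consequent's support.
    return (sup_xy / sup_x) / sup_y

def conviction(sup_x, sup_y, sup_xy):
    # Eq. 3: SUP(X)SUP(~Y) / SUP(X~Y), with SUP(X~Y) = SUP(X) - SUP(XY).
    sup_x_not_y = sup_x - sup_xy
    return float("inf") if sup_x_not_y == 0 else sup_x * (1 - sup_y) / sup_x_not_y

def certainty_factor(sup_x, sup_y, sup_xy):
    # Eqs. 5-7: variation of the probability of Y when X is present.
    conf = sup_xy / sup_x
    if conf > sup_y:
        return (conf - sup_y) / (1 - sup_y)
    if conf < sup_y:
        return (conf - sup_y) / sup_y
    return 0.0

def netconf(sup_x, sup_y, sup_xy):
    # Eq. 8.
    return (sup_xy - sup_x * sup_y) / (sup_x * (1 - sup_x))

# Statistically independent items (sup_xy = sup_x * sup_y) score as
# "no dependence" under all three bounded-or-ratio measures:
print(lift(0.5, 0.4, 0.2))
print(certainty_factor(0.5, 0.4, 0.2))
print(netconf(0.5, 0.4, 0.2))
```

Note how, unlike confidence (here 0.4), all three measures flag the independent case, which is the drawback of confidence discussed above.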

Reference   MOEA                          Generation
[26]        MOGA                          1st
[37]        NPGA                          1st
[57]        NSGA                          1st
[19]        NSGA-II                       2nd
[23]        NPGA 2                        2nd
[40]        PAES                          2nd
[16, 17]    PESA & PESA-II                2nd
[69, 70]    SPEA & SPEA2                  2nd
[15]        Micro-GA                      2nd
[66, 43]    MOEA/D & MOEA/D-DE            2nd
[22]        Hybrid MOEAs                  2nd
[68]        Indicator-based MOEAs         2nd
[41]        Memetic MOEAs                 2nd
[20]        MOEAs based on coevolution    2nd
[26, 59]    MOEAs based on reference      2nd

Table 1: Classification of MOEAs

The first work to hint at the possibility of using EAs to solve a multi-objective problem was a Ph.D. thesis of 1967 [51] in which, however, no actual MOEA was developed (the multi-objective problem was restated as a single-objective problem and solved with a GA). David Schaffer is normally considered to be the first to have designed an MOEA, in the mid-1980s [52]. Schaffer's approach, called the Vector Evaluated Genetic Algorithm (VEGA), consists of a simple GA with a modified selection mechanism. However, VEGA had a number of problems, the main one being its inability to retain solutions with acceptable performance in all objectives; although perhaps above average, they were not outstanding for any of the objective functions. After VEGA, researchers designed a first generation of MOEAs characterized by their simplicity, whereby the main lesson learned was that successful MOEAs had to combine a good mechanism to select nondominated individuals (perhaps, but not necessarily, based on the concept of Pareto optimality) with a good mechanism to maintain diversity (fitness sharing was one possibility, but not the only one). The most representative MOEAs of this generation are the following: the Nondominated Sorting Genetic Algorithm (NSGA) [57], the Niched-Pareto Genetic Algorithm (NPGA) [37] and the Multi-Objective Genetic Algorithm (MOGA) [26]. A second generation of MOEAs began when elitism became a standard mechanism. In fact, the use of elitism is a theoretical requirement in order to guarantee the convergence of an MOEA. Many MOEAs have been proposed during the second generation.
However, most researchers would agree that few of these approaches have been adopted as a reference or have been used by others. In this way, the Strength Pareto Evolutionary Algorithm 2 (SPEA2) [69] and NSGA-II [19] can be considered the most representative MOEAs of the second generation, with others, such as the Pareto Archived Evolution Strategy (PAES) [40], the Multi-objective Evolutionary Algorithm Based on Decomposition (MOEA/D and MOEA/D-DE) [66, 43], MOEAs based on reference [26, 59], indicator-based MOEAs [68], hybrid MOEAs [22], memetic MOEAs [41] and MOEAs based on coevolution [20], also of interest. Table 1 shows a summary of the most representative MOEAs of both generations. Finally, it should be noted that today NSGA-II is a paradigm within the MOEA research community, as the powerful crowding operator that this algorithm uses usually enables it to obtain the widest Pareto sets in a wide variety of problems, a much appreciated property within this framework.

3. A new multi-objective algorithm to mine quantitative association rules: QAR-CIP-NSGA-II

In this section, we will describe our proposal to obtain a set of QARs with a good trade-off between interpretability and accuracy considering three objectives: comprehensibility, interestingness and performance. This model considers the use of the NSGA-II algorithm [19] to perform an evolutionary learning of the rules and introduces two new components to its evolutionary model: an EP and a restarting process. In the following, we will explain all of their characteristics in detail (see subsections 3.1 to 3.6) and present a flowchart of the algorithm (see subsection 3.7).

3.1. Evolutionary multi-objective model

This proposal extends the well-known NSGA-II algorithm and introduces an EP and a restarting process to the evolutionary model in order to store all the nondominated rules found, to promote diversity in the population and to improve the coverage of the datasets. The EP keeps all the nondominated rules found and is updated at the end of each generation with the nondominated rules of the current population. Redundant nondominated rules are removed from the EP in order to avoid overlapping rules. A rule is considered redundant if the intervals of all its variables are contained within the intervals of the variables of another rule. Moreover, the size of the EP is not limited, in order to obtain a larger number of rules of the Pareto front and to allow a smaller population size (independently of the size of the problem), which helps to better control the method's convergence. In order to promote diversity in the population, the restarting process is applied when the number of new individuals of the population in one generation is less than α% of the size of the current population (with α determined by the user, usually 5%). In this case, the examples covered by the rules in the EP are marked and the initialization process of the population is applied again using the uncovered examples (see subsection 3.6), allowing us to perform a good exploration of the search space. Finally, the EP is updated with the new population. Notice that both components are complementary. The restarting process uses the examples uncovered by the rules in the EP to generate the new population. In addition, the EP keeps all the nondominated rules found until the last moment, preventing the rules from being removed when the restarting process restarts the whole population. With these modifications, the evolutionary model would be as follows.
First, our proposal generates an initial population and initializes the EP with the nondominated rules from the initial population. Then an offspring population is generated from the current population by selection, crossover and mutation. The next population is constructed from the current and offspring populations, the EP is updated with the current population and, when the number of new individuals in the next population is less than α%, the restarting process is applied. This process is iterated until a stopping condition is satisfied. The NSGA-II algorithm has two features which make it a high-performance MOEA. One is the fitness evaluation of each solution based on the Pareto ranking and a crowding measure, and the other is an elitist generation update procedure. Each solution in the current population is evaluated in the following manner. First, Rank 1 is assigned to all nondominated solutions in the current population. All solutions with Rank 1 are tentatively removed from the current population. Next, Rank 2 is assigned to all nondominated solutions in the reduced current population. All solutions with Rank 2 are tentatively removed from the reduced current population. This procedure is iterated until all solutions are tentatively removed from the current population (i.e., until ranks are assigned to all solutions). As a result, a rank is assigned to each solution. Solutions with smaller ranks are viewed as being better than those with larger ranks. Among solutions with the same rank, an additional criterion called a crowding measure is taken into account. The crowding measure for a solution calculates the distance between the adjacent solutions with the same rank in the objective space. Less crowded solutions with larger values of the crowding measure are viewed as being better than more crowded solutions with smaller values of the crowding measure.
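The ranking-plus-crowding evaluation described above can be sketched as follows, assuming all objectives are maximized as in this proposal; this is a minimal illustration, not the authors' implementation.

```python
# Iterative Pareto ranking and crowding distance (illustrative sketch;
# solutions are tuples of maximized objective values).

def dominates(a, b):
    """a dominates b: no worse in every objective, better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_ranks(points):
    """Assign Rank 1 to nondominated points, remove them, and repeat."""
    ranks, remaining, rank = {}, set(range(len(points))), 1
    while remaining:
        front = {i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)}
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks

def crowding_distance(front_points):
    """Per-objective distance between each solution's neighbours in one front."""
    n, m = len(front_points), len(front_points[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: front_points[i][k])
        span = front_points[order[-1]][k] - front_points[order[0]][k] or 1.0
        dist[order[0]] = dist[order[-1]] = float("inf")  # boundary solutions
        for pos in range(1, n - 1):
            dist[order[pos]] += (front_points[order[pos + 1]][k]
                                 - front_points[order[pos - 1]][k]) / span
    return dist

pts = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4)]
print(pareto_ranks(pts))  # (0.4, 0.4) is dominated by (0.5, 0.5)
```

Boundary solutions of each front receive an infinite crowding distance, so the extremes of the front are always preferred over interior, crowded solutions.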
A pair of parent solutions is selected from the current population by binary tournament selection based on the Pareto ranking and the crowding measure. When the next population is to be constructed, the current and offspring populations are combined into a merged population. Each solution in the merged population is evaluated in the same manner as in the selection phase of parent solutions, using the Pareto ranking and the crowding measure. The next population is constructed by choosing a specified number (i.e., the population size) of the best solutions from the merged population.

3.2. Coding scheme and initial gene pool

Each chromosome is a vector of genes that represent the attributes and intervals of the rule. We have used a positional encoding, where the i-th attribute is encoded in the i-th gene. To combine the condition selection with the learning of the intervals, each gene consists of three parts. The first part (ac) indicates whether and how the attribute is involved in the rule: when this part is -1, the attribute is not involved in the rule, and when it is 0 or 1 the attribute is part of the antecedent or consequent of the rule, respectively. All genes that have 0 in their first part will form the antecedent of the rule, while genes that have 1 will form the consequent of the rule.

The second part represents the lower bound (lb) of the interval of the attribute. The third part represents the upper bound (ub) of the interval of the attribute. Notice that lb and ub will be equal in the intervals of nominal attributes. Finally, a chromosome C_T is coded in the following way, where m is the number of attributes in the dataset:

Gene_i = (ac_i, lb_i, ub_i), i = 1, ..., m
C_T = Gene_1 Gene_2 ... Gene_m

Figure 1 shows the scheme of a chromosome.

Figure 1: Scheme of a chromosome

In order to prevent the intervals from growing until they span the whole domain, we define amplitude as the maximum size that the interval of a given attribute can attain. Thus, the amplitude of an attribute i is defined as:

Amplitude_i = (Max_i − Min_i) / δ    (9)

where Max_i and Min_i are the maximum and minimum values of the domain of attribute i, respectively, and δ is a value given by the system expert that determines the trade-off between the generalization and specificity of the rules. The initial population will consist of a rule set with good coverage of the dataset and with only one attribute in the consequent, since in this paper we only consider these kinds of rules (although this coding scheme allows us to deal with more than one attribute in the consequent). To generate the initial population, first we select at random the attributes that will be part of the antecedent and consequent of the rules (at least one attribute will be selected for each). Then we select at random an unmarked pattern from the dataset and generate the interval of each attribute with a size equal to 50% of the Amplitude_i of each attribute and with the values of the selected pattern in the center of each interval. If some bound of an interval exceeds the domain of the attribute, it is replaced by the corresponding bound of the domain.
Finally, the patterns covered by this rule are marked in the dataset. This process is repeated until the initial population is completed. Notice that, if all the patterns are marked and the initial population is not completed, then all the patterns will be unmarked again and the process will be repeated until the initial population is completed. For instance, let us consider a simple dataset with three attributes X_1, X_2 and X_3, six training patterns and δ = 2. Table 2 shows the six training patterns, the lower and upper bounds of the domain of each attribute and 50% of the Amplitude_i of each attribute.

Table 2: The six patterns in this example (rows ID1 to ID6 with their values for X_1, X_2 and X_3, together with the lower and upper bounds of each attribute's domain and 50% of its Amplitude_i)

Let us suppose that we select at random the pattern ID3 and the attributes X_1 and X_2 for

the antecedent and consequent of the rule, respectively. In this iteration, the rule X_1 ∈ [0.77, 1.0] → X_2 ∈ [10, 17] is generated, calculating its intervals as

lb_i = max{v_i − A_i/2, Min_i}    ub_i = min{v_i + A_i/2, Max_i}

where v_i is the value of pattern ID3 for attribute i and A_i is 50% of the Amplitude_i of attribute i, which gives lb_1 = 0.77, ub_1 = 1.0, lb_2 = 10, ub_2 = 17 and lb_3 = 3.2. The 50% of Amplitude_i is divided by 2 in order to set the value of the pattern ID3 in the center of the generated intervals. Notice that ub_1 is 1.0 because, when a bound of an interval exceeds the domain of the attribute, it is replaced by the upper/lower bound of the domain. Fig. 2 shows the whole chromosome generated in this example.

Figure 2: Chromosome obtained for the example

This rule covers the training patterns ID1 and ID3 in Table 2. In this situation, these patterns are marked in the dataset. The EP is initialized with the nondominated rules of the initial population.

3.3. Objectives

Three objectives are maximized for this problem: interestingness, comprehensibility and performance. Performance is the product of CF and support (see subsection 2.1), which allows us to obtain accurate rules with a good trade-off between local and general rules. Notice that we are interested only in very strong rules [10], which represent a positive dependence between items and avoid the support drawback. Thus, a rule X → Y must verify:

CF(X → Y) > 0
Support(X → Y) > minsup
Support(X → Y) ≤ (1 − minsup)

This measure can obtain values in the interval [0, 1]. A rule with a performance value near 1 may be more useful to the user. Interestingness measures how interesting the rule is, which allows us to extract only those rules that may be of interest to the users. In this case, we have used the interestingness measure lift [50] (see subsection 2.1). Finally, comprehensibility tries to quantify the ease with which the rule can be understood [24].
The generated rules may involve a large number of attributes, making them difficult to understand. If the generated rules are not understandable, the user will never use them. Here, the comprehensibility of a rule X → Y is measured by the number of attributes involved in the rule and is defined as:

Comprehensibility(X → Y) = 1 / Attr(X → Y)    (10)

where Attr(X → Y) is the number of attributes involved in the antecedent of the rule, since in this paper we only consider rules with one attribute in the consequent.
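The interval seeding of subsection 3.2 (Eq. 9) and the three objectives of this subsection can be sketched together as follows. The function names, the use of relative supports in [0, 1] and the reading of the very-strong-rule filter as Support ≤ 1 − minsup are our own assumptions, not the authors' code.

```python
# Interval seeding (Eq. 9) and rule evaluation (illustrative sketch).

def init_gene(ac, value, min_i, max_i, delta):
    """Interval of size 50% of Amplitude_i centred on a seed pattern value."""
    half = 0.5 * ((max_i - min_i) / delta) / 2  # Amplitude_i = (Max - Min)/delta
    return (ac, max(value - half, min_i), min(value + half, max_i))

def certainty_factor(conf, sup_y):
    # Eqs. 5-7.
    if conf > sup_y:
        return (conf - sup_y) / (1 - sup_y)
    if conf < sup_y:
        return (conf - sup_y) / sup_y
    return 0.0

def objectives(sup_x, sup_y, sup_xy, n_antecedent_attrs):
    """(performance, interestingness, comprehensibility) for one rule."""
    conf = sup_xy / sup_x
    performance = certainty_factor(conf, sup_y) * sup_xy  # CF * support
    interestingness = conf / sup_y                        # lift
    comprehensibility = 1.0 / n_antecedent_attrs          # Eq. 10
    return performance, interestingness, comprehensibility

def is_very_strong(sup_x, sup_y, sup_xy, minsup):
    """Our reading of the very-strong-rule filter (an assumption)."""
    conf = sup_xy / sup_x
    return certainty_factor(conf, sup_y) > 0 and minsup < sup_xy <= 1 - minsup

# Seeding near the top of a [0.0, 1.0] domain clips the upper bound,
# as in the worked example of subsection 3.2:
print(init_gene(0, 0.9, 0.0, 1.0, 2))  # (0, 0.775, 1.0)
print(objectives(0.5, 0.4, 0.4, 2))
```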

3.4. Genetic operators

The crossover operator generates two offspring by randomly interchanging the genes of the parents (exploration). Figure 3 shows a simple example of the performance of this operator.

Figure 3: A simple example of the crossover operator

The mutation operator randomly modifies the interval (lb and ub) and the ac part of a gene selected at random. It selects one of the bounds of the interval at random and randomly increases or decreases its value, taking particular care not to surpass the fixed value of the amplitude. In that sense, the interval is modified in a way similar to that used in the initialization process. The value for ac is randomly selected from the set {-1, 0, 1}.

3.5. Repairing operator

After the mutation operator is applied, if any rule has no antecedent or consequent, or has more than one attribute in the consequent, a repairing operator is used to modify it. If there is more than one attribute in the consequent, one of them is randomly selected as the consequent and the remaining attributes are passed to the antecedent, in order to maintain rules with only one attribute in the consequent. If there is no attribute in the antecedent and/or consequent, these are randomly selected from among the attributes not involved in the rule. Finally, the sizes of the intervals are decreased until the number of patterns covered would become smaller than the number of patterns covered by the original intervals, in order to obtain simpler rules.

3.6. Restarting process

To get away from local optima, this algorithm uses a restarting process. This process marks all the patterns covered by the rules in the EP and applies the initialization process to the population again (see subsection 3.2) in order to restart the population from the examples uncovered by the rules in the EP. Moreover, the EP is updated with the new population following the non-dominance criteria.
This restarting process is applied when the number of new individuals of the population in one generation is less than α% of the size of the current population.

3.7. Flowchart of the algorithm

According to the above description, the proposed algorithm for mining QARs can be summarized in the following flowchart.

Input: population size N, number of evaluations ntrials, probability of mutation Pmut, factor of amplitude for each attribute of the dataset δ, difference threshold α.
Output: EP

Step 1: Initialize.
(a) Generate the initial population (Pt) with N chromosomes.
(b) Evaluate the initial population.
(c) Generate all non-dominated fronts F = (F1, F2, ...) of the initial population and calculate the crowding distance in each Fi.
(d) Initialize the EP.

Step 2: Generate the offspring population (Qt) as follows.
(a) Select a pair of parent solutions by binary tournament selection based on the Pareto ranking and the crowding measure.

(b) Each pair is crossed, generating two offspring; this operator interchanges the genes of the parents. Next, the mutation and repairing operators are applied to the two offspring.
(c) Evaluate the new individuals.

Step 3: Generate the next population (Pt+1) as follows.
(a) Create the merged population from Pt and Qt.
(b) Generate all non-dominated fronts F = (F1, F2, ...) of the merged population and calculate the crowding distance in each Fi.
(c) Create Pt+1 with the best chromosomes from the merged population using the non-dominated fronts and crowding distance:
    Include the i-th non-dominated front in Pt+1.
    Check the next front for inclusion.
    Sort in descending order using the crowded-comparison operator.
    Choose the first (N − |Pt+1|) elements of Fi.

Step 4: Update the EP, following the non-dominance criteria.

Step 5: If the difference between the current population and the previous population is less than α%, restart the population.

Step 6: If the maximum number of evaluations is not reached, go to Step 2.

Step 7: Remove redundancy in the EP: delete the chromosomes that are subchromosomes of others. A subchromosome is one in which the intervals of all its genes are contained within the intervals of the genes of another chromosome.

Step 8: The EP is returned.

4. Experimental analysis

Several experiments have been carried out in this paper to analyze the performance of our proposal. In order to present them, this section is organized as follows:

1. In subsection 4.1, we describe the real-world datasets that are used in these experiments.
2. In subsection 4.2, we introduce a brief description of the algorithms considered for comparison and we show the configuration of the algorithms (determining all the parameters used).
3. In subsection 4.3, we show an analysis of the new components introduced to the NSGA-II [19] evolutionary model.
4.
In subsection 4.4, we compare the performance of our approach with four mono-objective evolutionary approaches (EARMGA [63], GAR [46], GENAR [45] and Alatasetal [4]) and three MOEAs (ARMMGA [49], MODENAR [5] and MOEA Ghosh [33]) for mining QARs.
5. In subsection 4.5, we compare our approach with two classical association rule extraction algorithms: Apriori [11, 56] and Eclat [64].
6. In subsection 4.6, we analyze the scalability of the proposed approach.
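Before turning to the experiments, Step 7 of the flowchart in subsection 3.7 (subchromosome-based redundancy removal in the EP) can be sketched as follows; the list-of-genes representation and function names are illustrative assumptions.

```python
# Redundancy removal in the EP: a chromosome is a subchromosome of another
# if, for every gene, the roles match and its interval is contained within
# the other's interval (illustrative sketch).

def is_subchromosome(a, b):
    """True if rule `a` is redundant with respect to rule `b`."""
    return all(ac_a == ac_b and lb_b <= lb_a and ub_a <= ub_b
               for (ac_a, lb_a, ub_a), (ac_b, lb_b, ub_b) in zip(a, b))

def remove_redundant(ep):
    """Keep only chromosomes that are not subchromosomes of any other."""
    return [r for i, r in enumerate(ep)
            if not any(i != j and is_subchromosome(r, other)
                       for j, other in enumerate(ep))]

ep = [[(0, 0.2, 0.4), (1, 10, 20)],   # contained in the rule below
      [(0, 0.1, 0.5), (1, 10, 25)],
      [(0, 0.6, 0.9), (1, 10, 20)]]
print(len(remove_redundant(ep)))  # 2
```

Note that, as written, two identical chromosomes would eliminate each other; a full implementation would keep one copy of each duplicate.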

Names                 Attributes (R/I/N)   Patterns
Balance Scale (bal)    4 (4/0/0)             625
Basketball (bask)      5 (3/2/0)              96
Bolts (bol)            8 (2/6/0)              40
House 16H (hh)        17 (10/7/0)
Solar Flare (fla)     11 (0/0/11)            1066
Pollution (pol)       16 (16/0/0)              60
Quake (qua)            4 (3/1/0)             2178
Stock Price (sto)     10 (10/0/0)             950
Stulong (stu)          5 (5/0/0)             1419

Table 3: Datasets considered for the experimental study

4.1. Datasets

In order to analyze the performance of the proposed approach, we have considered 9 real-world datasets. Table 3 summarizes the main characteristics of the 9 datasets, which are available in the KEEL-dataset repository [7], from which they can be downloaded, where Attributes (R/I/N) is the number of (Real/Integer/Nominal) attributes in the data and Patterns is the number of patterns. To develop the different experiments, we consider the average results of 5 runs for each dataset.

4.2. Association rule extraction algorithms considered for the comparison and parameters of the algorithms

In these experiments, we compare the proposed approach with nine other algorithms, which are available in the KEEL software tool [9]. A brief description of these algorithms follows.

1. Apriori [11, 56]: The main aim of Apriori is to exploit the search space by means of the downward closure property. The latter states that any subset of a frequent itemset must also be frequent. As a consequence, it generates candidates for the current iteration by means of the itemsets considered frequent at the previous iteration. Then it enumerates all the subsets for each transaction and increments the support of the candidates matching them. Those reaching the user-specified minsup are marked as frequent for the next iteration. This process is repeated until all the frequent itemsets have been found. Thus, Apriori follows a breadth-first strategy to generate candidates. Finally, Apriori uses the frequent itemsets to generate rules with a confidence greater than minconf.

2. Eclat [64]: Eclat employs a depth-first strategy.
It generates candidates by extending the prefixes of an itemset until an infrequent one is found, in which case it simply backtracks to the previous prefix and recursively applies the above procedure. Unlike Apriori, the support counting is achieved by adopting a vertical layout: for each item in the dataset, it first constructs the list of identifiers of the transactions containing that item (its tid-list). It then counts the support by intersecting two or more tid-lists and checking whether they have transactions in common; if this is the case, the support is equal to the size of the resulting set. The process for generating the rules is the same as in Apriori.

3. Evolutionary Association Rules Mining with Genetic Algorithm (EARMGA) [63]: This algorithm uses a GA to identify QARs and does not require a user-specified threshold for minsup. Each chromosome encodes a generalized k-rule, where k indicates the desired length. The most interesting rules are returned according to the interestingness measure defined by the fitness function, which is based on the support of the rule and the supports of its antecedent and consequent.

4. GENetic Association Rules (GENAR) [45]: This algorithm mines association rules in numeric datasets by using a GA. Each chromosome encodes an association rule, containing the maximum and minimum interval bounds of each numeric attribute. The length of the rules is always fixed to the number of attributes, and only the last attribute forms the consequent. The objective function considers the number of records covered by the rule and penalizes rules that cover records already covered by other rules in the dataset.

5. Genetic Association Rules (GAR) [46]: This algorithm is an extension of GENAR [45] that searches for frequent itemsets in numeric datasets without needing to discretize the attributes. Each chromosome is a k-itemset, where each gene represents the maximum and minimum values of an attribute belonging to the k-itemset. Since this algorithm only finds frequent itemsets, another procedure must be run afterwards in order to generate the association rules.

6. Genetic algorithm for automated mining of both positive and negative QARs (Alatasetal) [4]: This algorithm designs a GA to simultaneously search for intervals of quantitative attributes and to discover, in a single run, the positive and negative QARs that these intervals conform to. The chromosomes represent rules, in which each gene has four parts: the first indicates whether the attribute belongs to the antecedent or the consequent of the rule, the second whether the rule is a positive or a negative AR, and the third and fourth represent the lower and upper bounds of the item interval, respectively. The proposed GA follows a dataset-independent approach that does not rely upon the minsup and minconf thresholds.

7. Multi-objective association rules with genetic algorithms (ARMMGA) [49]: This algorithm is an MOEA based on the EARMGA algorithm that mines QARs without taking minsup and minconf into account. According to its authors, the most important aspect of this algorithm is that its fitness function only specifies the order of the chromosomes in the population and has no other effect on the GA operators; this order is used as a selection criterion. The algorithm returns the rules found with the best correlation between support and confidence.

8. Multi-objective differential evolution algorithm for mining numeric association rules (MODENAR) [5]: This algorithm uses a multi-objective differential evolution algorithm based on Alatasetal [4] to mine accurate and comprehensible QARs without specifying minsup and minconf.
It uses the same coding scheme for the chromosomes as Alatasetal, but without the second part. MODENAR weights four objectives that assess the quality of the rules: support, confidence, comprehensibility and the amplitude of the intervals of the attributes.

9. Multi-objective rule mining using genetic algorithms (MOEA Ghosh) [33]: This algorithm uses a Pareto-based GA to extract useful and interesting rules from any dataset. Each chromosome represents an association rule, in which the bits associated with each attribute indicate whether the attribute belongs to the antecedent or the consequent, whether the attribute is present or absent, and the relational operators involved with the attribute. It uses three measures, comprehensibility, interestingness and predictive accuracy, to solve the multi-objective rule mining problem. A separate population is maintained, containing the chromosomes that are non-dominated amongst both the current population and the non-dominated solutions of the previous generation.

The parameters of the analyzed algorithms are presented in Table 4. For our proposal, we have tried to facilitate comparisons by selecting standard common parameters that work well in most cases instead of searching for very specific values; in particular, we have set δ to 2.0 for all datasets rather than tuning it for each one. The parameters of the remaining algorithms were selected according to the recommendations of the corresponding authors of each proposal, which are the default parameter settings included in the KEEL software tool [9]. Notice that the length of the rules for EARMGA and ARMMGA is higher in the dataset House 16H, because the number of attributes and transactions is higher in this problem.
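The two support-counting strategies described above for Apriori and Eclat can be contrasted in a few lines. The sketch below, on a toy transaction set (not one of the benchmark datasets), counts the support of an itemset horizontally by scanning transactions and vertically by intersecting tid-lists, and derives a rule's confidence from the two supports:

```python
# Toy transaction database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"a", "d"},
    {"b", "c"},
]

def support_horizontal(itemset, transactions):
    """Apriori-style counting: scan every transaction and count matches."""
    return sum(1 for t in transactions if itemset <= t)

def tid_list(item, transactions):
    """Vertical layout: the set of ids of the transactions containing the item."""
    return {tid for tid, t in enumerate(transactions) if item in t}

def support_vertical(itemset, transactions):
    """Eclat-style counting: intersect the tid-lists of the items."""
    tids = None
    for item in itemset:
        cur = tid_list(item, transactions)
        tids = cur if tids is None else tids & cur
    return len(tids)

# Both layouts yield the same support count.
itemset = {"a", "c"}
assert support_horizontal(itemset, transactions) == support_vertical(itemset, transactions)

# Confidence of the rule {a} -> {c}: sup({a, c}) / sup({a}) = 2/3.
conf = support_vertical({"a", "c"}, transactions) / support_vertical({"a"}, transactions)
print(conf)
```

Either counting scheme feeds the same rule-generation step: a rule is kept when its confidence exceeds minconf.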
Finally, for all the experiments conducted in this study, the results shown in the tables for the multi-objective algorithms always refer to non-dominated association rules.

4.3. Analysis of the new components introduced to the evolutionary multi-objective algorithm

In this section we study the performance of our proposal against the classical NSGA-II approach (QAR-CIP-NSGA-II Classic) in order to analyze the performance of the new components introduced to the NSGA-II evolutionary model: the EP and the restarting process. The results obtained by the analyzed algorithms are shown in Table 5, where #R is the number of association rules generated; Av_Sup and Av_Conf are, respectively, the average support and the average confidence of the rules; Av_Lift, Av_Conv, Av_CF and Av_NetConf are the average values of the measures lift, conviction, CF and netconf of the rules; Av_Amp is the average length of the rules in terms of the attributes involved; Av_HyperVol is the average value of the hypervolume measure [71]; and %Tran is the percentage of the transactions in the dataset covered by the rules. The hypervolume values have been calculated using the R package emoa¹ [47], which implements the algorithm proposed by Fonseca et al. in [27]. The values shown in the table represent the maximum value for these measures.

Algorithm         Parameters
Apriori           minsup = 0.1, minconf = 0.8
Eclat             minsup = 0.1, minconf = 0.8
Alatasetal        N_eval = 50000, nInitialRandomChromo = 12, r = 3, TournamentSize = 10, P_sel = 0.25, P_cro = 0.7, P_mut_min = 0.05, P_mut_max = 0.9, W_sup = 5, W_conf = 20, W_amplRule = 0.05, W_amplInterv = 0.02, W_covered = 0.01
EARMGA            PopSize = 100, N_eval = 50000, k = 2 (3 with HH), P_sel = 0.75, P_cro = 0.7, P_mut = 0.1, α = 0.01
GAR               PopSize = 100, nItemset = 100, N_eval = 50000, P_sel = 0.25, P_cro = 0.7, P_mut = 0.1, ω = 0.4, Ψ = 0.7, µ = 0.5, minsup = 0.1, minconf = 0.8
GENAR             PopSize = 100, N_eval = 50000, P_sel = 0.25, P_cro = 0.7, P_mut = 0.1, nRules = 30, FP = 0.7, AF = 0.2
ARMMGA            PopSize = 100, N_eval = 50000, k = 2 (3 with HH), P_sel = 0.95, P_cro = 0.85, P_mut = 0.01, db = 0.01
MODENAR           PopSize = 100, N_eval = 50000, Threshold = 60, CR = 0.3, W_sup = 0.8, W_conf = 0.2, W_comp = 0.1, W_amplInterv = 0.4
MOEA Ghosh        PopSize = 100, N_eval = 50000, PointCrossover = 2, P_cro = 0.8, P_mut = 0.02
QAR-CIP-NSGA-II   PopSize = 100, N_eval = 50000, P_mut = 0.1, δ = 2, α = 5

Table 4: Parameters considered for the comparison
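In the study, hypervolume is computed with the R package emoa. As a concrete illustration of what the indicator measures, the two-objective minimization case reduces to a simple sweep over the front sorted by the first objective; the sketch below uses an invented toy front and reference point, and is not the n-dimensional algorithm of Fonseca et al. applied in the paper:

```python
def hypervolume_2d(front, ref):
    """Hypervolume of a 2-objective minimization front with respect to a
    reference point ref: the area dominated by the front and bounded by ref.
    Assumes every point lies within the box defined by ref."""
    hv = 0.0
    prev_f2 = ref[1]
    # Sweep in increasing f1; dominated points contribute no new area.
    for f1, f2 in sorted(front):
        if f2 < prev_f2:
            hv += (ref[0] - f1) * (prev_f2 - f2)  # new rectangular strip
            prev_f2 = f2
    return hv

# Toy front of three non-dominated points, reference point (5, 5).
front = [(1.0, 4.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(5.0, 5.0)))  # 12.0
```

A front whose points push further toward the ideal corner dominates a larger area, which is why the higher hypervolume values reported for QAR-CIP-NSGA-II indicate a better non-dominated set.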
We can present the following conclusions based on an analysis of the results presented in Table 5:

- The EP allows us to obtain a greater number of non-dominated rules of the Pareto front, since the number of rules is not limited by the size of the current population, with each rule providing interesting knowledge about the dataset.
- The restarting process together with the EP allows us to perform a good exploration of the search space, improving the coverage of the datasets.
- QAR-CIP-NSGA-II presents higher hypervolume values than the classic approach, obtaining a greater non-dominated area. Notice that these values are very high because the range of the measure lift is not bounded.
- The rules obtained by our proposal present improvements in almost all the interestingness measures and a similar or higher coverage in all the datasets, showing a positive synergy between the new components.

In order to assess whether significant differences exist among the results, we adopt statistical analysis [30, 29, 31] and, in particular, nonparametric tests, according to the recommendations made in [21]. To do this, we have considered 22 real-world datasets. The main characteristics of these datasets and the results obtained by the analyzed algorithms are presented in Appendix A. We decided to apply the statistical tests to the average results obtained for the interestingness measures lift, CF and netconf. Notice that we have not used the conviction measure because the algorithms obtain infinity in most of the datasets. In order to compare the two algorithms we use Wilcoxon's Signed-Ranks test [53, 62]. Wilcoxon's test is based on computing the differences between two sample means (typically, mean test errors obtained by a pair of different algorithms on different datasets). In the classification framework these differences are well defined since the errors are in the same domain.
In our case, in order to have well-defined differences in the interestingness measures used, we transform the mean value obtained for each measure into a normalized score MeanS, defined for each measure as follows.

¹ The emoa package is available from the Comprehensive R Archive Network (CRAN).

[Table 5: Results for all datasets in the comparison with classical NSGA-II. For each of the nine datasets it reports #R, Av_Sup, Av_Conf, Av_Lift, Av_Conv, Av_CF, Av_NetConf, Av_Amp, Av_HyperVol and %Tran for QAR-CIP-NSGA-II Classic and for QAR-CIP-NSGA-II; the numeric entries are omitted here.]

For the measures CF and netconf:

    MeanS = meanvalue²    if meanvalue < 0
    MeanS = meanvalue     otherwise

For the measure lift:

    MeanS = 1 - 1/meanvalue     if meanvalue > 1
    MeanS = (1 - meanvalue)²    if 0 <= meanvalue <= 1

where meanvalue represents the mean value obtained for each measure in a dataset. MeanS takes values in [0, 1], where the worst value (0) corresponds to the independence value of each measure (see subsection 2.1), since a rule at independence does not provide new knowledge to the user.

Table 6 shows the results of Wilcoxon's test for the three measures. The hypothesis of equality has been rejected in all cases with a very small p-value, and our proposal has achieved the highest rankings.

[Table 6: Results of Wilcoxon's Test (α = 0.05) in the comparison with classical NSGA-II. For each of the measures CF, netconf and lift, the comparison QAR-CIP-NSGA-II vs. QAR-CIP-NSGA-II Classic rejects the equality hypothesis; the R+, R- and p-value entries are omitted here.]
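Wilcoxon's Signed-Ranks test ranks the absolute per-dataset differences between the two paired samples (dropping zero differences and averaging ranks on ties) and sums the ranks of the positive and negative differences into R+ and R-. A minimal sketch of that ranking step, using invented MeanS values for five hypothetical datasets rather than the paper's results:

```python
def wilcoxon_ranks(x, y):
    """R+ and R- of Wilcoxon's signed-ranks test for two paired samples.
    Zero differences are dropped; tied absolute differences share the
    average of the ranks they span."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1                      # extend the group of tied |differences|
        avg = (i + j) / 2 + 1           # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus

# Hypothetical MeanS values of two algorithms on five datasets
# (values chosen so the differences are exact in binary floating point).
a = [0.75, 0.875, 0.5, 1.0, 0.25]
b = [0.5, 0.75, 0.625, 0.25, 0.25]
print(wilcoxon_ranks(a, b))  # (8.5, 1.5)
```

A large imbalance between R+ and R-, as reported for our proposal, is what drives the rejection of the equality hypothesis at a small p-value.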

Figure 4: Pareto fronts obtained at different times of the evolutionary process in two datasets

Fig. 4 shows the Pareto fronts obtained by our proposal when the restarting process is applied at different times of the evolutionary process in two datasets (the solutions of a single trial). In this figure, we plot the solutions from QAR-CIP-NSGA-II in 3-D, together with their projections on all the possible objective planes. We have modified the objectives so that they can all be plotted as minimization objectives. In order to retain all the information, the dominated solutions that arise from the projections have not been removed. We can see how the Pareto front improves with successive applications of the restarting process. Moreover, it can easily be seen from these figures how the EP and the restarting process allow us to increase the number of non-dominated solutions after each restart.

4.4. Comparison with other mono-objective and multi-objective evolutionary approaches

This section analyzes the performance of our algorithm in comparison with four mono-objective algorithms (EARMGA [63], GAR [46], GENAR [45], Alatasetal [4]) and three MOEAs for mining QARs (ARMMGA [49], MODENAR [5], MOEA Ghosh [33]). The results obtained by the analyzed algorithms are shown in Tables 7 and 8 (this kind of table was described in subsection 4.3). From the analysis of the results presented in these tables, we can highlight the following facts:

- The values obtained by our proposal for the measures lift, CF and netconf are better than the values obtained by the analyzed algorithms in all the datasets, with values close to the best possible value that these measures can achieve, allowing us to obtain an interesting set of association rules.
- The proposed algorithm presents a good trade-off between all the measures that have been analyzed.
Moreover, the rule sets obtained have a low number of attributes, which makes them easier to understand from a user's perspective, together with a high coverage of the dataset.
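The non-dominated filtering that underlies both the EP and the reported rule sets reduces to a Pareto-dominance test over the objective vectors of the rules. A minimal sketch for three maximization objectives, with invented objective vectors for four hypothetical rules:

```python
def dominates(u, v):
    """u dominates v (maximization): u is no worse in every objective
    and strictly better in at least one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def non_dominated(points):
    """Return the points of the list not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Invented (comprehensibility, interestingness, performance) vectors
# for four candidate rules.
rules = [(0.9, 0.2, 0.5), (0.6, 0.6, 0.6), (0.5, 0.5, 0.5), (0.1, 0.9, 0.4)]
print(non_dominated(rules))  # (0.5, 0.5, 0.5) is dominated by (0.6, 0.6, 0.6)
```

In the actual algorithm this filter is applied incrementally as new rules enter the external population, so the EP always holds the current Pareto front without being capped at the population size.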

[Table 7: Results for the datasets Balance, Basketball, Bolts, Flare and House16H. For each dataset it reports #R, Av_Sup, Av_Conf, Av_Lift, Av_Conv, Av_CF, Av_NetConf, Av_Amp and %Tran for EARMGA, GAR, GENAR, Alatasetal, ARMMGA, MODENAR, MOEA Ghosh and QAR-CIP-NSGA-II; the numeric entries are omitted here.]

[Table 8: Results for the datasets Pollution, Quake, Stock and Stulong. For each dataset it reports #R, Av_Sup, Av_Conf, Av_Lift, Av_Conv, Av_CF, Av_NetConf, Av_Amp and %Tran for EARMGA, GAR, GENAR, Alatasetal, ARMMGA, MODENAR, MOEA Ghosh and QAR-CIP-NSGA-II; the numeric entries are omitted here.]


More information

CHAPTER V ADAPTIVE ASSOCIATION RULE MINING ALGORITHM. Please purchase PDF Split-Merge on to remove this watermark.

CHAPTER V ADAPTIVE ASSOCIATION RULE MINING ALGORITHM. Please purchase PDF Split-Merge on   to remove this watermark. 119 CHAPTER V ADAPTIVE ASSOCIATION RULE MINING ALGORITHM 120 CHAPTER V ADAPTIVE ASSOCIATION RULE MINING ALGORITHM 5.1. INTRODUCTION Association rule mining, one of the most important and well researched

More information

METAHEURISTICS Genetic Algorithm

METAHEURISTICS Genetic Algorithm METAHEURISTICS Genetic Algorithm Jacques A. Ferland Department of Informatique and Recherche Opérationnelle Université de Montréal ferland@iro.umontreal.ca Genetic Algorithm (GA) Population based algorithm

More information

Data Mining Concepts

Data Mining Concepts Data Mining Concepts Outline Data Mining Data Warehousing Knowledge Discovery in Databases (KDD) Goals of Data Mining and Knowledge Discovery Association Rules Additional Data Mining Algorithms Sequential

More information

Fast Efficient Clustering Algorithm for Balanced Data

Fast Efficient Clustering Algorithm for Balanced Data Vol. 5, No. 6, 214 Fast Efficient Clustering Algorithm for Balanced Data Adel A. Sewisy Faculty of Computer and Information, Assiut University M. H. Marghny Faculty of Computer and Information, Assiut

More information

minimizing minimizing

minimizing minimizing The Pareto Envelope-based Selection Algorithm for Multiobjective Optimization David W. Corne, Joshua D. Knowles, Martin J. Oates School of Computer Science, Cybernetics and Electronic Engineering University

More information

Genetic Algorithms Variations and Implementation Issues

Genetic Algorithms Variations and Implementation Issues Genetic Algorithms Variations and Implementation Issues CS 431 Advanced Topics in AI Classic Genetic Algorithms GAs as proposed by Holland had the following properties: Randomly generated population Binary

More information

CMPUT 391 Database Management Systems. Data Mining. Textbook: Chapter (without 17.10)

CMPUT 391 Database Management Systems. Data Mining. Textbook: Chapter (without 17.10) CMPUT 391 Database Management Systems Data Mining Textbook: Chapter 17.7-17.11 (without 17.10) University of Alberta 1 Overview Motivation KDD and Data Mining Association Rules Clustering Classification

More information

Chapter 4: Mining Frequent Patterns, Associations and Correlations

Chapter 4: Mining Frequent Patterns, Associations and Correlations Chapter 4: Mining Frequent Patterns, Associations and Correlations 4.1 Basic Concepts 4.2 Frequent Itemset Mining Methods 4.3 Which Patterns Are Interesting? Pattern Evaluation Methods 4.4 Summary Frequent

More information

Multi-Objective Pipe Smoothing Genetic Algorithm For Water Distribution Network Design

Multi-Objective Pipe Smoothing Genetic Algorithm For Water Distribution Network Design City University of New York (CUNY) CUNY Academic Works International Conference on Hydroinformatics 8-1-2014 Multi-Objective Pipe Smoothing Genetic Algorithm For Water Distribution Network Design Matthew

More information

A Parallel Evolutionary Algorithm for Discovery of Decision Rules

A Parallel Evolutionary Algorithm for Discovery of Decision Rules A Parallel Evolutionary Algorithm for Discovery of Decision Rules Wojciech Kwedlo Faculty of Computer Science Technical University of Bia lystok Wiejska 45a, 15-351 Bia lystok, Poland wkwedlo@ii.pb.bialystok.pl

More information

Chapter 4: Association analysis:

Chapter 4: Association analysis: Chapter 4: Association analysis: 4.1 Introduction: Many business enterprises accumulate large quantities of data from their day-to-day operations, huge amounts of customer purchase data are collected daily

More information

Multi-Objective Optimization using Evolutionary Algorithms

Multi-Objective Optimization using Evolutionary Algorithms Multi-Objective Optimization using Evolutionary Algorithms Kalyanmoy Deb Department of Mechanical Engineering, Indian Institute of Technology, Kanpur, India JOHN WILEY & SONS, LTD Chichester New York Weinheim

More information

2 CONTENTS

2 CONTENTS Contents 5 Mining Frequent Patterns, Associations, and Correlations 3 5.1 Basic Concepts and a Road Map..................................... 3 5.1.1 Market Basket Analysis: A Motivating Example........................

More information

Analysis of Measures of Quantitative Association Rules

Analysis of Measures of Quantitative Association Rules Analysis of Measures of Quantitative Association Rules M. Martínez-Ballesteros and J.C. Riquelme Department of Computer Science, University of Seville, Spain {mariamartinez,riquelme}@us.es Abstract. This

More information

Improved S-CDAS using Crossover Controlling the Number of Crossed Genes for Many-objective Optimization

Improved S-CDAS using Crossover Controlling the Number of Crossed Genes for Many-objective Optimization Improved S-CDAS using Crossover Controlling the Number of Crossed Genes for Many-objective Optimization Hiroyuki Sato Faculty of Informatics and Engineering, The University of Electro-Communications -5-

More information

The Genetic Algorithm for finding the maxima of single-variable functions

The Genetic Algorithm for finding the maxima of single-variable functions Research Inventy: International Journal Of Engineering And Science Vol.4, Issue 3(March 2014), PP 46-54 Issn (e): 2278-4721, Issn (p):2319-6483, www.researchinventy.com The Genetic Algorithm for finding

More information

MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS FOR ENERGY-EFFICIENCY IN HETEROGENEOUS WIRELESS SENSOR NETWORKS

MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS FOR ENERGY-EFFICIENCY IN HETEROGENEOUS WIRELESS SENSOR NETWORKS MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS FOR ENERGY-EFFICIENCY IN HETEROGENEOUS WIRELESS SENSOR NETWORKS José M. Lanza-Gutiérrez, Juan A. Gómez-Pulido, Miguel A. Vega- Rodríguez, Juan M. Sánchez University

More information

INTERACTIVE MULTI-OBJECTIVE GENETIC ALGORITHMS FOR THE BUS DRIVER SCHEDULING PROBLEM

INTERACTIVE MULTI-OBJECTIVE GENETIC ALGORITHMS FOR THE BUS DRIVER SCHEDULING PROBLEM Advanced OR and AI Methods in Transportation INTERACTIVE MULTI-OBJECTIVE GENETIC ALGORITHMS FOR THE BUS DRIVER SCHEDULING PROBLEM Jorge PINHO DE SOUSA 1, Teresa GALVÃO DIAS 1, João FALCÃO E CUNHA 1 Abstract.

More information

Approximation-Guided Evolutionary Multi-Objective Optimization

Approximation-Guided Evolutionary Multi-Objective Optimization Approximation-Guided Evolutionary Multi-Objective Optimization Karl Bringmann 1, Tobias Friedrich 1, Frank Neumann 2, Markus Wagner 2 1 Max-Planck-Institut für Informatik, Campus E1.4, 66123 Saarbrücken,

More information

Association Rules. Berlin Chen References:

Association Rules. Berlin Chen References: Association Rules Berlin Chen 2005 References: 1. Data Mining: Concepts, Models, Methods and Algorithms, Chapter 8 2. Data Mining: Concepts and Techniques, Chapter 6 Association Rules: Basic Concepts A

More information

REAL-CODED GENETIC ALGORITHMS CONSTRAINED OPTIMIZATION. Nedim TUTKUN

REAL-CODED GENETIC ALGORITHMS CONSTRAINED OPTIMIZATION. Nedim TUTKUN REAL-CODED GENETIC ALGORITHMS CONSTRAINED OPTIMIZATION Nedim TUTKUN nedimtutkun@gmail.com Outlines Unconstrained Optimization Ackley s Function GA Approach for Ackley s Function Nonlinear Programming Penalty

More information

The k-means Algorithm and Genetic Algorithm

The k-means Algorithm and Genetic Algorithm The k-means Algorithm and Genetic Algorithm k-means algorithm Genetic algorithm Rough set approach Fuzzy set approaches Chapter 8 2 The K-Means Algorithm The K-Means algorithm is a simple yet effective

More information

Genetic Algorithms. Kang Zheng Karl Schober

Genetic Algorithms. Kang Zheng Karl Schober Genetic Algorithms Kang Zheng Karl Schober Genetic algorithm What is Genetic algorithm? A genetic algorithm (or GA) is a search technique used in computing to find true or approximate solutions to optimization

More information

Data mining, 4 cu Lecture 6:

Data mining, 4 cu Lecture 6: 582364 Data mining, 4 cu Lecture 6: Quantitative association rules Multi-level association rules Spring 2010 Lecturer: Juho Rousu Teaching assistant: Taru Itäpelto Data mining, Spring 2010 (Slides adapted

More information

Multi-Objective Optimization using Evolutionary Algorithms

Multi-Objective Optimization using Evolutionary Algorithms Multi-Objective Optimization using Evolutionary Algorithms Kalyanmoy Deb Department ofmechanical Engineering, Indian Institute of Technology, Kanpur, India JOHN WILEY & SONS, LTD Chichester New York Weinheim

More information

Association Rule Mining: FP-Growth

Association Rule Mining: FP-Growth Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong We have already learned the Apriori algorithm for association rule mining. In this lecture, we will discuss a faster

More information

Chapter 2 Some Single- and Multiobjective Optimization Techniques 2.1 Introduction

Chapter 2 Some Single- and Multiobjective Optimization Techniques 2.1 Introduction Chapter 2 Some Single- and Multiobjective Optimization Techniques 2.1 Introduction Optimization deals with the study of those kinds of problems in which one has to minimize or maximize one or more objectives

More information

Assessing the Convergence Properties of NSGA-II for Direct Crashworthiness Optimization

Assessing the Convergence Properties of NSGA-II for Direct Crashworthiness Optimization 10 th International LS-DYNA Users Conference Opitmization (1) Assessing the Convergence Properties of NSGA-II for Direct Crashworthiness Optimization Guangye Li 1, Tushar Goel 2, Nielen Stander 2 1 IBM

More information

Mechanical Component Design for Multiple Objectives Using Elitist Non-Dominated Sorting GA

Mechanical Component Design for Multiple Objectives Using Elitist Non-Dominated Sorting GA Mechanical Component Design for Multiple Objectives Using Elitist Non-Dominated Sorting GA Kalyanmoy Deb, Amrit Pratap, and Subrajyoti Moitra Kanpur Genetic Algorithms Laboratory (KanGAL) Indian Institute

More information

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra Pattern Recall Analysis of the Hopfield Neural Network with a Genetic Algorithm Susmita Mohapatra Department of Computer Science, Utkal University, India Abstract: This paper is focused on the implementation

More information

Chapter 6: Basic Concepts: Association Rules. Basic Concepts: Frequent Patterns. (absolute) support, or, support. (relative) support, s, is the

Chapter 6: Basic Concepts: Association Rules. Basic Concepts: Frequent Patterns. (absolute) support, or, support. (relative) support, s, is the Chapter 6: What Is Frequent ent Pattern Analysis? Frequent pattern: a pattern (a set of items, subsequences, substructures, etc) that occurs frequently in a data set frequent itemsets and association rule

More information

Research Article Path Planning Using a Hybrid Evolutionary Algorithm Based on Tree Structure Encoding

Research Article Path Planning Using a Hybrid Evolutionary Algorithm Based on Tree Structure Encoding e Scientific World Journal, Article ID 746260, 8 pages http://dx.doi.org/10.1155/2014/746260 Research Article Path Planning Using a Hybrid Evolutionary Algorithm Based on Tree Structure Encoding Ming-Yi

More information

INFREQUENT WEIGHTED ITEM SET MINING USING NODE SET BASED ALGORITHM

INFREQUENT WEIGHTED ITEM SET MINING USING NODE SET BASED ALGORITHM INFREQUENT WEIGHTED ITEM SET MINING USING NODE SET BASED ALGORITHM G.Amlu #1 S.Chandralekha #2 and PraveenKumar *1 # B.Tech, Information Technology, Anand Institute of Higher Technology, Chennai, India

More information

Pseudo-code for typical EA

Pseudo-code for typical EA Extra Slides for lectures 1-3: Introduction to Evolutionary algorithms etc. The things in slides were more or less presented during the lectures, combined by TM from: A.E. Eiben and J.E. Smith, Introduction

More information

Data Mining Part 3. Associations Rules

Data Mining Part 3. Associations Rules Data Mining Part 3. Associations Rules 3.2 Efficient Frequent Itemset Mining Methods Fall 2009 Instructor: Dr. Masoud Yaghini Outline Apriori Algorithm Generating Association Rules from Frequent Itemsets

More information

AN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE

AN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE AN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE Vandit Agarwal 1, Mandhani Kushal 2 and Preetham Kumar 3

More information

Chapter 28. Outline. Definitions of Data Mining. Data Mining Concepts

Chapter 28. Outline. Definitions of Data Mining. Data Mining Concepts Chapter 28 Data Mining Concepts Outline Data Mining Data Warehousing Knowledge Discovery in Databases (KDD) Goals of Data Mining and Knowledge Discovery Association Rules Additional Data Mining Algorithms

More information

THIS PAPER proposes a hybrid decoding to apply with

THIS PAPER proposes a hybrid decoding to apply with Proceedings of the 01 Federated Conference on Computer Science and Information Systems pp. 9 0 Biased Random Key Genetic Algorithm with Hybrid Decoding for Multi-objective Optimization Panwadee Tangpattanakul

More information

CS348 FS Solving NP-Complete Light Up Puzzle

CS348 FS Solving NP-Complete Light Up Puzzle CS348 FS2013 - Solving NP-Complete Light Up Puzzle Daniel Tauritz, Ph.D. October 7, 2013 Synopsis The goal of this assignment set is for you to become familiarized with (I) representing problems in mathematically

More information

Multiobjective Optimization

Multiobjective Optimization Multiobjective Optimization Concepts, Algorithms and Performance Measures Joshua Knowles School of Computer Science The University of Manchester COMP60342 - Week 5 2.15, 2 May 2014 Introducing Multiobjective

More information

Classification of Concept-Drifting Data Streams using Optimized Genetic Algorithm

Classification of Concept-Drifting Data Streams using Optimized Genetic Algorithm Classification of Concept-Drifting Data Streams using Optimized Genetic Algorithm E. Padmalatha Asst.prof CBIT C.R.K. Reddy, PhD Professor CBIT B. Padmaja Rani, PhD Professor JNTUH ABSTRACT Data Stream

More information

Discovering Knowledge Rules with Multi-Objective Evolutionary Computing

Discovering Knowledge Rules with Multi-Objective Evolutionary Computing 2010 Ninth International Conference on Machine Learning and Applications Discovering Knowledge Rules with Multi-Objective Evolutionary Computing Rafael Giusti, Gustavo E. A. P. A. Batista Instituto de

More information

Multiobjective Optimization Using Adaptive Pareto Archived Evolution Strategy

Multiobjective Optimization Using Adaptive Pareto Archived Evolution Strategy Multiobjective Optimization Using Adaptive Pareto Archived Evolution Strategy Mihai Oltean Babeş-Bolyai University Department of Computer Science Kogalniceanu 1, Cluj-Napoca, 3400, Romania moltean@cs.ubbcluj.ro

More information

Comparison of FP tree and Apriori Algorithm

Comparison of FP tree and Apriori Algorithm International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 10, Issue 6 (June 2014), PP.78-82 Comparison of FP tree and Apriori Algorithm Prashasti

More information

Evolutionary Algorithm for Embedded System Topology Optimization. Supervisor: Prof. Dr. Martin Radetzki Author: Haowei Wang

Evolutionary Algorithm for Embedded System Topology Optimization. Supervisor: Prof. Dr. Martin Radetzki Author: Haowei Wang Evolutionary Algorithm for Embedded System Topology Optimization Supervisor: Prof. Dr. Martin Radetzki Author: Haowei Wang Agenda Introduction to the problem Principle of evolutionary algorithm Model specification

More information

DEMO: Differential Evolution for Multiobjective Optimization

DEMO: Differential Evolution for Multiobjective Optimization DEMO: Differential Evolution for Multiobjective Optimization Tea Robič and Bogdan Filipič Department of Intelligent Systems, Jožef Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia tea.robic@ijs.si

More information

An Interactive Evolutionary Multi-Objective Optimization Method Based on Progressively Approximated Value Functions

An Interactive Evolutionary Multi-Objective Optimization Method Based on Progressively Approximated Value Functions An Interactive Evolutionary Multi-Objective Optimization Method Based on Progressively Approximated Value Functions Kalyanmoy Deb, Ankur Sinha, Pekka Korhonen, and Jyrki Wallenius KanGAL Report Number

More information