Evolving SQL Queries for Data Mining

Majid Salim and Xin Yao
School of Computer Science, The University of Birmingham
Edgbaston, Birmingham B15 2TT, UK

H. Yin et al. (Eds.): IDEAL 2002, LNCS 2412, © Springer-Verlag Berlin Heidelberg 2002

Abstract. This paper presents a methodology for applying the principles of evolutionary computation to knowledge discovery in databases by evolving SQL queries that describe datasets. In our system, the fittest queries are rewarded by giving their attributes a higher probability of surviving in subsequent queries. The advantages of using SQL queries include their readability for non-experts and ease of integration with existing databases. The evolutionary algorithm (EA) used in our system is very different from existing EAs, but seems to be effective and efficient according to the experiments to date with three different testing data sets.

1 Introduction

Data mining studies the identification and extraction of useful knowledge from large amounts of data [5]. There are a number of different fields of inquiry within data mining, of which classification is particularly popular. Machine learning algorithms that can learn to classify data correctly can be applied to a wide variety of problem domains, including credit card fraud detection and medical diagnostics [1,2,3]. An important aspect of such algorithms is ensuring that they are easy to comprehend, so that machine-discovered knowledge can be transferred to people easily [4].

This paper presents a framework for discovering classification knowledge hidden in a database through evolutionary computation techniques, as applied to SQL queries. The task is related to but different from the conventional classification problem. Instead of trying to learn a classifier for predicting an unseen example, we are most interested in discovering the underlying knowledge and concept that best describes a given set of data from a large database.

SQL is a standardised data manipulation language that is widely supported by database vendors. Constructing a data mining framework using SQL is therefore very useful, as it would inherit SQL's portability and readability.

Ryu and Eick [7] proposed a genetic programming (GP) based approach to deriving queries from examples. However, there are two major differences between the work presented here and theirs. First, the query languages used are different and, as a result, the chromosome representations are different. Our use of SQL has made the whole system much simpler and more portable. Second, the evolutionary algorithms used are different. While Ryu and Eick [7] used GP, we have developed a much simpler algorithm which does not use any conventional crossover and mutation operators. Instead, the idea of self-adaptation at the gene level is exploited. Our initial experimental studies have shown that such a simple scheme is very easy to implement, yet very effective and efficient.

The rest of this paper is structured as follows. Section 2 describes the architecture of the proposed framework, justifying the design decisions made and explaining the benefits and drawbacks that were perceived in the process. Section 3 presents initial results obtained with the framework, and Section 4 concludes the paper with a brief discussion of planned future work.

2 Evolving SQL Queries

It was necessary to find a way of representing SQL queries genotypically, to allow for the application of evolutionary search operators. Another issue was the design of a fitness function to apply evolutionary pressure to the queries, to guide them towards the correct classification rules.

Genotypes were required to encode the list of conditional constraints that specify the criteria by which records should be selected. Each conditional constraint in SQL follows the structure [attribute name] [logical operator] [value]. This sequence was chosen as the basic unit of information, or gene, from which genotypes would be constructed. Genotypic representations varied randomly in length.

2.1 Evolutionary Search

The algorithm that was implemented is described in this section. First, 100 genotypes were constructed by randomly selecting attribute names, logical operators and values. Each attribute in the dataset began with a 0.5 probability of being included in any given genotype. Genotypes were then translated into SQL by initialising a String with the value SELECT * FROM [tablename] WHERE, and then appending each gene in the genotype to the end of the String.
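The construction and translation steps described above can be sketched in Python. This is a minimal illustration and not the authors' implementation: the function names `random_genotype` and `to_sql`, the `SCHEMA` dictionary, the operator set, and the handling of an empty genotype are all assumptions made for the example.

```python
import random

# Hypothetical schema: attribute name -> values it may take. This is an
# assumption for illustration; the paper does not list a concrete schema.
SCHEMA = {
    "LEGS": [0, 2, 4, 6, 8],
    "PREDATOR": ["true", "false"],
    "FEATHERS": ["true", "false"],
    "VENOMOUS": ["true", "false"],
}
OPERATORS = ["=", "!=", "<", ">"]  # assumed operator set


def random_genotype(schema, probs, rng):
    """Build a list of genes, each a tuple (attribute, operator, value).

    Every attribute is included with its own selection probability, so the
    length of the genotype varies randomly.
    """
    genotype = []
    for attr, values in schema.items():
        if rng.random() < probs[attr]:
            genotype.append((attr, rng.choice(OPERATORS), rng.choice(values)))
    return genotype


def to_sql(table, genotype, rng):
    """Translate a genotype into SQL, joining genes with random AND/OR."""
    if not genotype:
        # Assumption: a genotype with no genes selects every record.
        return f"SELECT * FROM {table}"
    parts = [f"{attr} {op} {value}" for attr, op, value in genotype]
    query = f"SELECT * FROM {table} WHERE {parts[0]}"
    for part in parts[1:]:
        query += f" {rng.choice(['AND', 'OR'])} {part}"
    return query


rng = random.Random(0)
probs = {attr: 0.5 for attr in SCHEMA}  # every attribute starts at 0.5
print(to_sql("Animals", random_genotype(SCHEMA, probs, rng), rng))
```

Keeping the selection probabilities in a separate dictionary, rather than inside the genotypes, mirrors the idea that the population is rebuilt from those probabilities in every generation.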
For example, a genotype such as this:

(LEGS = 4) (PREDATOR = TRUE) (FEATHERS = FALSE) (VENOMOUS = FALSE)

would be translated into the following SQL query, through the random addition of AND and OR conditionals:

SELECT * FROM Animals WHERE LEGS = 4 AND PREDATOR = true AND FEATHERS = false OR VENOMOUS = false

Such SQL queries, once constructed, were sent to the database, and the results analysed. Each genotype was assigned a fitness value according to the extent to which its results corresponded with a target result set T. The fitness function used was

fitness = 100 - falsepositives - (2 * falsenegatives),

where 100 was an arbitrarily chosen constant. This fitness function was adapted from a paper by Ryu and Eick [7], dealing with deriving queries from object oriented databases. falsepositives is the number of records that were incorrectly identified as belonging to T, and falsenegatives is the number of records that should have been included in T, but were not.

The fitness function punishes false negatives more than it punishes false positives. If a query returns no false negatives, but several false positives, it can be seen to be correctly identifying the target result set, but generalising too much, whereas a query that returns false negatives is simply incorrect. By punishing false negatives more, it was hoped to apply evolutionary pressure that would favour queries that better classified the training data.

After assigning fitness values to the 100 queries, the best three and the worst three were selected. If a perfect classifier was found (with a fitness of 100) the evolution would terminate; otherwise the attributes would have their probabilities re-weighted. Every attribute that appeared in the three fittest genotypes had its selection probability incremented by 1%. Every attribute in the worst three genotypes had its probability decremented by 1%. The old genotypes were then discarded, and a new set of 100 genotypes was randomly created using the self-adapted probabilities. Over a period of generations, attributes that contributed to higher fitness values came to dominate in the genotype set, whereas attributes that contributed little to a genotype featured less and less.

2.2 Discussions

Our algorithm departs from the metaphor commonly used in evolutionary algorithms; however, it does offer a mechanism through which the genotypes iteratively converge on the sector of the search space that offers the greatest classification utility.
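The mechanism can be made concrete with a short Python sketch of the fitness function and the probability re-weighting from Section 2.1. It is an illustration under stated assumptions: the names `fitness` and `reweight` are hypothetical, "1%" is read as an absolute step of 0.01, each attribute is adjusted at most once per group of three genotypes, and probabilities are clamped to the range 0 to 1.

```python
def fitness(returned_ids, target_ids, constant=100):
    """fitness = constant - falsepositives - 2 * falsenegatives."""
    returned, target = set(returned_ids), set(target_ids)
    false_positives = len(returned - target)  # returned, but not in T
    false_negatives = len(target - returned)  # in T, but not returned
    return constant - false_positives - 2 * false_negatives


def reweight(probs, scored, step=0.01, k=3):
    """Return updated attribute selection probabilities.

    `scored` is a list of (fitness, attributes) pairs, where `attributes`
    is the set of attribute names used by one genotype.
    Assumptions: the step is absolute (0.01), an attribute is adjusted once
    per group even if it occurs in several genotypes of that group, and
    probabilities are clamped to [0, 1].
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    rewarded = set().union(*(attrs for _, attrs in ranked[:k]))
    punished = set().union(*(attrs for _, attrs in ranked[-k:]))
    new_probs = dict(probs)
    for attr in rewarded:
        new_probs[attr] = min(1.0, new_probs[attr] + step)
    for attr in punished:
        new_probs[attr] = max(0.0, new_probs[attr] - step)
    return new_probs


# One false positive costs 1; one false negative costs 2.
print(fitness({1, 2, 3, 4}, {1, 2, 3}))  # 99
print(fitness({1, 2}, {1, 2, 3}))        # 98
```

An attribute that occurs in both the best and the worst genotypes receives both adjustments under this reading, so its probability is left unchanged.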
Although the genetic information of parents is not inherited directly by offspring, the genetic information in the whole population is inherited by the next population. Such inheritance is probabilistically biased toward more useful genetic material. Hence, more useful genetic material will occur more frequently in a population. It is hoped that classification rules may be discovered as a consequence of this.

3 Experimental Studies

Several experiments have been carried out to evaluate the effectiveness and efficiency of the proposed framework. All datasets were downloaded from the UCI Machine Learning Repository¹. Each dataset was tested with 20 independent runs. If a perfect classifier was not found after 100 generations, the best classifier found to date was returned. The results were averaged over the 20 runs, and are presented below.

¹ mlearn/mlrepository.html
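The run protocol above (a cap of 100 generations, returning the best classifier found so far, and averaging over independent runs) can be sketched as follows. The names `run_task` and `summarise_runs` and the callable `evolve_one_generation` are hypothetical, and averaging the generation count over successful runs only is an assumption, since the text does not say how unsuccessful runs enter the average.

```python
from statistics import mean


def run_task(evolve_one_generation, max_generations=100):
    """Run one trial: stop at fitness 100, else return the best found.

    `evolve_one_generation` is a hypothetical callable that creates a new
    generation and returns (fitness, query) for its best genotype.
    Returns (generation, best), with generation None if the cap was hit.
    """
    best = None
    for generation in range(1, max_generations + 1):
        candidate = evolve_one_generation()
        if best is None or candidate[0] > best[0]:
            best = candidate
        if best[0] == 100:
            return generation, best
    return None, best


def summarise_runs(generations_to_perfect):
    """Summarise independent runs of one classification task.

    Each entry is the generation at which a perfect classifier appeared,
    or None if none was found. Returns (ANG, success_rate); ANG is None
    when no run succeeded (assumption: successful runs only are averaged).
    """
    found = [g for g in generations_to_perfect if g is not None]
    ang = mean(found) if found else None
    return ang, len(found) / len(generations_to_perfect)
```

For instance, `summarise_runs([10, 20, None, None])` gives an average of 15 generations with half of the runs succeeding.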

3.1 The Zoo Dataset

The Zoo dataset contains data items that describe animals. In total 14 attributes are provided, of which 13 are boolean and one has a predefined integer range. The animals are classified into 7 different types. Table 1 describes the results from the Zoo dataset. ANG refers to the average number of generations that it took for our algorithm to find a perfect classifier.

Table 1. Results for the Zoo dataset, showing performance of the evolved classifying queries for each animal type. The results were averaged over 20 runs.

Type  False Positives  False Negatives  ANG  Accuracy
% % n/a 83.3% % % % n/a 83.3%

It can be seen that our algorithm performed well on most of the classification tasks. The two instances in which it failed to find perfect classifiers are the most difficult tasks within the dataset, as both tasks involve a very small set of animals. In both cases, however, the best queries did not include false negatives.

3.2 Monk's Problems

The Monk's Problems dataset involves data items with six attributes, all of which are predefined integers between 1 and 4. The first Monk's problem is the identification of data patterns where (B = C) or (E = 1). The second problem is the identification of all data patterns that feature exactly two of (B = 1, C = 1, D = 1, E = 1, F = 1 or G = 1). The third Monk's problem is the identification of data patterns where (F = 3 and E = 1) or (F != 4 and C != 3), and features 5% noise added to the training set. The results averaged over 20 runs are summarised in Table 2.

Our algorithm performed perfectly on the first problem, and very well on the third, but performed poorly on the second problem. Part of the reason lies in SQL's inherent difficulty in expressing the desired conditions. The second Monk's problem requires a solution that compares relative attribute values, whereas SQL is usually used to select records according to a set of disjunctive attribute constraints.
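To illustrate why the second problem is awkward to express, the sketch below writes "exactly two attributes equal 1" as flat attribute constraints and checks the result against an in-memory SQLite table. This is an illustration rather than part of the original experiments: the table name `Monks`, the function `exactly_two_sql`, and the use of SQLite are assumptions, and the table simply enumerates every combination of six attributes over the values 1 to 4 as described in the text.

```python
import sqlite3
from itertools import combinations, product

ATTRS = ["B", "C", "D", "E", "F", "G"]


def exactly_two_sql(attrs):
    """Conjunctions whose disjunction means 'exactly two attributes equal 1'.

    There is one conjunction per pair of attributes: the pair must equal 1
    and every other attribute must differ from 1.
    """
    disjuncts = []
    for pair in combinations(attrs, 2):
        terms = [f"{a} = 1" if a in pair else f"{a} != 1" for a in attrs]
        disjuncts.append("(" + " AND ".join(terms) + ")")
    return disjuncts


conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE Monks ({', '.join(a + ' INTEGER' for a in ATTRS)})")
# Assumption: every attribute ranges over 1..4, as stated in the text.
conn.executemany(
    "INSERT INTO Monks VALUES (?, ?, ?, ?, ?, ?)",
    product(range(1, 5), repeat=len(ATTRS)),
)

disjuncts = exactly_two_sql(ATTRS)
where = " OR ".join(disjuncts)
(matched,) = conn.execute(f"SELECT COUNT(*) FROM Monks WHERE {where}").fetchone()
print(len(disjuncts), "conjunctions,", len(disjuncts) * len(ATTRS), "constraints")
print(matched, "rows match")
```

The target concept needs 15 parenthesised conjunctions of six constraints each. A query built from a short list of genes joined by AND and OR, with each attribute contributing at most one gene, has no comparably compact way to state it.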

Table 2. Results for Monk's Problems datasets, showing performance of the best queries for each problem. ANG refers to the average number of generations that it took for our algorithm to find a perfect classifier.

Type  False Positives  False Negatives  ANG  Accuracy
Problem  %
Problem  n/a  16.9%
Problem  n/a  94.7%

3.3 Credit Card Approval

The credit card approval dataset contains anonymised information on credit card application approvals and rejections. The dataset contains a variety of attribute types, with some attributes having predefined values and others having continuous values. The dataset also features 5% noise.

Our algorithm succeeded in correctly identifying, on average, 82.9% of the rejections. However, this relative success is countered by the fact that this classifier also included a large number of false positives on average, accounting for nearly 20% of the dataset size.

3.4 Discussion of the Results

The results for the Zoo and Monk's Problems datasets are encouraging. Our algorithm demonstrates the poorest performance on the second Monk's problem, which may be because the problem is not structurally conducive to an SQL-based classification rule, although future refinements of our algorithm will hopefully improve upon these results.

The results with the credit card approval dataset also show room for improvement. This may be due to its inclusion of continuous variables. Our algorithm performs poorly with continuous-valued attributes because, although it can identify attributes that are valuable in making a classification, it cannot make the same distinction for logical operators or values. The algorithm needs to find the values, as well as the attributes, that are necessary for good classification. It is proposed that logical operators will be given initial selection probabilities as well, which will be decremented or incremented according to the effect they have upon the fitness value of their genotype.
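One way the proposed extension might look is sketched below. It is speculative: the text only states that operators would receive selection probabilities that move up or down with fitness, so the names `choose_operator` and `reweight_operators`, the initial weight of 0.5, the step of 0.01, and the lower bound that keeps every operator selectable are all assumptions.

```python
import random


def choose_operator(op_weights, rng):
    """Pick one operator with probability proportional to its weight."""
    operators = list(op_weights)
    weights = [op_weights[op] for op in operators]
    return rng.choices(operators, weights=weights, k=1)[0]


def reweight_operators(op_weights, best_ops, worst_ops, step=0.01):
    """Raise the weight of operators used by the fittest genotypes and
    lower the weight of those used by the least fit ones.

    Assumption: weights stay within [step, 1.0] so that no operator
    becomes impossible to select.
    """
    new_weights = dict(op_weights)
    for op in best_ops:
        new_weights[op] = min(1.0, new_weights[op] + step)
    for op in worst_ops:
        new_weights[op] = max(step, new_weights[op] - step)
    return new_weights


rng = random.Random(0)
weights = {"=": 0.5, "!=": 0.5, "<": 0.5, ">": 0.5}  # assumed initial weights
weights = reweight_operators(weights, best_ops={"<"}, worst_ops={">"})
print(choose_operator(weights, rng))
```

Threshold values for continuous attributes would need a similar treatment, which this sketch does not attempt.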
4 Conclusions

By using evolutionary computation techniques to evolve SQL queries, it is possible to create a data mining framework that both produces easily readable results and can be applied to any SQL-compliant database system. The problem considered here is somewhat different from the conventional classification problem. The key question we are addressing here is: given a subset of data in a large database, how can we gain a better understanding of it? Our solution is to evolve human-comprehensible SQL queries that describe the data.

The algorithm proposed in this paper differs from many traditional evolutionary algorithms, in that it does not use the metaphor of selection, whereby the fittest individuals have their traits inherited by the new generation of individuals, through operations such as crossover or mutation. Rather, it rewards the attributes that make individuals successful, and then iterates the initial step of creation. In other words, rather than survival of the fittest, this work operates upon the principle of survival of the qualities that make the fittest fit.

Although many genetic algorithms feature mutation, it is usually scaled down so that it does not destroy any useful structures that evolution may have already constructed. This approach differs in that it divorces the importance of the attribute from the values that the attribute happens to have in a given gene. As such it effects an evolutionary liquidity that in turn results in an appealingly diverse population, more likely to distribute itself over an entire search space than it is to converge on a local optimum.

Although our preliminary experimental results are promising, they also offer room for improvement. It is hoped that future improvements with regard to dealing with continuous variables will improve performance.

References

1. X. Yao and Y. Liu, A new evolutionary system for evolving artificial neural networks, IEEE Transactions on Neural Networks, 8(3), May.
2. X. Yao and Y. Liu, Making use of population information in evolutionary artificial neural networks, IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics, 28(3), June.
3. Y. Liu, X. Yao and T. Higuchi, Evolutionary ensembles with negative correlation learning, IEEE Transactions on Evolutionary Computation, 4(4), November.
4. J. Bobbin and X. Yao, Evolving rules for nonlinear control, in New Frontier in Computational Intelligence and its Applications, M. Mohammadian (ed.), IOS Press, Amsterdam, 2000.
5. A. A. Freitas, A genetic programming framework for two data mining tasks: classification and knowledge discovery, Genetic Programming 1997: Proc. 2nd Annual Conference, Stanford University.
6. A. A. Freitas, A survey of evolutionary algorithms for data mining and knowledge discovery, in A. Ghosh, S. Tsutsui (eds.), Advances in Evolutionary Computation, Springer-Verlag.
7. T. W. Ryu and C. F. Eick, Deriving queries from results using genetic programming, Proc. 2nd International Conference, Knowledge Discovery and Data Mining, AAAI Press, 1996.


More information

Genetic Algorithm Performance with Different Selection Methods in Solving Multi-Objective Network Design Problem

Genetic Algorithm Performance with Different Selection Methods in Solving Multi-Objective Network Design Problem etic Algorithm Performance with Different Selection Methods in Solving Multi-Objective Network Design Problem R. O. Oladele Department of Computer Science University of Ilorin P.M.B. 1515, Ilorin, NIGERIA

More information

JEvolution: Evolutionary Algorithms in Java

JEvolution: Evolutionary Algorithms in Java Computational Intelligence, Simulation, and Mathematical Models Group CISMM-21-2002 May 19, 2015 JEvolution: Evolutionary Algorithms in Java Technical Report JEvolution V0.98 Helmut A. Mayer helmut@cosy.sbg.ac.at

More information

ABSTRACT I. INTRODUCTION. J Kanimozhi *, R Subramanian Department of Computer Science, Pondicherry University, Puducherry, Tamil Nadu, India

ABSTRACT I. INTRODUCTION. J Kanimozhi *, R Subramanian Department of Computer Science, Pondicherry University, Puducherry, Tamil Nadu, India ABSTRACT 2018 IJSRSET Volume 4 Issue 4 Print ISSN: 2395-1990 Online ISSN : 2394-4099 Themed Section : Engineering and Technology Travelling Salesman Problem Solved using Genetic Algorithm Combined Data

More information

Inductive Logic Programming in Clementine

Inductive Logic Programming in Clementine Inductive Logic Programming in Clementine Sam Brewer 1 and Tom Khabaza 2 Advanced Data Mining Group, SPSS (UK) Ltd 1st Floor, St. Andrew s House, West Street Woking, Surrey GU21 1EB, UK 1 sbrewer@spss.com,

More information

Genetic Algorithms for Classification and Feature Extraction

Genetic Algorithms for Classification and Feature Extraction Genetic Algorithms for Classification and Feature Extraction Min Pei, Erik D. Goodman, William F. Punch III and Ying Ding, (1995), Genetic Algorithms For Classification and Feature Extraction, Michigan

More information

A Generalized Feedforward Neural Network Architecture and Its Training Using Two Stochastic Search Methods

A Generalized Feedforward Neural Network Architecture and Its Training Using Two Stochastic Search Methods A Generalized Feedforward Neural Network Architecture and Its Training Using Two tochastic earch Methods Abdesselam Bouzerdoum 1 and Rainer Mueller 2 1 chool of Engineering and Mathematics Edith Cowan

More information

Introducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values

Introducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values Introducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values SHARIQ BASHIR, SAAD RAZZAQ, UMER MAQBOOL, SONYA TAHIR, A. RAUF BAIG Department of Computer Science (Machine

More information

Enhancing Structure Discovery for Data Mining in Graphical Databases Using Evolutionary Programming

Enhancing Structure Discovery for Data Mining in Graphical Databases Using Evolutionary Programming From: FLAIRS-02 Proceedings. Copyright 2002, AAAI (www.aaai.org). All rights reserved. Enhancing Structure Discovery for Data Mining in Graphical Databases Using Evolutionary Programming Sanghamitra Bandyopadhyay,

More information

Application of Genetic Algorithms to CFD. Cameron McCartney

Application of Genetic Algorithms to CFD. Cameron McCartney Application of Genetic Algorithms to CFD Cameron McCartney Introduction define and describe genetic algorithms (GAs) and genetic programming (GP) propose possible applications of GA/GP to CFD Application

More information

Optimization Technique for Maximization Problem in Evolutionary Programming of Genetic Algorithm in Data Mining

Optimization Technique for Maximization Problem in Evolutionary Programming of Genetic Algorithm in Data Mining Optimization Technique for Maximization Problem in Evolutionary Programming of Genetic Algorithm in Data Mining R. Karthick Assistant Professor, Dept. of MCA Karpagam Institute of Technology karthick2885@yahoo.com

More information

The Role of Biomedical Dataset in Classification

The Role of Biomedical Dataset in Classification The Role of Biomedical Dataset in Classification Ajay Kumar Tanwani and Muddassar Farooq Next Generation Intelligent Networks Research Center (nexgin RC) National University of Computer & Emerging Sciences

More information

GENETIC ALGORITHM VERSUS PARTICLE SWARM OPTIMIZATION IN N-QUEEN PROBLEM

GENETIC ALGORITHM VERSUS PARTICLE SWARM OPTIMIZATION IN N-QUEEN PROBLEM Journal of Al-Nahrain University Vol.10(2), December, 2007, pp.172-177 Science GENETIC ALGORITHM VERSUS PARTICLE SWARM OPTIMIZATION IN N-QUEEN PROBLEM * Azhar W. Hammad, ** Dr. Ban N. Thannoon Al-Nahrain

More information

Association Rule Mining and Clustering

Association Rule Mining and Clustering Association Rule Mining and Clustering Lecture Outline: Classification vs. Association Rule Mining vs. Clustering Association Rule Mining Clustering Types of Clusters Clustering Algorithms Hierarchical:

More information

Evolving Variable-Ordering Heuristics for Constrained Optimisation

Evolving Variable-Ordering Heuristics for Constrained Optimisation Griffith Research Online https://research-repository.griffith.edu.au Evolving Variable-Ordering Heuristics for Constrained Optimisation Author Bain, Stuart, Thornton, John, Sattar, Abdul Published 2005

More information

Using Genetic Algorithms to Improve Pattern Classification Performance

Using Genetic Algorithms to Improve Pattern Classification Performance Using Genetic Algorithms to Improve Pattern Classification Performance Eric I. Chang and Richard P. Lippmann Lincoln Laboratory, MIT Lexington, MA 021739108 Abstract Genetic algorithms were used to select

More information

Random Search Report An objective look at random search performance for 4 problem sets

Random Search Report An objective look at random search performance for 4 problem sets Random Search Report An objective look at random search performance for 4 problem sets Dudon Wai Georgia Institute of Technology CS 7641: Machine Learning Atlanta, GA dwai3@gatech.edu Abstract: This report

More information

Swarm Based Fuzzy Clustering with Partition Validity

Swarm Based Fuzzy Clustering with Partition Validity Swarm Based Fuzzy Clustering with Partition Validity Lawrence O. Hall and Parag M. Kanade Computer Science & Engineering Dept University of South Florida, Tampa FL 33620 @csee.usf.edu Abstract

More information

A New Crossover Technique for Cartesian Genetic Programming

A New Crossover Technique for Cartesian Genetic Programming A New Crossover Technique for Cartesian Genetic Programming Genetic Programming Track Janet Clegg Intelligent Systems Group, Department of Electronics University of York, Heslington York,YODD,UK jc@ohm.york.ac.uk

More information

Machine Evolution. Machine Evolution. Let s look at. Machine Evolution. Machine Evolution. Machine Evolution. Machine Evolution

Machine Evolution. Machine Evolution. Let s look at. Machine Evolution. Machine Evolution. Machine Evolution. Machine Evolution Let s look at As you will see later in this course, neural networks can learn, that is, adapt to given constraints. For example, NNs can approximate a given function. In biology, such learning corresponds

More information

Genetic Algorithms and the Evolution of Neural Networks for Language Processing

Genetic Algorithms and the Evolution of Neural Networks for Language Processing Genetic Algorithms and the Evolution of Neural Networks for Language Processing Jaime J. Dávila Hampshire College, School of Cognitive Science Amherst, MA 01002 jdavila@hampshire.edu Abstract One approach

More information

Fuzzy Ant Clustering by Centroid Positioning

Fuzzy Ant Clustering by Centroid Positioning Fuzzy Ant Clustering by Centroid Positioning Parag M. Kanade and Lawrence O. Hall Computer Science & Engineering Dept University of South Florida, Tampa FL 33620 @csee.usf.edu Abstract We

More information

Internal vs. External Parameters in Fitness Functions

Internal vs. External Parameters in Fitness Functions Internal vs. External Parameters in Fitness Functions Pedro A. Diaz-Gomez Computing & Technology Department Cameron University Lawton, Oklahoma 73505, USA pdiaz-go@cameron.edu Dean F. Hougen School of

More information

Combining Two Local Searches with Crossover: An Efficient Hybrid Algorithm for the Traveling Salesman Problem

Combining Two Local Searches with Crossover: An Efficient Hybrid Algorithm for the Traveling Salesman Problem Combining Two Local Searches with Crossover: An Efficient Hybrid Algorithm for the Traveling Salesman Problem Weichen Liu, Thomas Weise, Yuezhong Wu and Qi Qi University of Science and Technology of Chine

More information

Concept Tree Based Clustering Visualization with Shaded Similarity Matrices

Concept Tree Based Clustering Visualization with Shaded Similarity Matrices Syracuse University SURFACE School of Information Studies: Faculty Scholarship School of Information Studies (ischool) 12-2002 Concept Tree Based Clustering Visualization with Shaded Similarity Matrices

More information

Evolution of Fuzzy Rule Based Classifiers

Evolution of Fuzzy Rule Based Classifiers Evolution of Fuzzy Rule Based Classifiers Jonatan Gomez Universidad Nacional de Colombia and The University of Memphis jgomezpe@unal.edu.co, jgomez@memphis.edu Abstract. The paper presents an evolutionary

More information

Sparse Matrices Reordering using Evolutionary Algorithms: A Seeded Approach

Sparse Matrices Reordering using Evolutionary Algorithms: A Seeded Approach 1 Sparse Matrices Reordering using Evolutionary Algorithms: A Seeded Approach David Greiner, Gustavo Montero, Gabriel Winter Institute of Intelligent Systems and Numerical Applications in Engineering (IUSIANI)

More information

Application of Genetic Algorithm Based Intuitionistic Fuzzy k-mode for Clustering Categorical Data

Application of Genetic Algorithm Based Intuitionistic Fuzzy k-mode for Clustering Categorical Data BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 17, No 4 Sofia 2017 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.1515/cait-2017-0044 Application of Genetic Algorithm

More information

Hybrid Feature Selection for Modeling Intrusion Detection Systems

Hybrid Feature Selection for Modeling Intrusion Detection Systems Hybrid Feature Selection for Modeling Intrusion Detection Systems Srilatha Chebrolu, Ajith Abraham and Johnson P Thomas Department of Computer Science, Oklahoma State University, USA ajith.abraham@ieee.org,

More information

A genetic algorithm based focused Web crawler for automatic webpage classification

A genetic algorithm based focused Web crawler for automatic webpage classification A genetic algorithm based focused Web crawler for automatic webpage classification Nancy Goyal, Rajesh Bhatia, Manish Kumar Computer Science and Engineering, PEC University of Technology, Chandigarh, India

More information

Attribute Reduction using Forward Selection and Relative Reduct Algorithm

Attribute Reduction using Forward Selection and Relative Reduct Algorithm Attribute Reduction using Forward Selection and Relative Reduct Algorithm P.Kalyani Associate Professor in Computer Science, SNR Sons College, Coimbatore, India. ABSTRACT Attribute reduction of an information

More information

Data Mining Concepts

Data Mining Concepts Data Mining Concepts Outline Data Mining Data Warehousing Knowledge Discovery in Databases (KDD) Goals of Data Mining and Knowledge Discovery Association Rules Additional Data Mining Algorithms Sequential

More information

Effects of Three-Objective Genetic Rule Selection on the Generalization Ability of Fuzzy Rule-Based Systems

Effects of Three-Objective Genetic Rule Selection on the Generalization Ability of Fuzzy Rule-Based Systems Effects of Three-Objective Genetic Rule Selection on the Generalization Ability of Fuzzy Rule-Based Systems Hisao Ishibuchi and Takashi Yamamoto Department of Industrial Engineering, Osaka Prefecture University,

More information

An evolutionary annealing-simplex algorithm for global optimisation of water resource systems

An evolutionary annealing-simplex algorithm for global optimisation of water resource systems FIFTH INTERNATIONAL CONFERENCE ON HYDROINFORMATICS 1-5 July 2002, Cardiff, UK C05 - Evolutionary algorithms in hydroinformatics An evolutionary annealing-simplex algorithm for global optimisation of water

More information

Simulation of Back Propagation Neural Network for Iris Flower Classification

Simulation of Back Propagation Neural Network for Iris Flower Classification American Journal of Engineering Research (AJER) e-issn: 2320-0847 p-issn : 2320-0936 Volume-6, Issue-1, pp-200-205 www.ajer.org Research Paper Open Access Simulation of Back Propagation Neural Network

More information

Genetic Algorithm For Fingerprint Matching

Genetic Algorithm For Fingerprint Matching Genetic Algorithm For Fingerprint Matching B. POORNA Department Of Computer Applications, Dr.M.G.R.Educational And Research Institute, Maduravoyal, Chennai 600095,TamilNadu INDIA. Abstract:- An efficient

More information

The Establishment of Large Data Mining Platform Based on Cloud Computing. Wei CAI

The Establishment of Large Data Mining Platform Based on Cloud Computing. Wei CAI 2017 International Conference on Electronic, Control, Automation and Mechanical Engineering (ECAME 2017) ISBN: 978-1-60595-523-0 The Establishment of Large Data Mining Platform Based on Cloud Computing

More information

A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems

A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems Anestis Gkanogiannis and Theodore Kalamboukis Department of Informatics Athens University

More information

A GENETIC ALGORITHM FOR MOTION DETECTION

A GENETIC ALGORITHM FOR MOTION DETECTION A GENETIC ALGORITHM FOR MOTION DETECTION Jarosław Mamica, Tomasz Walkowiak Institute of Engineering Cybernetics, Wrocław University of Technology ul. Janiszewskiego 11/17, 50-37 Wrocław, POLAND, Phone:

More information

An Evolutionary Algorithm for the Multi-objective Shortest Path Problem

An Evolutionary Algorithm for the Multi-objective Shortest Path Problem An Evolutionary Algorithm for the Multi-objective Shortest Path Problem Fangguo He Huan Qi Qiong Fan Institute of Systems Engineering, Huazhong University of Science & Technology, Wuhan 430074, P. R. China

More information

Optimizing Flow Shop Sequencing Through Simulation Optimization Using Evolutionary Methods

Optimizing Flow Shop Sequencing Through Simulation Optimization Using Evolutionary Methods Optimizing Flow Shop Sequencing Through Simulation Optimization Using Evolutionary Methods Sucharith Vanguri 1, Travis W. Hill 2, Allen G. Greenwood 1 1 Department of Industrial Engineering 260 McCain

More information

Using Evolutionary Algorithms as Instance Selection for Data Reduction in KDD: An Experimental Study

Using Evolutionary Algorithms as Instance Selection for Data Reduction in KDD: An Experimental Study IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 7, NO. 6, DECEMBER 2003 561 Using Evolutionary Algorithms as Instance Selection for Data Reduction in KDD: An Experimental Study José Ramón Cano, Francisco

More information

A Descriptive Encoding Language for Evolving Modular Neural Networks

A Descriptive Encoding Language for Evolving Modular Neural Networks A Descriptive Encoding Language for Evolving Modular Neural Networks Jae-Yoon Jung and James A. Reggia Department of Computer Science, University of Maryland, College Park, MD 20742, USA {jung, reggia}@cs.umd.edu

More information

Genetic Algorithm for Finding Shortest Path in a Network

Genetic Algorithm for Finding Shortest Path in a Network Intern. J. Fuzzy Mathematical Archive Vol. 2, 2013, 43-48 ISSN: 2320 3242 (P), 2320 3250 (online) Published on 26 August 2013 www.researchmathsci.org International Journal of Genetic Algorithm for Finding

More information

Evolving Genotype to Phenotype Mappings with a Multiple-Chromosome Genetic Algorithm

Evolving Genotype to Phenotype Mappings with a Multiple-Chromosome Genetic Algorithm Evolving Genotype to Phenotype Mappings with a Multiple-Chromosome Genetic Algorithm Rick Chow Division of Mathematics and Computer Science University of South Carolina Spartanburg 800 University Way,

More information

USING IMAGES PATTERN RECOGNITION AND NEURAL NETWORKS FOR COATING QUALITY ASSESSMENT Image processing for quality assessment

USING IMAGES PATTERN RECOGNITION AND NEURAL NETWORKS FOR COATING QUALITY ASSESSMENT Image processing for quality assessment USING IMAGES PATTERN RECOGNITION AND NEURAL NETWORKS FOR COATING QUALITY ASSESSMENT Image processing for quality assessment L.-M. CHANG and Y.A. ABDELRAZIG School of Civil Engineering, Purdue University,

More information

Available online at ScienceDirect. Procedia Computer Science 35 (2014 )

Available online at  ScienceDirect. Procedia Computer Science 35 (2014 ) Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 35 (2014 ) 388 396 18 th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems

More information

ANTICIPATORY VERSUS TRADITIONAL GENETIC ALGORITHM

ANTICIPATORY VERSUS TRADITIONAL GENETIC ALGORITHM Anticipatory Versus Traditional Genetic Algorithm ANTICIPATORY VERSUS TRADITIONAL GENETIC ALGORITHM ABSTRACT Irina Mocanu 1 Eugenia Kalisz 2 This paper evaluates the performances of a new type of genetic

More information

Genetic Algorithms for Vision and Pattern Recognition

Genetic Algorithms for Vision and Pattern Recognition Genetic Algorithms for Vision and Pattern Recognition Faiz Ul Wahab 11/8/2014 1 Objective To solve for optimization of computer vision problems using genetic algorithms 11/8/2014 2 Timeline Problem: Computer

More information

Detecting Spam with Artificial Neural Networks

Detecting Spam with Artificial Neural Networks Detecting Spam with Artificial Neural Networks Andrew Edstrom University of Wisconsin - Madison Abstract This is my final project for CS 539. In this project, I demonstrate the suitability of neural networks

More information