Data Mining using Ant Colony Optimization
Thanks to: Johannes Singler, Bryan Atkinson
Presentation Outline

- Introduction to Data Mining
- Rule Induction for Classification
- AntMiner
  - Overview: Input/Output
  - Rule Construction
  - Quality Measurement
  - Pheromone: Initial/Updating
  - Experiments/Results
  - Performance/Complexity
- Swarm-based Genetic Programming
  - Introduction to GP, Symbolic Regression
  - Crossover Problems
  - Ant Colony Crossover
  - Experiments and Results

Introduction

Data Mining tries to find hidden knowledge, unexpected patterns, and new rules in large databases: the discovery of useful summaries of data. It is a key element of a much more elaborate process, Knowledge Discovery in Databases (KDD).
Goals of the Rule Induction Stage of Data Mining

Rule induction finds rules that describe the data in some way. The rules should be not only accurate but also comprehensible for a human user, to support decision making.

Focus in this Talk

- Rule induction for classification using ACO
  - Given: a training set (instances/cases to classify)
  - Goal: to come up with (preferably simple) rules to classify the data
  - Algorithm by Parpinelli, Lopes and Freitas: AntMiner
- ACO + Genetic Programming: symbolic regression

Rule Induction: Possible Outputs

- decision trees
- (ordered) decision lists [used here]:

      if <attribute1>=<value1> and <attribute2>=<value2> and ... then <class>=<class1>
      else if ...
AntMiner Input

- Training set / test set
- Attribute/value pairs
- Given classes / classification

AntMiner Output

- An ordered decision list: an ordered list of IF-THEN rules of the form

      IF <condition> THEN <class>

  where <condition> = <term1> AND <term2> AND ... and <term> = <attribute> = <value>
- plus a default rule (majority value)
- The first rule that covers a case fires.
- Only discrete attributes are supported so far; continuous values must be discretized beforehand. This is a quite limited version of a decision list.

Prerequisites for an ACO (Review)

- A problem-dependent heuristic function (η) for measuring the quality of items that could be added to the partial solution built so far
- A pheromone updating rule (τ)
- A probabilistic transition rule based on η and τ

Difference to most ACO algorithms mentioned in class: AntMiner does not use a graph representation of the problem.
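The ordered decision-list output can be applied with a few lines of code. The following is a sketch under an assumed representation (a rule as a pair of a condition dict and a predicted class), not AntMiner's actual implementation:

```python
# Sketch: applying an ordered IF-THEN decision list. A rule is
# (conditions, predicted_class); conditions maps attribute -> required value.
def classify(case, rules, default_class):
    for conditions, predicted in rules:
        # the first rule whose terms all hold fires
        if all(case.get(attr) == val for attr, val in conditions.items()):
            return predicted
    return default_class  # default rule: majority class

rules = [
    ({"outlook": "overcast"}, "yes"),
    ({"outlook": "rainy", "windy": "false"}, "yes"),
]
print(classify({"outlook": "overcast", "windy": "true"}, rules, "no"))  # yes
print(classify({"outlook": "sunny"}, rules, "no"))                      # no
```

Because the list is ordered, rule order matters: a case is handed to later rules only if every earlier rule fails to cover it.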
AntMiner Algorithm: Top-Level

Pseudo-code for finding one rule set:

    trainingset = {all training cases}
    discoveredrulelist = []
    WHILE (trainingset still too big)
        Initialize pheromone (equally distributed)
        Ants try to find a good classification rule by the ACO heuristic
        Add best rule found to discoveredrulelist
        Remove correctly covered examples from trainingset

AntMiner Algorithm: Mid-Level

Pseudo-code for finding one rule:

    REPEAT
        Start new ant with empty rule (antecedent)
        Construct rule by adding one term at a time,
            choosing the rule consequent afterwards
        Prune rule
        Increase pheromone on the trail the ant used,
            according to the quality of the rule
    UNTIL (maximum number z of ants exceeded)
          OR (no improvement during the last k iterations)

Actually, only a population of one ant at a time is working.

AntMiner Algorithm: Bottom-Level

Repeat as long as possible:
- Add one condition to the rule, using a probabilistic approach based on the pheromone concentration and the heuristic.
- Do not use an attribute twice.
- The resulting rule must cover at least a minimum number of cases.

After the antecedent is finished, calculate the resulting class.
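The top-level pseudo-code is a sequential-covering loop. A minimal Python sketch, where `find_rule` stands in for the whole ACO rule-construction phase (the names and interfaces are assumptions, not the authors' code):

```python
# Sketch of AntMiner's outer loop: discover one rule at a time and remove
# the training cases that the new rule classifies correctly.
def ant_miner(training_set, find_rule, covers, max_uncovered=10):
    discovered_rules = []
    remaining = list(training_set)
    while len(remaining) > max_uncovered:   # "trainingset still too big"
        rule = find_rule(remaining)         # ACO search for the best rule
        discovered_rules.append(rule)
        remaining = [c for c in remaining if not covers(rule, c)]
    return discovered_rules

# Toy stand-ins, just to exercise the loop (not a real ACO search):
cases = [{"x": "a", "cls": "p"}] * 8 + [{"x": "b", "cls": "q"}] * 4
find_rule = lambda rem: ({"x": rem[0]["x"]}, rem[0]["cls"])
def covers(rule, case):
    cond, cls = rule
    return all(case[a] == v for a, v in cond.items()) and case["cls"] == cls

print(len(ant_miner(cases, find_rule, covers, max_uncovered=3)))  # 2
```

The loop terminates once the uncovered remainder is small enough; those cases fall to the default (majority) rule.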
Rule Construction

Probability for adding the term <A_i> = <V_ij>:

    P_ij = η_ij · τ_ij(t)   [normalized over all usable terms]

where A_i is the i-th attribute, V_ij the j-th possible value of the i-th attribute, η the heuristic function, and τ the pheromone trail.

Heuristic Function (η)

Analogous to the proximity function in the TSP, or the colouring matrix in the graph colouring problem. It uses information theory (entropy): split the instances using the rule; the quality corresponds to the entropy of the remaining buckets, and the less entropy, the better.

    H(W | A_i = V_ij) = - Σ_{w=1}^{k} P(w | A_i = V_ij) · log2 P(w | A_i = V_ij)

    η_ij = log2(k) - H(W | A_i = V_ij)   [normalized]

where k is the number of classes.

Information Heuristic Example

For temperature T: high = T > 80, mild = 70 < T ≤ 80, cold = 0 < T ≤ 70 (used later).

    P(play, outlook=sunny) = 2/14 = 0.143
    P(don't play, outlook=sunny) = 3/14 = 0.214
    H(W, outlook=sunny) = -0.143·log2(0.143) - 0.214·log2(0.214) = 0.877
    η = log2(k) - H(W, outlook=sunny) = 1 - 0.877 = 0.123
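The entropy heuristic and the transition probability can be checked numerically. A sketch that reproduces the sunny example's arithmetic (probabilities taken over all 14 cases, as on the slide):

```python
import math

def entropy(probs):
    # H = -sum p * log2(p), skipping zero probabilities
    return -sum(p * math.log2(p) for p in probs if p > 0)

def heuristic(probs, k):
    # eta_ij = log2(k) - H(W | A_i = V_ij)
    return math.log2(k) - entropy(probs)

def transition_probs(etas, taus):
    # P_ij proportional to eta_ij * tau_ij, normalized over usable terms
    weights = [e * t for e, t in zip(etas, taus)]
    total = sum(weights)
    return [w / total for w in weights]

H = entropy([2 / 14, 3 / 14])                       # outlook = sunny
print(round(H, 3))                                  # 0.877
print(round(heuristic([2 / 14, 3 / 14], k=2), 3))   # 0.123
```

With equal pheromone on all terms (as at initialization), the transition probabilities are driven entirely by the heuristic values.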
Information Heuristic Example (continued)

For humidity H: high = H > 85, normal = 0 < H ≤ 85 (used later).

    P(play, outlook=overcast) = 4/14 = 0.286
    P(don't play, outlook=overcast) = 0/14 = 0
    H(W, outlook=overcast) = -0.286·log2(0.286) = 0.516
    η = log2(k) - H(W, outlook=overcast) = 1 - 0.516 = 0.484

Quality Function

Measures the classification quality of a rule (or of several rules). For one rule:

    Q = sensitivity · specificity = TP/(TP + FN) · TN/(FP + TN)

where T = true, F = false, P = positive, N = negative. The bigger the value of Q, the better.

Measuring the simplicity of a rule set: the number of rules and the average number of terms per rule. The fewer, the simpler, and thus the better.

Rule Pruning

Iteratively remove one term at a time from the rule, as long as this improves the classification accuracy of the rule. The majority class might change. If ambiguous, remove the term that improves the accuracy the most. Simplicity improves in any case.
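The quality measure transcribes directly into code; the confusion-matrix counts below are hypothetical, chosen only to illustrate the formula:

```python
# Q = sensitivity * specificity = TP/(TP+FN) * TN/(FP+TN)
def rule_quality(tp, fn, fp, tn):
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    return sensitivity * specificity

# e.g. a rule covering 4 of 5 positives and wrongly firing on 2 of 9 negatives:
print(round(rule_quality(tp=4, fn=1, fp=2, tn=7), 3))  # 0.622
```

A perfect rule (no false positives or negatives) reaches Q = 1; multiplying the two factors penalizes rules that trade one kind of error for the other.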
Pheromone

Initial pheromone value:

    τ_ij(t = 0) = 1 / Σ_{i=1}^{a} b_i

where a is the total number of attributes and b_i is the number of possible values of A_i.

Pheromone Updating (τ)

Starting from the previous values (1), first increase the pheromone of the used terms in proportion to the rule quality (2):

    τ_ij(t + 1) = τ_ij(t) · (1 + Q)

Then normalize the pheromone level of all terms; this acts as pheromone evaporation (3).

Using the Discovered Rules

Apply the rules in the order they were discovered; the first rule that covers a case is applied. If no rule covers the case, apply the default rule (majority value).
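The three pheromone steps (uniform initialization, reward, normalization-as-evaporation) can be sketched as follows; the dict keyed by (attribute, value) index pairs is an assumed representation:

```python
def init_pheromone(values_per_attribute):
    # tau_ij(0) = 1 / sum_i b_i, the same for every term
    total = sum(values_per_attribute)
    return {(i, j): 1.0 / total
            for i, b in enumerate(values_per_attribute)
            for j in range(b)}

def update_pheromone(tau, used_terms, quality):
    for term in used_terms:
        tau[term] *= 1.0 + quality          # reward terms used in the rule
    norm = sum(tau.values())
    for term in tau:
        tau[term] /= norm                   # normalizing evaporates the rest
    return tau

tau = init_pheromone([3, 3, 2, 2])          # e.g. outlook, temp, humidity, windy
tau = update_pheromone(tau, used_terms=[(0, 1)], quality=0.5)
print(round(tau[(0, 1)], 3), round(tau[(1, 0)], 3))  # 0.143 0.095
```

Note that unused terms lose pheromone without an explicit evaporation rate: renormalizing after the reward shrinks every term the ant did not use.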
Possible Discretization of Continuous Attributes

Use C4.5-Disc. Quick overview: extract a reduced data set that contains only the attribute to discretize and the desired classification. From that, build a decision tree using the C4.5 algorithm (another rule induction algorithm). The result is a decision tree with binary decisions (x ≤ a: go left; x > a: go right); each path corresponds to the definition of a categorical interval.

AntMiner's Parameters

- Number of ants (3000 used in the experiments). Also limits the maximum number of rules found for a classification; this limit is not necessarily reached, because the algorithm might converge earlier.
- Minimum number of cases per rule (10). Each rule must cover at least this many cases; this avoids overfitting.
- Maximum number of uncovered cases in the training set (10). The algorithm stops when only fewer instances are left.
- Number of rules to test for the convergence of the ants (10). The algorithm waits this long for an improvement.

Sample Run: Start

Deciding whether to play outside. Attributes: outlook, temperature, humidity, windy. Class: play (yes) / do not play (no).

    sunny,hot,high,false,no (1)
    sunny,hot,high,true,no (2)
    overcast,hot,normal,false,yes (3)
    rainy,mild,high,false,yes (4)
    rainy,cool,normal,false,yes (5)
    rainy,cool,normal,true,no (6)
    overcast,cool,normal,true,yes (7)
    sunny,mild,high,false,no (8)
    sunny,cool,normal,false,yes (9)
    rainy,mild,normal,false,yes (10)
    sunny,mild,normal,true,yes (11)
    overcast,mild,high,true,yes (12)
    overcast,hot,normal,false,yes (13)
    rainy,mild,high,true,no (14)

Sample run for finding one rule set. Start: I = {all}, R = {}.
- Ant 1: chooses probabilistically outlook=overcast (then play=yes).
- Ant 1: chooses values for the other attributes.
- Ant 1: finishes because all attributes are used.
- Ant 1: the last three conditions are pruned away.
I = {1, 2, 4, 5, 6, 8, 9, 10, 11, 14}, R = {outlook=overcast → play=yes}
- Ant 2: chooses outlook=rainy (then play=yes). The rule is not good enough (3:2).
- Ant 2: chooses windy=true (then play=no).
- Ant 2: finishes, because otherwise the covered set would be too small. No pruning is possible either.
Sample Run: Result

A possible result (not the simplest):

    outlook=overcast → play=yes
    outlook=rainy AND windy=false → play=yes
    outlook=sunny AND humidity=normal → play=yes
    otherwise → play=no

Comparison to the CN2 Algorithm

- Uses beam search (limited breadth-first search with beam width b): add all possible terms to the current partial rules, evaluate, and retain only the b best ones.
- No feedback for constructing new rules.
- The output format is the same (an ordered rule list).
- Uses an entropy heuristic as well.

Experiment Setup

Dimensions roughly: cases, 9–34 attributes, 2–6 classes. Tests were run using a 10-fold cross-validation procedure: divide the data into 10 partitions; for each partition, treat it as the test data, use the other 90% as the training data, and measure the performance; then take the average value. This helps to achieve significant results.
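The 10-fold procedure reads as a short loop; `train_and_score` below stands in for training AntMiner (or CN2) and measuring its accuracy on the held-out fold (an assumed interface for illustration):

```python
# Sketch of k-fold cross-validation: each partition serves once as the
# test set while the remaining 90% trains; the scores are averaged.
def cross_validate(cases, train_and_score, folds=10):
    scores = []
    for f in range(folds):
        test = cases[f::folds]                                      # this fold
        train = [c for i, c in enumerate(cases) if i % folds != f]  # other 90%
        scores.append(train_and_score(train, test))
    return sum(scores) / folds

# Toy scorer, just to exercise the split: score = fraction held out per fold.
cases = list(range(20))
avg = cross_validate(cases, lambda train, test: len(test) / len(cases))
print(round(avg, 6))  # 0.1
```

In practice the partitions are usually shuffled (and often stratified by class) before slicing; the striding here keeps the sketch deterministic.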
Data Sets / Performance Results

No particular parameter optimizations were done for either algorithm; both were given the same computation time.

Extensions to the Algorithm

By Galea [3]: a deterministic rule with probability q, as in ACS-TSP. With probability q, choose probabilistically (considering the pheromone trail and the heuristic function); otherwise, deterministically choose the term with the maximum probability. This improves results slightly. An extension for fuzzy rules is also possible.
Comparative Results

[Tables: side-by-side comparison of the algorithms; effects of rule pruning. Values not transcribed.]
Generated Rules / Terms per Rule

[Tables: number of generated rules and average terms per rule. Values not transcribed.]

Algorithm Complexity

Introduced variables:
- n: number of cases
- a: number of attributes
- v: number of values per attribute; considered small, i.e. O(1)
- k: number of conditions per inspected rule while evaluating and pruning
- z: number of ants
- r: number of discovered rules
Complexity Comparison

- Ant-Miner, average case: O(r·z·[k·a + n·k³] + a·n)
- Ant-Miner, worst case (k = O(a)): O(r·z·a³·n)
- CN2: O(a·(n + log(a)))

Further Experiments

Further experiments by the authors of AntMiner show that ACO really helps:
- The use of pheromone trails improves the average solution.
- The use of rule pruning improves simplicity without harming quality.

References

[1] Parpinelli, Lopes, Freitas: Data Mining with an Ant Colony Optimization Algorithm.
[2] Parpinelli, Lopes, Freitas: An Ant Colony Based System for Data Mining: Applications to Medical Data. 2001.
[3] Michelle Galea: Applying Swarm Intelligence to Rule Induction.
[4] Clark, Niblett: The CN2 Induction Algorithm.
[5] Adriaans, Zantinge: Data Mining. Addison-Wesley.
[6] Casillas, Cordón, Herrera: Learning Fuzzy Rules Using Ant Colony Optimization Algorithms.
[7] Bryan Atkinson: Honours Project Report. n-atkinson-winter-2006.pdf
Ant-based Programming

Genetic Programming has been successful at inducing program descriptions, but has problems with scaling:
- maintaining diversity
- retaining useful fragments: avoiding disruption of higher-order functions

Can ACO help? Maybe: learn useful associations, avoid disruption.

Genetic Programming

- Programs are represented in a tree structure.
- Learning through population-based, evolutionary search with genetic operators: crossover and mutation.
- Requires specification of:
  - Functions (F): internal nodes
  - Terminals (T): leaf nodes
- Symbolic regression: F = {+, -, /, *, sin, cos, exp}, T = {integers in range (-5, 5), x}

Symbolic Regression

Find the function that best fits a number of sample points. Goodness of fit is determined by hits: sample points for which the candidate function is within a threshold distance ε.

    f(k) = h(k) - (1 / max(h(k), 1)) · Σ_{i=1}^{size(D)} e(k, i)

    e(k, i) = |v(k, x(i)) - y(i)|

    h(k) = Σ_{i=1}^{size(D)} hits(k, i)

    hits(k, i) = 1 if e(k, i) ≤ ε, 0 otherwise

where v(k, x) is the value of the k-th program for input x.
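The fitness definition above transcribes directly; the tolerance ε and the sample set below are illustrative choices, not the paper's experimental settings:

```python
import math

def fitness(program, samples, eps=0.01):
    # e(k,i) = |v(k, x_i) - y_i|; hits counts points within eps;
    # f(k) = h(k) - (1 / max(h(k), 1)) * sum of the errors
    errors = [abs(program(x) - y) for x, y in samples]
    hits = sum(1 for e in errors if e <= eps)
    return hits - sum(errors) / max(hits, 1), hits

target = lambda x: 3 * x + math.sin(x)          # the example function
samples = [(x / 10.0, target(x / 10.0)) for x in range(10)]

print(fitness(target, samples)[1])              # 10: a perfect program hits all
print(fitness(lambda x: 0.0, samples)[1])       # 1: only the point at x = 0
```

The `max(h(k), 1)` term keeps the error penalty bounded for programs that hit nothing, so fitness degrades gracefully rather than dividing by zero.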
Symbolic Regression Example

[Figure: GP tree for the function 3x + sin(x).]

Crossover

[Figure: crossover swaps randomly chosen subtrees between two parent trees.] Problem: crossover can easily disrupt useful couplings.

Adapting Crossover with ACO

- Use context-aware crossover: basic crossover chooses a node randomly, i.e. context-unaware.
- Adapt crossover to remember useful function couplings.
- This is not the same as automatically defined functions (ADFs).
Function Coupling Matrix (C)

[Table: pheromone matrix over the functions and terminals +, *, sin, cos, x. Values not transcribed.] Important couplings have high values, e.g. sin–x.

Swarm-based GP (SB-GP)

Three modifications to GP:
1. Initialization of the coupling matrix C.
2. Crossover using the coupling matrix.
3. Pheromone update based upon program fitness.

Pheromone Initialization

For every function and terminal coupling (i, j), initialize the pheromone τ_i,j to an initial value τ0; τ0 is a system parameter.
Ant Colony (AC) Crossover

- Choose a random branch B from the root to a leaf in program tree P_n.
- For every edge (i, j) in B, where i is the parent and j a child node, the probability of choosing node i as the root of subtree S_n is:

      p(i, n) = (τ_max(n) - τ_min(n) + τ_i,j(n)) / T(n)

- Likewise, choose a random branch B from the root to a leaf in program tree P_m; the probability of choosing node i as the root of subtree S_m is:

      p(i, m) = (τ_max(m) - τ_min(m) + τ_i,j(m)) / T(m)

AC Crossover (continued)

T(k) is given by:

    T(k) = Σ_{(i,j) ∈ E(k)} (τ_max(k) - τ_min(k) + τ_i,j(k))

with

    τ_i,j(k) = C(V(k, i), V(k, j))
    τ_max(k) = max_{(i,j) ∈ E(k)} τ_i,j(k)
    τ_min(k) = min_{(i,j) ∈ E(k)} τ_i,j(k)

and E(k) = {edges in the k-th program subtree}.

AC Crossover Example

[Figure not transcribed.]
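The edge-selection probability can be checked on a small branch. The pheromone values below are illustrative, and the formula is taken exactly as written on the slide:

```python
def edge_probs(taus):
    # p(edge) = (tau_max - tau_min + tau_edge) / T(k), where
    # T(k) = sum over the branch's edges of (tau_max - tau_min + tau_edge)
    t_max, t_min = max(taus), min(taus)
    weights = [t_max - t_min + t for t in taus]
    total = sum(weights)
    return [w / total for w in weights]

probs = edge_probs([0.2, 0.5, 0.3])   # three edges on the chosen branch
print([round(p, 3) for p in probs])   # [0.263, 0.421, 0.316]
```

The τ_max and τ_min terms shift every weight by the same constant, so all edges on the branch keep a nonzero chance of being chosen while the pheromone differences still bias the pick.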
Experimental Parameters

[Table of system parameters: initial pheromone τ0, evaporation rate ρ, number of best k programs used for evaluation, maximum and minimum program depth, tournament size, crossover probability, mutation probability, number of generations, population size. Values not reliably transcribed.]

Functions and Results

Test functions:

    F1: cos(x²) + sin(x²) + x²
    F2: cos(x²) + sin(x²) + x² + cos(x) + sin(x)
    F3: sin(x)·x⁴ + sin(x)·x³ + sin(x)·x² + sin(x)·x

[Table: GP vs SB-GP mean fitness, standard deviation, and p-values for F1–F3 at two population sizes. Values not transcribed.]

F3: Function Couplings

[Figure not transcribed.]
Conclusions

- Statistically significant improvement in performance.
- Useful couplings were learnt.
- The number of successful trials increased.
- Couplings can saturate: use an ACS-style q mechanism to choose randomly some of the time.
More informationABC Optimization: A Co-Operative Learning Approach to Complex Routing Problems
Progress in Nonlinear Dynamics and Chaos Vol. 1, 2013, 39-46 ISSN: 2321 9238 (online) Published on 3 June 2013 www.researchmathsci.org Progress in ABC Optimization: A Co-Operative Learning Approach to
More informationAnt Colony Optimization
DM841 DISCRETE OPTIMIZATION Part 2 Heuristics Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Outline 1. earch 2. Context Inspiration from Nature 3. 4. 5.
More informationSIMULATION APPROACH OF CUTTING TOOL MOVEMENT USING ARTIFICIAL INTELLIGENCE METHOD
Journal of Engineering Science and Technology Special Issue on 4th International Technical Conference 2014, June (2015) 35-44 School of Engineering, Taylor s University SIMULATION APPROACH OF CUTTING TOOL
More informationData Analytics and Boolean Algebras
Data Analytics and Boolean Algebras Hans van Thiel November 28, 2012 c Muitovar 2012 KvK Amsterdam 34350608 Passeerdersstraat 76 1016 XZ Amsterdam The Netherlands T: + 31 20 6247137 E: hthiel@muitovar.com
More informationINTRODUCTION TO DATA MINING. Daniel Rodríguez, University of Alcalá
INTRODUCTION TO DATA MINING Daniel Rodríguez, University of Alcalá Outline Knowledge Discovery in Datasets Model Representation Types of models Supervised Unsupervised Evaluation (Acknowledgement: Jesús
More informationDecision Trees: Discussion
Decision Trees: Discussion Machine Learning Spring 2018 The slides are mainly from Vivek Srikumar 1 This lecture: Learning Decision Trees 1. Representation: What are decision trees? 2. Algorithm: Learning
More informationDerivation of Relational Fuzzy Classification Rules Using Evolutionary Computation
Derivation of Relational Fuzzy Classification Rules Using Evolutionary Computation Vahab Akbarzadeh Alireza Sadeghian Marcus V. dos Santos Abstract An evolutionary system for derivation of fuzzy classification
More informationNovel Approach for Image Edge Detection
Novel Approach for Image Edge Detection Pankaj Valand 1, Mayurdhvajsinh Gohil 2, Pragnesh Patel 3 Assistant Professor, Electrical Engg. Dept., DJMIT, Mogar, Anand, Gujarat, India 1 Assistant Professor,
More informationCHAPTER 4 GENETIC ALGORITHM
69 CHAPTER 4 GENETIC ALGORITHM 4.1 INTRODUCTION Genetic Algorithms (GAs) were first proposed by John Holland (Holland 1975) whose ideas were applied and expanded on by Goldberg (Goldberg 1989). GAs is
More informationImproving Tree-Based Classification Rules Using a Particle Swarm Optimization
Improving Tree-Based Classification Rules Using a Particle Swarm Optimization Chi-Hyuck Jun *, Yun-Ju Cho, and Hyeseon Lee Department of Industrial and Management Engineering Pohang University of Science
More information7. Decision or classification trees
7. Decision or classification trees Next we are going to consider a rather different approach from those presented so far to machine learning that use one of the most common and important data structure,
More informationData Mining Concepts & Techniques
Data Mining Concepts & Techniques Lecture No. 03 Data Processing, Data Mining Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology
More informationThe k-means Algorithm and Genetic Algorithm
The k-means Algorithm and Genetic Algorithm k-means algorithm Genetic algorithm Rough set approach Fuzzy set approaches Chapter 8 2 The K-Means Algorithm The K-Means algorithm is a simple yet effective
More informationEscaping Local Optima: Genetic Algorithm
Artificial Intelligence Escaping Local Optima: Genetic Algorithm Dae-Won Kim School of Computer Science & Engineering Chung-Ang University We re trying to escape local optima To achieve this, we have learned
More informationA new improved ant colony algorithm with levy mutation 1
Acta Technica 62, No. 3B/2017, 27 34 c 2017 Institute of Thermomechanics CAS, v.v.i. A new improved ant colony algorithm with levy mutation 1 Zhang Zhixin 2, Hu Deji 2, Jiang Shuhao 2, 3, Gao Linhua 2,
More informationMachine Learning in Telecommunications
Machine Learning in Telecommunications Paulos Charonyktakis & Maria Plakia Department of Computer Science, University of Crete Institute of Computer Science, FORTH Roadmap Motivation Supervised Learning
More informationBest First and Greedy Search Based CFS and Naïve Bayes Algorithms for Hepatitis Diagnosis
Best First and Greedy Search Based CFS and Naïve Bayes Algorithms for Hepatitis Diagnosis CHAPTER 3 BEST FIRST AND GREEDY SEARCH BASED CFS AND NAÏVE BAYES ALGORITHMS FOR HEPATITIS DIAGNOSIS 3.1 Introduction
More informationSimplicial Global Optimization
Simplicial Global Optimization Julius Žilinskas Vilnius University, Lithuania September, 7 http://web.vu.lt/mii/j.zilinskas Global optimization Find f = min x A f (x) and x A, f (x ) = f, where A R n.
More informationGenetic Algorithms and Genetic Programming. Lecture 9: (23/10/09)
Genetic Algorithms and Genetic Programming Lecture 9: (23/10/09) Genetic programming II Michael Herrmann michael.herrmann@ed.ac.uk, phone: 0131 6 517177, Informatics Forum 1.42 Overview 1. Introduction:
More informationAn Ant Approach to the Flow Shop Problem
An Ant Approach to the Flow Shop Problem Thomas Stützle TU Darmstadt, Computer Science Department Alexanderstr. 10, 64283 Darmstadt Phone: +49-6151-166651, Fax +49-6151-165326 email: stuetzle@informatik.tu-darmstadt.de
More information9/6/14. Our first learning algorithm. Comp 135 Introduction to Machine Learning and Data Mining. knn Algorithm. knn Algorithm (simple form)
Comp 135 Introduction to Machine Learning and Data Mining Our first learning algorithm How would you classify the next example? Fall 2014 Professor: Roni Khardon Computer Science Tufts University o o o
More informationAssociation Rule Mining and Clustering
Association Rule Mining and Clustering Lecture Outline: Classification vs. Association Rule Mining vs. Clustering Association Rule Mining Clustering Types of Clusters Clustering Algorithms Hierarchical:
More informationGradient Descent. 1) S! initial state 2) Repeat: Similar to: - hill climbing with h - gradient descent over continuous space
Local Search 1 Local Search Light-memory search method No search tree; only the current state is represented! Only applicable to problems where the path is irrelevant (e.g., 8-queen), unless the path is
More informationData Mining. Covering algorithms. Covering approach At each stage you identify a rule that covers some of instances. Fig. 4.
Data Mining Chapter 4. Algorithms: The Basic Methods (Covering algorithm, Association rule, Linear models, Instance-based learning, Clustering) 1 Covering approach At each stage you identify a rule that
More informationAutomatic Design of Ant Algorithms with Grammatical Evolution
Automatic Design of Ant Algorithms with Grammatical Evolution Jorge Tavares 1 and Francisco B. Pereira 1,2 CISUC, Department of Informatics Engineering, University of Coimbra Polo II - Pinhal de Marrocos,
More informationMachine Learning. Decision Trees. Manfred Huber
Machine Learning Decision Trees Manfred Huber 2015 1 Decision Trees Classifiers covered so far have been Non-parametric (KNN) Probabilistic with independence (Naïve Bayes) Linear in features (Logistic
More informationAn Evolutionary Algorithm for Minimizing Multimodal Functions
An Evolutionary Algorithm for Minimizing Multimodal Functions D.G. Sotiropoulos, V.P. Plagianakos and M.N. Vrahatis University of Patras, Department of Mamatics, Division of Computational Mamatics & Informatics,
More informationRelationship between Genetic Algorithms and Ant Colony Optimization Algorithms
Relationship between Genetic Algorithms and Ant Colony Optimization Algorithms Osvaldo Gómez Universidad Nacional de Asunción Centro Nacional de Computación Asunción, Paraguay ogomez@cnc.una.py and Benjamín
More informationCS Machine Learning
CS 60050 Machine Learning Decision Tree Classifier Slides taken from course materials of Tan, Steinbach, Kumar 10 10 Illustrating Classification Task Tid Attrib1 Attrib2 Attrib3 Class 1 Yes Large 125K
More information