The k-means Algorithm and Genetic Algorithm

Size: px
Start display at page:

Download "The k-means Algorithm and Genetic Algorithm"

Transcription

1 The k-means Algorithm and Genetic Algorithm

2 k-means algorithm Genetic algorithm Rough set approach Fuzzy set approaches Chapter 8 2

3 The K-Means Algorithm The K-Means algorithm is a simple yet effective statistical clustering technique. Here is the algorithm: 1. Choose a value for K, the total number of clusters to be determined. 2. Choose K instances (data points) within the dataset at random. These are the initial cluster centers. 3. Use simple Euclidean distance to assign the remaining instances to their closest cluster center. Chapter 8 3

4 The K-Means Algorithm 4. Use the instances in each cluster to calculate a new mean for each cluster. 5. If the new mean values are identical to the mean values of the previous iteration the process terminates. Otherwise, use the new means as cluster centers and repeat steps 3-5. Chapter 8 4

5 The K-Means Algorithm An Example Using K-Means Chapter 8 5

6 The K-Means Algorithm An Example Using K-Means Chapter 8 6

7 The K-Means Algorithm General Considerations Chapter 8 7

8 The K-Means Algorithm General Considerations Chapter 8 8

9 All instances correspond to points in the n-d space. The nearest neighbor are defined in terms of Euclidean distance. The target function could be discrete- or realvalued _ xq _ Chapter 8 9

10 For discrete-valued, the k-nn returns the most common value among the k training examples nearest to xq. Vonoroi diagram: the decision surface induced by 1-NN for a typical set of training examples _ xq _ Chapter 8 10

11 The k-nn algorithm for continuous-valued target functions Calculate the mean values of the k nearest neighbors Distance-weighted nearest neighbor algorithm Weight the contribution of each of the k neighbors according to their distance to the query point x q giving greater weight to closer neighbors Similarly, for real-valued 1 target functions w d ( xq, x i ) 2 Chapter 8 11

12 Genetic Learning Here we present a basic genetic learning algorithm. 1. Initialize a population P of n elements, often referred to as chromosomes, as a potential solution. 2. Until a specified termination condition is satisfied: a. Use a fitness function to evaluate each element of the current solution. If an element passes the fitness criteria, it remains in P. b. The population now contains m elements (m<=n). Use genetic operators to create (n-m) new elements. Add the new elements to the population. Chapter 8 12

13 Genetic Learning Genetic Algorithms and Supervised Learning Chapter 8 13

14 Genetic Learning Genetic Algorithms and Supervised Learning Chapter 8 14

15 Genetic Learning Genetic Algorithms and Supervised Learning Chapter 8 15

16 Genetic Learning Genetic Algorithms and Supervised Learning Chapter 8 16

17 Genetic Learning Genetic Algorithms and... Supervised Learning Chapter 8 17

18 Genetic Learning Genetic Algorithms and..unsupervised Clustering Chapter 8 18

19 Genetic Learning Genetic Algorithms and Unsupervised Clustering Chapter 8 19

20 Genetic Learning General Considerations Here is a list of considerations when using a problem-solving approach based on genetic learning: Genetic algorithms are designed to find globally optimized solutions. However, there is no guarantee that any given solution is not the result of a local rather than a global optimization. The fitness function determines the computational complexity of a genetic algorithm. A fitness function involving several calculations can be computationally expensive. Chapter 8 20

21 Genetic Learning General Considerations Genetic algorithms explain their results to the extent that the fitness function is understandable. Transforming the data to form suitable for a genetic algorithm can be a challenge. Chapter 8 21

22 GA: based on an analogy to biological evolution Each rule is represented by a string of bits An initial population is created consisting of randomly generated rules Based on the notion of survival of the fittest, a new population is formed to consists of the fittest rules and their offsprings The fitness of a rule is represented by its classification accuracy on a set of training examples Offsprings are generated by crossover and mutation Chapter 8 22

23 Population-based technique for discovery of...knowledge structures Based on idea that evolution represents search for optimum solution set Massively parallel Chapter 8 23

24 Population Set of individuals, each represented by one or more strings of characters Chromosome The string representing an individual Chapter 8 24

25 Chromosome Gene (Allele="0") Locus=5 Gene The basic informational unit on a chromosome Allele :The value of a specific gene Locus : The ordinal place... on a chromosome where a specific gene is found Chapter 8 25

26 Reproduction Increase representations of strong individuals Crossover Explore the search space Mutation Recapture lost genes due to crossover Chapter 8 26

27 Parent 1: Parent 2: Simple reproduction Offspring 1: Offspring 2: Parent 1: Parent 2: Reproduction with Offspring crossover at locus 3 1: Offspring 2: Parent 1: Parent 2: Offspring 1: Simple reproduction with mutation at locus 3 for offspring 1 Offspring 2: Chapter 8 27

28 Ability of an individual to survive into the next generation Survival of the fittest Usually calculated in terms of an objective fitness function Maximization Minimization Other functions Chapter 8 28

29 Based on adaptation and evolution Structures undergoing adaptation are computer programs of varying size and shape Computer programs are genetically bred over time Chapter 8 29

30 Rule-based knowledge discovery and concept learning tool Operates by means of evaluation, credit assignment, and discovery applied to a population of chromosomes (rules) each with a corresponding phenotype (outcome) Chapter 8 30

31 Performance Provides interaction between environment and rule base Performs matching function Reinforcement Rewards accurate classifiers Punishes inaccurate classifiers Discovery Uses the genetic algorithm to search for plausible rules Chapter 8 31

32 Rough sets are used to approximately or roughly define equivalent classes A rough set for a given class C is approximated by two sets: a lower approximation (certain to be in C) an upper approximation (cannot be described as not belonging to C) Chapter 8 32

33 Fuzzy logic uses truth values between 0.0 and 1.0 to represent the degree of membership (such as using fuzzy membership graph) Attribute values are converted to fuzzy values e.g., income is mapped into the discrete categories {low, medium, high} with fuzzy values calculated Chapter 8 33

34 For a given new sample, more than one fuzzy value may apply Each applicable rule contributes a vote for membership in the categories Typically, the truth values for each predicted category are summed. Chapter 8 34

35 Chapter Summary The K-Means algorithm is a statistical unsupervised clustering technique. All input attributes to the algorithm must be numeric and the user is required to make a decision about... how many clusters are to be discovered. The algorithm begins by randomly choosing one data point to represent each cluster. Each data instance is then placed in the cluster to which it is most similar. New cluster centers are computed and the process continues until...the cluster centers do not change. Chapter 8 35

36 Chapter Summary The K-Means algorithm is easy to implement and understand. However, the algorithm is not guaranteed to converge to a globally optimal solution, lacks the ability to explain what has been found, unable to tell which attributes are significant in determining the formed clusters. Despite these limitations, the K-Means algorithm is among the most widely used clustering techniques. Chapter 8 36

37 Chapter Summary Genetic algorithms apply the theory of evolution to inductive learning. Genetic learning can be supervised...or...unsupervised typically used for problems that cannot be solved with traditional techniques. A standard genetic approach to learning applies a fitness function to a set of data elements to determine... which elements survive from one generation to the next. Chapter 8 37

38 Chapter Summary Those elements not surviving are used to create new instances to replace deleted elements. In addition to being used for supervised learning and unsupervised clustering, genetic techniques can be employed in conjunction with other learning techniques. Chapter 8 38

39 Key Terms Affinity analysis. The process of determining which things are typically grouped together. Confidence. Given a rule of the form If A then B, confidence is defined as the conditional probability that B is true when A is known to be true. Crossover. A genetic learning operation that creates new population elements by combining parts of two or more elements from the current population. Chapter 8 39

40 Key Terms Genetic algorithm. A data mining technique based on the theory of evolution. Mutation. A genetic learning operation that creates a new population element by randomly modifying a portion of an existing element. Selection. A genetic learning operation that adds copies of current population elements with high fitness scores to the next generation of the population. Chapter 8 40

41 Data Mining: Concepts and Techniques (Chapter 7 Slide for textbook), Jiawei Han and Micheline Kamber, Intelligent Database Systems Research Lab, School of Computing Science, Simon Fraser University, Canada Chapter 8 41

Basic Data Mining Technique

Basic Data Mining Technique Basic Data Mining Technique What is classification? What is prediction? Supervised and Unsupervised Learning Decision trees Association rule K-nearest neighbor classifier Case-based reasoning Genetic algorithm

More information

Preprocessing of Stream Data using Attribute Selection based on Survival of the Fittest

Preprocessing of Stream Data using Attribute Selection based on Survival of the Fittest Preprocessing of Stream Data using Attribute Selection based on Survival of the Fittest Bhakti V. Gavali 1, Prof. Vivekanand Reddy 2 1 Department of Computer Science and Engineering, Visvesvaraya Technological

More information

CHAPTER 5 ENERGY MANAGEMENT USING FUZZY GENETIC APPROACH IN WSN

CHAPTER 5 ENERGY MANAGEMENT USING FUZZY GENETIC APPROACH IN WSN 97 CHAPTER 5 ENERGY MANAGEMENT USING FUZZY GENETIC APPROACH IN WSN 5.1 INTRODUCTION Fuzzy systems have been applied to the area of routing in ad hoc networks, aiming to obtain more adaptive and flexible

More information

Unsupervised Learning

Unsupervised Learning Outline Unsupervised Learning Basic concepts K-means algorithm Representation of clusters Hierarchical clustering Distance functions Which clustering algorithm to use? NN Supervised learning vs. unsupervised

More information

Optimization of Association Rule Mining through Genetic Algorithm

Optimization of Association Rule Mining through Genetic Algorithm Optimization of Association Rule Mining through Genetic Algorithm RUPALI HALDULAKAR School of Information Technology, Rajiv Gandhi Proudyogiki Vishwavidyalaya Bhopal, Madhya Pradesh India Prof. JITENDRA

More information

Evolutionary Algorithms. CS Evolutionary Algorithms 1

Evolutionary Algorithms. CS Evolutionary Algorithms 1 Evolutionary Algorithms CS 478 - Evolutionary Algorithms 1 Evolutionary Computation/Algorithms Genetic Algorithms l Simulate natural evolution of structures via selection and reproduction, based on performance

More information

Introduction to Evolutionary Computation

Introduction to Evolutionary Computation Introduction to Evolutionary Computation The Brought to you by (insert your name) The EvoNet Training Committee Some of the Slides for this lecture were taken from the Found at: www.cs.uh.edu/~ceick/ai/ec.ppt

More information

Topic 1 Classification Alternatives

Topic 1 Classification Alternatives Topic 1 Classification Alternatives [Jiawei Han, Micheline Kamber, Jian Pei. 2011. Data Mining Concepts and Techniques. 3 rd Ed. Morgan Kaufmann. ISBN: 9380931913.] 1 Contents 2. Classification Using Frequent

More information

Heuristic Optimisation

Heuristic Optimisation Heuristic Optimisation Part 10: Genetic Algorithm Basics Sándor Zoltán Németh http://web.mat.bham.ac.uk/s.z.nemeth s.nemeth@bham.ac.uk University of Birmingham S Z Németh (s.nemeth@bham.ac.uk) Heuristic

More information

CONCEPT FORMATION AND DECISION TREE INDUCTION USING THE GENETIC PROGRAMMING PARADIGM

CONCEPT FORMATION AND DECISION TREE INDUCTION USING THE GENETIC PROGRAMMING PARADIGM 1 CONCEPT FORMATION AND DECISION TREE INDUCTION USING THE GENETIC PROGRAMMING PARADIGM John R. Koza Computer Science Department Stanford University Stanford, California 94305 USA E-MAIL: Koza@Sunburn.Stanford.Edu

More information

MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS

MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS J.I. Serrano M.D. Del Castillo Instituto de Automática Industrial CSIC. Ctra. Campo Real km.0 200. La Poveda. Arganda del Rey. 28500

More information

Advanced Search Genetic algorithm

Advanced Search Genetic algorithm Advanced Search Genetic algorithm Yingyu Liang yliang@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison [Based on slides from Jerry Zhu, Andrew Moore http://www.cs.cmu.edu/~awm/tutorials

More information

CHAPTER 6 HYBRID AI BASED IMAGE CLASSIFICATION TECHNIQUES

CHAPTER 6 HYBRID AI BASED IMAGE CLASSIFICATION TECHNIQUES CHAPTER 6 HYBRID AI BASED IMAGE CLASSIFICATION TECHNIQUES 6.1 INTRODUCTION The exploration of applications of ANN for image classification has yielded satisfactory results. But, the scope for improving

More information

Monika Maharishi Dayanand University Rohtak

Monika Maharishi Dayanand University Rohtak Performance enhancement for Text Data Mining using k means clustering based genetic optimization (KMGO) Monika Maharishi Dayanand University Rohtak ABSTRACT For discovering hidden patterns and structures

More information

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset. Glossary of data mining terms: Accuracy Accuracy is an important factor in assessing the success of data mining. When applied to data, accuracy refers to the rate of correct values in the data. When applied

More information

Genetic Programming. Charles Chilaka. Department of Computational Science Memorial University of Newfoundland

Genetic Programming. Charles Chilaka. Department of Computational Science Memorial University of Newfoundland Genetic Programming Charles Chilaka Department of Computational Science Memorial University of Newfoundland Class Project for Bio 4241 March 27, 2014 Charles Chilaka (MUN) Genetic algorithms and programming

More information

Evolving SQL Queries for Data Mining

Evolving SQL Queries for Data Mining Evolving SQL Queries for Data Mining Majid Salim and Xin Yao School of Computer Science, The University of Birmingham Edgbaston, Birmingham B15 2TT, UK {msc30mms,x.yao}@cs.bham.ac.uk Abstract. This paper

More information

Introduction to Genetic Algorithms. Based on Chapter 10 of Marsland Chapter 9 of Mitchell

Introduction to Genetic Algorithms. Based on Chapter 10 of Marsland Chapter 9 of Mitchell Introduction to Genetic Algorithms Based on Chapter 10 of Marsland Chapter 9 of Mitchell Genetic Algorithms - History Pioneered by John Holland in the 1970s Became popular in the late 1980s Based on ideas

More information

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra Pattern Recall Analysis of the Hopfield Neural Network with a Genetic Algorithm Susmita Mohapatra Department of Computer Science, Utkal University, India Abstract: This paper is focused on the implementation

More information

Machine Learning: Algorithms and Applications Mockup Examination

Machine Learning: Algorithms and Applications Mockup Examination Machine Learning: Algorithms and Applications Mockup Examination 14 May 2012 FIRST NAME STUDENT NUMBER LAST NAME SIGNATURE Instructions for students Write First Name, Last Name, Student Number and Signature

More information

Introduction to Design Optimization: Search Methods

Introduction to Design Optimization: Search Methods Introduction to Design Optimization: Search Methods 1-D Optimization The Search We don t know the curve. Given α, we can calculate f(α). By inspecting some points, we try to find the approximated shape

More information

RESOLVING AMBIGUITIES IN PREPOSITION PHRASE USING GENETIC ALGORITHM

RESOLVING AMBIGUITIES IN PREPOSITION PHRASE USING GENETIC ALGORITHM International Journal of Computer Engineering and Applications, Volume VIII, Issue III, December 14 RESOLVING AMBIGUITIES IN PREPOSITION PHRASE USING GENETIC ALGORITHM Department of Computer Engineering,

More information

Grid-Based Genetic Algorithm Approach to Colour Image Segmentation

Grid-Based Genetic Algorithm Approach to Colour Image Segmentation Grid-Based Genetic Algorithm Approach to Colour Image Segmentation Marco Gallotta Keri Woods Supervised by Audrey Mbogho Image Segmentation Identifying and extracting distinct, homogeneous regions from

More information

Lecture-17: Clustering with K-Means (Contd: DT + Random Forest)

Lecture-17: Clustering with K-Means (Contd: DT + Random Forest) Lecture-17: Clustering with K-Means (Contd: DT + Random Forest) Medha Vidyotma April 24, 2018 1 Contd. Random Forest For Example, if there are 50 scholars who take the measurement of the length of the

More information

Neural Network Weight Selection Using Genetic Algorithms

Neural Network Weight Selection Using Genetic Algorithms Neural Network Weight Selection Using Genetic Algorithms David Montana presented by: Carl Fink, Hongyi Chen, Jack Cheng, Xinglong Li, Bruce Lin, Chongjie Zhang April 12, 2005 1 Neural Networks Neural networks

More information

GENETIC ALGORITHM with Hands-On exercise

GENETIC ALGORITHM with Hands-On exercise GENETIC ALGORITHM with Hands-On exercise Adopted From Lecture by Michael Negnevitsky, Electrical Engineering & Computer Science University of Tasmania 1 Objective To understand the processes ie. GAs Basic

More information

Neuro-fuzzy, GA-Fuzzy, Neural-Fuzzy-GA: A Data Mining Technique for Optimization

Neuro-fuzzy, GA-Fuzzy, Neural-Fuzzy-GA: A Data Mining Technique for Optimization International Journal of Computer Science and Software Engineering Volume 3, Number 1 (2017), pp. 1-9 International Research Publication House http://www.irphouse.com Neuro-fuzzy, GA-Fuzzy, Neural-Fuzzy-GA:

More information

A Genetic Algorithm for Graph Matching using Graph Node Characteristics 1 2

A Genetic Algorithm for Graph Matching using Graph Node Characteristics 1 2 Chapter 5 A Genetic Algorithm for Graph Matching using Graph Node Characteristics 1 2 Graph Matching has attracted the exploration of applying new computing paradigms because of the large number of applications

More information

Review on Data Mining Techniques for Intrusion Detection System

Review on Data Mining Techniques for Intrusion Detection System Review on Data Mining Techniques for Intrusion Detection System Sandeep D 1, M. S. Chaudhari 2 Research Scholar, Dept. of Computer Science, P.B.C.E, Nagpur, India 1 HoD, Dept. of Computer Science, P.B.C.E,

More information

CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM

CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM 20 CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM 2.1 CLASSIFICATION OF CONVENTIONAL TECHNIQUES Classical optimization methods can be classified into two distinct groups:

More information

Maharashtra, India. I. INTRODUCTION. A. Data Mining

Maharashtra, India. I. INTRODUCTION. A. Data Mining Knowledge Discovery in Data Mining Using Soft Computing Techniques A Comparative Analysis 1 Rasika Kuware and 2 V.P.Mahatme, 1 Student (MTech), 2 Associate Professor, 1,2 Computer Science and Engineering

More information

IJMIE Volume 2, Issue 9 ISSN:

IJMIE Volume 2, Issue 9 ISSN: Dimensionality Using Optimization Algorithm for High Dimensional Data Clustering Saranya.S* Dr.Punithavalli.M** Abstract: This paper present an efficient approach to a feature selection problem based on

More information

Inducing Parameters of a Decision Tree for Expert System Shell McESE by Genetic Algorithm

Inducing Parameters of a Decision Tree for Expert System Shell McESE by Genetic Algorithm Inducing Parameters of a Decision Tree for Expert System Shell McESE by Genetic Algorithm I. Bruha and F. Franek Dept of Computing & Software, McMaster University Hamilton, Ont., Canada, L8S4K1 Email:

More information

Genetic Algorithm for Finding Shortest Path in a Network

Genetic Algorithm for Finding Shortest Path in a Network Intern. J. Fuzzy Mathematical Archive Vol. 2, 2013, 43-48 ISSN: 2320 3242 (P), 2320 3250 (online) Published on 26 August 2013 www.researchmathsci.org International Journal of Genetic Algorithm for Finding

More information

Review: Final Exam CPSC Artificial Intelligence Michael M. Richter

Review: Final Exam CPSC Artificial Intelligence Michael M. Richter Review: Final Exam Model for a Learning Step Learner initially Environm ent Teacher Compare s pe c ia l Information Control Correct Learning criteria Feedback changed Learner after Learning Learning by

More information

K Nearest Neighbor Wrap Up K- Means Clustering. Slides adapted from Prof. Carpuat

K Nearest Neighbor Wrap Up K- Means Clustering. Slides adapted from Prof. Carpuat K Nearest Neighbor Wrap Up K- Means Clustering Slides adapted from Prof. Carpuat K Nearest Neighbor classification Classification is based on Test instance with Training Data K: number of neighbors that

More information

Unsupervised Learning: Clustering

Unsupervised Learning: Clustering Unsupervised Learning: Clustering Vibhav Gogate The University of Texas at Dallas Slides adapted from Carlos Guestrin, Dan Klein & Luke Zettlemoyer Machine Learning Supervised Learning Unsupervised Learning

More information

Classification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska

Classification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska Classification Lecture Notes cse352 Neural Networks Professor Anita Wasilewska Neural Networks Classification Introduction INPUT: classification data, i.e. it contains an classification (class) attribute

More information

Automated Test Data Generation and Optimization Scheme Using Genetic Algorithm

Automated Test Data Generation and Optimization Scheme Using Genetic Algorithm 2011 International Conference on Software and Computer Applications IPCSIT vol.9 (2011) (2011) IACSIT Press, Singapore Automated Test Data Generation and Optimization Scheme Using Genetic Algorithm Roshni

More information

Application of Genetic Algorithm Based Intuitionistic Fuzzy k-mode for Clustering Categorical Data

Application of Genetic Algorithm Based Intuitionistic Fuzzy k-mode for Clustering Categorical Data BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 17, No 4 Sofia 2017 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.1515/cait-2017-0044 Application of Genetic Algorithm

More information

Unsupervised Learning : Clustering

Unsupervised Learning : Clustering Unsupervised Learning : Clustering Things to be Addressed Traditional Learning Models. Cluster Analysis K-means Clustering Algorithm Drawbacks of traditional clustering algorithms. Clustering as a complex

More information

Applying genetic algorithm on power system stabilizer for stabilization of power system

Applying genetic algorithm on power system stabilizer for stabilization of power system Applying genetic algorithm on power system stabilizer for stabilization of power system 1,3 Arnawan Hasibuan and 2,3 Syafrudin 1 Engineering Department of Malikussaleh University, Lhokseumawe, Indonesia;

More information

Chapter 28. Outline. Definitions of Data Mining. Data Mining Concepts

Chapter 28. Outline. Definitions of Data Mining. Data Mining Concepts Chapter 28 Data Mining Concepts Outline Data Mining Data Warehousing Knowledge Discovery in Databases (KDD) Goals of Data Mining and Knowledge Discovery Association Rules Additional Data Mining Algorithms

More information

A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery

A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery Alex A. Freitas Postgraduate Program in Computer Science, Pontificia Universidade Catolica do Parana Rua Imaculada Conceicao,

More information

Lecture 8: Genetic Algorithms

Lecture 8: Genetic Algorithms Lecture 8: Genetic Algorithms Cognitive Systems - Machine Learning Part II: Special Aspects of Concept Learning Genetic Algorithms, Genetic Programming, Models of Evolution last change December 1, 2010

More information

Research on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a

Research on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a International Conference on Education Technology, Management and Humanities Science (ETMHS 2015) Research on Applications of Data Mining in Electronic Commerce Xiuping YANG 1, a 1 Computer Science Department,

More information

BBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler

BBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler BBS654 Data Mining Pinar Duygulu Slides are adapted from Nazli Ikizler 1 Classification Classification systems: Supervised learning Make a rational prediction given evidence There are several methods for

More information

Time Complexity Analysis of the Genetic Algorithm Clustering Method

Time Complexity Analysis of the Genetic Algorithm Clustering Method Time Complexity Analysis of the Genetic Algorithm Clustering Method Z. M. NOPIAH, M. I. KHAIRIR, S. ABDULLAH, M. N. BAHARIN, and A. ARIFIN Department of Mechanical and Materials Engineering Universiti

More information

CLUSTERING. CSE 634 Data Mining Prof. Anita Wasilewska TEAM 16

CLUSTERING. CSE 634 Data Mining Prof. Anita Wasilewska TEAM 16 CLUSTERING CSE 634 Data Mining Prof. Anita Wasilewska TEAM 16 1. K-medoids: REFERENCES https://www.coursera.org/learn/cluster-analysis/lecture/nj0sb/3-4-the-k-medoids-clustering-method https://anuradhasrinivas.files.wordpress.com/2013/04/lesson8-clustering.pdf

More information

Escaping Local Optima: Genetic Algorithm

Escaping Local Optima: Genetic Algorithm Artificial Intelligence Escaping Local Optima: Genetic Algorithm Dae-Won Kim School of Computer Science & Engineering Chung-Ang University We re trying to escape local optima To achieve this, we have learned

More information

Mutations for Permutations

Mutations for Permutations Mutations for Permutations Insert mutation: Pick two allele values at random Move the second to follow the first, shifting the rest along to accommodate Note: this preserves most of the order and adjacency

More information

Classification of Concept-Drifting Data Streams using Optimized Genetic Algorithm

Classification of Concept-Drifting Data Streams using Optimized Genetic Algorithm Classification of Concept-Drifting Data Streams using Optimized Genetic Algorithm E. Padmalatha Asst.prof CBIT C.R.K. Reddy, PhD Professor CBIT B. Padmaja Rani, PhD Professor JNTUH ABSTRACT Data Stream

More information

Evolutionary Computation. Chao Lan

Evolutionary Computation. Chao Lan Evolutionary Computation Chao Lan Outline Introduction Genetic Algorithm Evolutionary Strategy Genetic Programming Introduction Evolutionary strategy can jointly optimize multiple variables. - e.g., max

More information

Introduction to Genetic Algorithms. Genetic Algorithms

Introduction to Genetic Algorithms. Genetic Algorithms Introduction to Genetic Algorithms Genetic Algorithms We ve covered enough material that we can write programs that use genetic algorithms! More advanced example of using arrays Could be better written

More information

SOCIAL MEDIA MINING. Data Mining Essentials

SOCIAL MEDIA MINING. Data Mining Essentials SOCIAL MEDIA MINING Data Mining Essentials Dear instructors/users of these slides: Please feel free to include these slides in your own material, or modify them as you see fit. If you decide to incorporate

More information

Introduction to Design Optimization: Search Methods

Introduction to Design Optimization: Search Methods Introduction to Design Optimization: Search Methods 1-D Optimization The Search We don t know the curve. Given α, we can calculate f(α). By inspecting some points, we try to find the approximated shape

More information

What to come. There will be a few more topics we will cover on supervised learning

What to come. There will be a few more topics we will cover on supervised learning Summary so far Supervised learning learn to predict Continuous target regression; Categorical target classification Linear Regression Classification Discriminative models Perceptron (linear) Logistic regression

More information

Clustering Analysis of Simple K Means Algorithm for Various Data Sets in Function Optimization Problem (Fop) of Evolutionary Programming

Clustering Analysis of Simple K Means Algorithm for Various Data Sets in Function Optimization Problem (Fop) of Evolutionary Programming Clustering Analysis of Simple K Means Algorithm for Various Data Sets in Function Optimization Problem (Fop) of Evolutionary Programming R. Karthick 1, Dr. Malathi.A 2 Research Scholar, Department of Computer

More information

Naïve Bayes for text classification

Naïve Bayes for text classification Road Map Basic concepts Decision tree induction Evaluation of classifiers Rule induction Classification using association rules Naïve Bayesian classification Naïve Bayes for text classification Support

More information

A Naïve Soft Computing based Approach for Gene Expression Data Analysis

A Naïve Soft Computing based Approach for Gene Expression Data Analysis Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2124 2128 International Conference on Modeling Optimization and Computing (ICMOC-2012) A Naïve Soft Computing based Approach for

More information

The Genetic Algorithm for finding the maxima of single-variable functions

The Genetic Algorithm for finding the maxima of single-variable functions Research Inventy: International Journal Of Engineering And Science Vol.4, Issue 3(March 2014), PP 46-54 Issn (e): 2278-4721, Issn (p):2319-6483, www.researchinventy.com The Genetic Algorithm for finding

More information

Genetic Algorithms for Vision and Pattern Recognition

Genetic Algorithms for Vision and Pattern Recognition Genetic Algorithms for Vision and Pattern Recognition Faiz Ul Wahab 11/8/2014 1 Objective To solve for optimization of computer vision problems using genetic algorithms 11/8/2014 2 Timeline Problem: Computer

More information

Artificial Intelligence Application (Genetic Algorithm)

Artificial Intelligence Application (Genetic Algorithm) Babylon University College of Information Technology Software Department Artificial Intelligence Application (Genetic Algorithm) By Dr. Asaad Sabah Hadi 2014-2015 EVOLUTIONARY ALGORITHM The main idea about

More information

Uninformed Search Methods. Informed Search Methods. Midterm Exam 3/13/18. Thursday, March 15, 7:30 9:30 p.m. room 125 Ag Hall

Uninformed Search Methods. Informed Search Methods. Midterm Exam 3/13/18. Thursday, March 15, 7:30 9:30 p.m. room 125 Ag Hall Midterm Exam Thursday, March 15, 7:30 9:30 p.m. room 125 Ag Hall Covers topics through Decision Trees and Random Forests (does not include constraint satisfaction) Closed book 8.5 x 11 sheet with notes

More information

Data Preprocessing. Supervised Learning

Data Preprocessing. Supervised Learning Supervised Learning Regression Given the value of an input X, the output Y belongs to the set of real values R. The goal is to predict output accurately for a new input. The predictions or outputs y are

More information

Artificial Intelligence. Programming Styles

Artificial Intelligence. Programming Styles Artificial Intelligence Intro to Machine Learning Programming Styles Standard CS: Explicitly program computer to do something Early AI: Derive a problem description (state) and use general algorithms to

More information

Reducing Graphic Conflict In Scale Reduced Maps Using A Genetic Algorithm

Reducing Graphic Conflict In Scale Reduced Maps Using A Genetic Algorithm Reducing Graphic Conflict In Scale Reduced Maps Using A Genetic Algorithm Dr. Ian D. Wilson School of Technology, University of Glamorgan, Pontypridd CF37 1DL, UK Dr. J. Mark Ware School of Computing,

More information

Introduction to Genetic Algorithms

Introduction to Genetic Algorithms Advanced Topics in Image Analysis and Machine Learning Introduction to Genetic Algorithms Week 3 Faculty of Information Science and Engineering Ritsumeikan University Today s class outline Genetic Algorithms

More information

Genetic Algorithms for Classification and Feature Extraction

Genetic Algorithms for Classification and Feature Extraction Genetic Algorithms for Classification and Feature Extraction Min Pei, Erik D. Goodman, William F. Punch III and Ying Ding, (1995), Genetic Algorithms For Classification and Feature Extraction, Michigan

More information

IDS Using Machine Learning Techniques

IDS Using Machine Learning Techniques Overview IDS Using Machine Learning Techniques COMP 290-40 Brian Begnoche March 23, 2005 What is ML? Why use ML with IDS? ML methods 3 examples ML methods 2 examples Using ML to improve existing NIDSs

More information

Knowledge Discovery using PSO and DE Techniques

Knowledge Discovery using PSO and DE Techniques 60 CHAPTER 4 KNOWLEDGE DISCOVERY USING PSO AND DE TECHNIQUES 61 Knowledge Discovery using PSO and DE Techniques 4.1 Introduction In the recent past, there has been an enormous increase in the amount of

More information

Unsupervised Data Mining: Clustering. Izabela Moise, Evangelos Pournaras, Dirk Helbing

Unsupervised Data Mining: Clustering. Izabela Moise, Evangelos Pournaras, Dirk Helbing Unsupervised Data Mining: Clustering Izabela Moise, Evangelos Pournaras, Dirk Helbing Izabela Moise, Evangelos Pournaras, Dirk Helbing 1 1. Supervised Data Mining Classification Regression Outlier detection

More information

Suppose you have a problem You don t know how to solve it What can you do? Can you use a computer to somehow find a solution for you?

Suppose you have a problem You don t know how to solve it What can you do? Can you use a computer to somehow find a solution for you? Gurjit Randhawa Suppose you have a problem You don t know how to solve it What can you do? Can you use a computer to somehow find a solution for you? This would be nice! Can it be done? A blind generate

More information

Clustering & Classification (chapter 15)

Clustering & Classification (chapter 15) Clustering & Classification (chapter 5) Kai Goebel Bill Cheetham RPI/GE Global Research goebel@cs.rpi.edu cheetham@cs.rpi.edu Outline k-means Fuzzy c-means Mountain Clustering knn Fuzzy knn Hierarchical

More information

Unsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi

Unsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi Unsupervised Learning Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi Content Motivation Introduction Applications Types of clustering Clustering criterion functions Distance functions Normalization Which

More information

Redefining and Enhancing K-means Algorithm

Redefining and Enhancing K-means Algorithm Redefining and Enhancing K-means Algorithm Nimrat Kaur Sidhu 1, Rajneet kaur 2 Research Scholar, Department of Computer Science Engineering, SGGSWU, Fatehgarh Sahib, Punjab, India 1 Assistant Professor,

More information

A Parallel Evolutionary Algorithm for Discovery of Decision Rules

A Parallel Evolutionary Algorithm for Discovery of Decision Rules A Parallel Evolutionary Algorithm for Discovery of Decision Rules Wojciech Kwedlo Faculty of Computer Science Technical University of Bia lystok Wiejska 45a, 15-351 Bia lystok, Poland wkwedlo@ii.pb.bialystok.pl

More information

Jarek Szlichta

Jarek Szlichta Jarek Szlichta http://data.science.uoit.ca/ Approximate terminology, though there is some overlap: Data(base) operations Executing specific operations or queries over data Data mining Looking for patterns

More information

Clustering CS 550: Machine Learning

Clustering CS 550: Machine Learning Clustering CS 550: Machine Learning This slide set mainly uses the slides given in the following links: http://www-users.cs.umn.edu/~kumar/dmbook/ch8.pdf http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap8_basic_cluster_analysis.pdf

More information

DATA MINING Introductory and Advanced Topics Part I

DATA MINING Introductory and Advanced Topics Part I DATA MINING Introductory and Advanced Topics Part I Margaret H. Dunham Department of Computer Science and Engineering Southern Methodist University Companion slides for the text by Dr. M.H.Dunham, Data

More information

Information Retrieval and Organisation

Information Retrieval and Organisation Information Retrieval and Organisation Chapter 16 Flat Clustering Dell Zhang Birkbeck, University of London What Is Text Clustering? Text Clustering = Grouping a set of documents into classes of similar

More information

Voronoi Region. K-means method for Signal Compression: Vector Quantization. Compression Formula 11/20/2013

Voronoi Region. K-means method for Signal Compression: Vector Quantization. Compression Formula 11/20/2013 Voronoi Region K-means method for Signal Compression: Vector Quantization Blocks of signals: A sequence of audio. A block of image pixels. Formally: vector example: (0.2, 0.3, 0.5, 0.1) A vector quantizer

More information

CSE4334/5334 DATA MINING

CSE4334/5334 DATA MINING CSE4334/5334 DATA MINING Lecture 4: Classification (1) CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai Li (Slides courtesy

More information

A Steady-State Genetic Algorithm for Traveling Salesman Problem with Pickup and Delivery

A Steady-State Genetic Algorithm for Traveling Salesman Problem with Pickup and Delivery A Steady-State Genetic Algorithm for Traveling Salesman Problem with Pickup and Delivery Monika Sharma 1, Deepak Sharma 2 1 Research Scholar Department of Computer Science and Engineering, NNSS SGI Samalkha,

More information

INTRODUCTION TO DATA MINING. Daniel Rodríguez, University of Alcalá

INTRODUCTION TO DATA MINING. Daniel Rodríguez, University of Alcalá INTRODUCTION TO DATA MINING Daniel Rodríguez, University of Alcalá Outline Knowledge Discovery in Datasets Model Representation Types of models Supervised Unsupervised Evaluation (Acknowledgement: Jesús

More information

Data Mining Concepts

Data Mining Concepts Data Mining Concepts Outline Data Mining Data Warehousing Knowledge Discovery in Databases (KDD) Goals of Data Mining and Knowledge Discovery Association Rules Additional Data Mining Algorithms Sequential

More information

Introduction to Machine Learning. Xiaojin Zhu

Introduction to Machine Learning. Xiaojin Zhu Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006

More information

Accelerated Machine Learning Algorithms in Python

Accelerated Machine Learning Algorithms in Python Accelerated Machine Learning Algorithms in Python Patrick Reilly, Leiming Yu, David Kaeli reilly.pa@husky.neu.edu Northeastern University Computer Architecture Research Lab Outline Motivation and Goals

More information

Data Mining and Hypothesis Refinement Using a Multi-Tiered Genetic Algorithm

Data Mining and Hypothesis Refinement Using a Multi-Tiered Genetic Algorithm Data Mining and Hypothesis Refinement Using a Multi-Tiered Genetic Algorithm Christopher M. Taylor and Arvin Agah Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence,

More information

Study on the Application Analysis and Future Development of Data Mining Technology

Study on the Application Analysis and Future Development of Data Mining Technology Study on the Application Analysis and Future Development of Data Mining Technology Ge ZHU 1, Feng LIN 2,* 1 Department of Information Science and Technology, Heilongjiang University, Harbin 150080, China

More information

A Comparative study of Clustering Algorithms using MapReduce in Hadoop

A Comparative study of Clustering Algorithms using MapReduce in Hadoop A Comparative study of Clustering Algorithms using MapReduce in Hadoop Dweepna Garg 1, Khushboo Trivedi 2, B.B.Panchal 3 1 Department of Computer Science and Engineering, Parul Institute of Engineering

More information

A NOVEL APPROACH FOR PRIORTIZATION OF OPTIMIZED TEST CASES

A NOVEL APPROACH FOR PRIORTIZATION OF OPTIMIZED TEST CASES A NOVEL APPROACH FOR PRIORTIZATION OF OPTIMIZED TEST CASES Abhishek Singhal Amity School of Engineering and Technology Amity University Noida, India asinghal1@amity.edu Swati Chandna Amity School of Engineering

More information

Clustering (Basic concepts and Algorithms) Entscheidungsunterstützungssysteme

Clustering (Basic concepts and Algorithms) Entscheidungsunterstützungssysteme Clustering (Basic concepts and Algorithms) Entscheidungsunterstützungssysteme Why do we need to find similarity? Similarity underlies many data science methods and solutions to business problems. Some

More information

Genetic Fourier Descriptor for the Detection of Rotational Symmetry

Genetic Fourier Descriptor for the Detection of Rotational Symmetry 1 Genetic Fourier Descriptor for the Detection of Rotational Symmetry Raymond K. K. Yip Department of Information and Applied Technology, Hong Kong Institute of Education 10 Lo Ping Road, Tai Po, New Territories,

More information

ARTIFICIAL INTELLIGENCE (CSCU9YE ) LECTURE 5: EVOLUTIONARY ALGORITHMS

ARTIFICIAL INTELLIGENCE (CSCU9YE ) LECTURE 5: EVOLUTIONARY ALGORITHMS ARTIFICIAL INTELLIGENCE (CSCU9YE ) LECTURE 5: EVOLUTIONARY ALGORITHMS Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Optimisation problems Optimisation & search Two Examples The knapsack problem

More information

UGA: A New Genetic Algorithm-Based Classification Method for Uncertain Data

UGA: A New Genetic Algorithm-Based Classification Method for Uncertain Data Middle-East Journal of Scientific Research 20 (10): 1207-1212, 2014 ISSN 1990-9233 IDOSI Publications, 2014 DOI: 10.5829/idosi.mejsr.2014.20.10.12512 UGA: A New Genetic Algorithm-Based Classification Method

More information

Role of Genetic Algorithm in Routing for Large Network

Role of Genetic Algorithm in Routing for Large Network Role of Genetic Algorithm in Routing for Large Network *Mr. Kuldeep Kumar, Computer Programmer, Krishi Vigyan Kendra, CCS Haryana Agriculture University, Hisar. Haryana, India verma1.kuldeep@gmail.com

More information

March 19, Heuristics for Optimization. Outline. Problem formulation. Genetic algorithms

March 19, Heuristics for Optimization. Outline. Problem formulation. Genetic algorithms Olga Galinina olga.galinina@tut.fi ELT-53656 Network Analysis and Dimensioning II Department of Electronics and Communications Engineering Tampere University of Technology, Tampere, Finland March 19, 2014

More information

Genetic Algorithms. Genetic Algorithms

Genetic Algorithms. Genetic Algorithms A biological analogy for optimization problems Bit encoding, models as strings Reproduction and mutation -> natural selection Pseudo-code for a simple genetic algorithm The goal of genetic algorithms (GA):

More information

CS5401 FS2015 Exam 1 Key

CS5401 FS2015 Exam 1 Key CS5401 FS2015 Exam 1 Key This is a closed-book, closed-notes exam. The only items you are allowed to use are writing implements. Mark each sheet of paper you use with your name and the string cs5401fs2015

More information