Supervised vs. Unsupervised Learning

Figure (5) Kohonen Self-Organized Map

11/14/2010 Intelligent Systems and Soft Computing 1

Artificial Neural Networks Unsupervised learning: SOM

Supervised vs. Unsupervised Learning In supervised learning we train algorithms with predefined concepts and functions based on labeled data D = { (x, y) | x ∈ X, y ∈ {yes, no} }. In unsupervised learning we are given a set of instances X and we let the algorithms discover interesting properties of this set. Most unsupervised learning algorithms are based on the idea of discovering similarities between elements of the set X.

The k-means Algorithm The goal is to find k clusters of similar elements in the instance set X. Suppose that we have n instances x_0, x_1, ..., x_{n-1} ∈ X; let k < n; and let m_i be the mean of the instances in cluster i. Then we say that x is in cluster i if ‖x − m_i‖ = min_j ‖x − m_j‖, where 0 ≤ i, j < k and x ∈ X. Here ‖x − m_i‖ is the Euclidean distance between x and m_i. This suggests the following procedure for finding the k means:
Make initial guesses for the means m_0, ..., m_{k-1}
Until there are no changes in any mean:
  Use the estimated means to classify the examples into k clusters
  For i from 0 to k−1:
    m_i ← mean of all of the examples in cluster i
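The procedure above can be sketched in a few lines of Python/NumPy. This is a minimal sketch, not a production implementation: the function name `kmeans`, the random choice of initial means, the iteration cap, and the convergence test are all illustrative choices, not part of the slide's pseudocode.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal k-means sketch; X is an (n, d) array of instances."""
    rng = np.random.default_rng(seed)
    # Initial guesses for the means: k distinct instances chosen at random.
    means = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Classify each instance into the cluster with the nearest mean.
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each mean from its cluster (keep the old mean if empty).
        new_means = np.array([X[labels == i].mean(axis=0)
                              if np.any(labels == i) else means[i]
                              for i in range(k)])
        if np.allclose(new_means, means):  # no change in any mean: done
            break
        means = new_means
    return means, labels

# Two well-separated blobs, k = 2: each blob should form one cluster.
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (20, 2)),
               np.random.default_rng(2).normal(5, 0.1, (20, 2))])
means, labels = kmeans(X, k=2)
```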

The k-means Algorithm Iterations of the k-means algorithm with k = 2.

The k-means Algorithm Pros: fast convergence; conceptually simple. Cons: very sensitive to the choice of the number of clusters k.

Self-Organizing Maps

Self-Organizing Map: SOM A feed-forward neural network architecture based on competitive learning, invented by Teuvo Kohonen in 1981. It does not depend on an a priori selection of the number of clusters to search for: it will find an appropriate number of clusters for the given set of instances. Learning is slower than in k-means clustering because of the competitive learning.

Self-Organization and Learning Self-organization refers to a process in which the internal organization of a system increases automatically, without being guided or managed by an outside source. This process is driven by local interactions following simple rules: local interaction gives rise to global structure. We can interpret the emerging global structures as learned structures, which appear as clusters of similar objects. Complexity: Life at the Edge of Chaos, Roger Lewin, University of Chicago Press, 2nd edition, 2000

Game of Life The most famous example of self-organization is Conway's Game of Life. Simple local rules: Any live cell with fewer than two live neighbours dies, as if by underpopulation. Any live cell with two or three live neighbours lives on to the next generation. Any live cell with more than three live neighbours dies, as if by overcrowding. Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction. Source: http://en.wikipedia.org/wiki/conway's_game_of_life
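The four local rules above fit in a few lines of code. A minimal Python sketch, representing the board as a set of live (row, column) cells (the sparse-set representation and the function name `life_step` are implementation choices, not from the slides):

```python
from collections import Counter

def life_step(live):
    """One generation of the Game of Life; `live` is a set of (row, col) cells."""
    # Count how many live neighbours each candidate cell has.
    counts = Counter((r + dr, c + dc)
                     for (r, c) in live
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0))
    # Birth with exactly three live neighbours; survival with two or three.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A "blinker": three cells in a row oscillate with period two.
blinker = {(1, 0), (1, 1), (1, 2)}
```

Global structure (the oscillation) emerges from purely local rules: no cell "knows" it is part of a blinker.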

SOM Architecture (Figure: neurons m_i in the computational layer above an input x in the input layer.) The SOM has a feed-forward structure with a single computational layer arranged in rows and columns. Each neuron in the computational layer is fully connected to all the nodes in the input layer. The goal is to organize the neurons in the computational layer into clusters/regions associated with patterns in the instance set X.

Self-Organizing Maps (Figure: a data table is mapped onto a grid of neurons for visualization.) Algorithm:
Repeat Until Done
  For each row in Data Table Do
    Find the neuron that best describes the row.
    Make that neuron look more like the row.
    Smooth the immediate neighborhood of that neuron.
  End For
End Repeat

SOMs Sample the Data Space Given some distribution in the data space, SOM will try to construct a sample that looks like it was drawn from the same distribution. Algorithm:
Repeat Until Done
  For each row in Data Table Do
    Find the neuron that best describes the row.
    Make that neuron look more like the row.
    Smooth the immediate neighborhood of that neuron.
  End For
End Repeat
Image source: www.peltarion.com
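The loop above can be sketched in Python/NumPy. This is a minimal sketch under simplifying assumptions: the grid size, the fixed learning rate `lr`, and the Gaussian neighbourhood width `sigma` are hypothetical choices (real SOM implementations typically decay the learning rate and neighbourhood radius over training).

```python
import numpy as np

def train_som(data, grid_h=5, grid_w=5, epochs=50, lr=0.5, sigma=1.5, seed=0):
    """Sketch of the SOM loop: for each row, find the best-matching neuron,
    pull it toward the row, and smooth its neighbourhood on the grid."""
    rng = np.random.default_rng(seed)
    weights = rng.random((grid_h, grid_w, data.shape[1]))  # grid of neurons
    rows = np.arange(grid_h)[:, None]
    cols = np.arange(grid_w)[None, :]
    for _ in range(epochs):
        for x in data:                                     # each row in the data table
            # Find the neuron that best describes the row (best matching unit).
            d = np.linalg.norm(weights - x, axis=2)
            br, bc = np.unravel_index(d.argmin(), d.shape)
            # Gaussian neighbourhood: neurons near the winner on the grid move more.
            h = np.exp(-((rows - br) ** 2 + (cols - bc) ** 2) / (2 * sigma ** 2))
            # Make the winning neuron (and its neighbourhood) look more like the row.
            weights += lr * h[:, :, None] * (x - weights)
    return weights

# Toy run: every row is the same point, so all neurons are pulled toward it.
data = np.full((200, 2), 0.5)
som = train_som(data)
```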

SOM Visualization Visualization of seven clusters using SOM.

Training, Evaluation and Prediction in Unsupervised Learning Since in unsupervised learning we don't have labeled data, we cannot proceed as in the case of supervised learning. However, we can look at the quality of the clusters in terms of cluster density. We can also use the trained clusters to assign unseen instances to clusters.

Training & Evaluation The variance of the distance of each point in a cluster from the centroid or mean is defined as follows:

E_k = Σ_{i=0}^{n_k − 1} ‖x_ki − m_k‖²

where n_k is the number of points in cluster k, x_ki is the i-th point in cluster k, and m_k is the centroid or mean of cluster k. Dense clusters have small variances; therefore, the magnitude of the variance is an indication of the quality of a cluster. This is also often called the quantization error. SOM tries to reduce this quantization error as much as possible.
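The quantization error E_k above is straightforward to compute. A minimal Python/NumPy sketch (the function name `quantization_error` and the toy clusters are illustrative):

```python
import numpy as np

def quantization_error(points, centroid):
    """E_k: sum of squared Euclidean distances of a cluster's points from m_k."""
    points = np.asarray(points, dtype=float)
    return float(np.sum(np.linalg.norm(points - np.asarray(centroid), axis=1) ** 2))

# A dense cluster has a smaller quantization error than a spread-out one.
tight = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
spread = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
e_tight = quantization_error(tight, np.mean(tight, axis=0))
e_spread = quantization_error(spread, np.mean(spread, axis=0))
```

Comparing e_tight and e_spread shows why a small E_k indicates a higher-quality (denser) cluster.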