Unsupervised Learning and Clustering



Why consider unlabeled samples?
1. Collecting and labeling a large set of samples is costly. Getting recorded speech is free; labeling it is time consuming.
2. A classifier could be designed on a small set of labeled samples and tuned on a large unlabeled set.
3. Train on a large unlabeled set and use supervision on the groupings found.
4. Characteristics of patterns may change with time.
5. Unsupervised methods can be used to find useful features.
6. Exploratory data analysis may discover the presence of significant subclasses affecting design.

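Mixture Densities and Identifiability
Samples come from $c$ classes.
The priors $P(\omega_j)$ are known.
The forms of the class-conditional densities $p(x \mid \omega_j, \theta_j)$ are known.
The values of their parameters $\theta_1, \dots, \theta_c$ are unknown.
The probability density function of the samples is the mixture density
$$p(x \mid \theta) = \sum_{j=1}^{c} p(x \mid \omega_j, \theta_j)\, P(\omega_j)$$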
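The mixture density can be evaluated directly from its definition. The sketch below assumes univariate Gaussian components, which the slide does not specify; the names `gaussian_pdf` and `mixture_density` and the numeric values are illustrative only.

```python
import math

def gaussian_pdf(x, mean, std):
    """Univariate normal density N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_density(x, priors, means, stds):
    """p(x | theta) = sum_j p(x | omega_j, theta_j) P(omega_j)."""
    return sum(P * gaussian_pdf(x, m, s)
               for P, m, s in zip(priors, means, stds))

# Hypothetical two-component mixture (values chosen only for illustration).
priors = [0.3, 0.7]
means = [-2.0, 1.0]
stds = [1.0, 1.0]
print(mixture_density(0.0, priors, means, stds))
```

With a single component and prior 1, the mixture reduces to the component density itself.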

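Gradient Ascent for Mixtures
Mixture density:
$$p(x \mid \theta) = \sum_{j=1}^{c} p(x \mid \omega_j, \theta_j)\, P(\omega_j)$$
Likelihood of the observed samples $D = \{x_1, \dots, x_n\}$:
$$p(D \mid \theta) = \prod_{k=1}^{n} p(x_k \mid \theta)$$
Log-likelihood:
$$l = \sum_{k=1}^{n} \ln p(x_k \mid \theta)$$
Gradient w.r.t. $\theta_i$ (assuming $\theta_i$ and $\theta_j$ are functionally independent for $i \neq j$):
$$\nabla_{\theta_i} l = \sum_{k=1}^{n} \frac{1}{p(x_k \mid \theta)}\, \nabla_{\theta_i} \left[ \sum_{j=1}^{c} p(x_k \mid \omega_j, \theta_j)\, P(\omega_j) \right] = \sum_{k=1}^{n} P(\omega_i \mid x_k, \theta)\, \nabla_{\theta_i} \ln p(x_k \mid \omega_i, \theta_i)$$
The MLE $\hat{\theta}$ must satisfy:
$$\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat{\theta})\, \nabla_{\theta_i} \ln p(x_k \mid \omega_i, \hat{\theta}_i) = 0, \quad i = 1, \dots, c$$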
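The gradient expression can be checked numerically. The sketch below assumes univariate Gaussian components with unit variance and unknown means, so that the derivative of $\ln p(x_k \mid \omega_i, \mu_i)$ with respect to $\mu_i$ is $(x_k - \mu_i)$; the function names and sample values are hypothetical.

```python
import math

def gaussian_pdf(x, mean):
    """Univariate normal density with unit variance (an assumption)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def log_likelihood(samples, means, priors):
    """l = sum_k ln p(x_k | theta) for the mixture density."""
    return sum(
        math.log(sum(P * gaussian_pdf(x, m) for P, m in zip(priors, means)))
        for x in samples
    )

def log_likelihood_gradient(samples, means, priors):
    """dl/dmu_i = sum_k P(omega_i | x_k, mu) * (x_k - mu_i)."""
    grad = [0.0] * len(means)
    for x in samples:
        joint = [P * gaussian_pdf(x, m) for P, m in zip(priors, means)]
        evidence = sum(joint)  # p(x_k | theta)
        for i, m in enumerate(means):
            posterior = joint[i] / evidence  # P(omega_i | x_k, theta)
            grad[i] += posterior * (x - m)
    return grad

# Illustrative data only.
samples = [-1.0, 0.5, 2.0]
print(log_likelihood_gradient(samples, [-0.5, 1.5], [0.4, 0.6]))
```

Setting this gradient to zero gives the condition the maximum-likelihood estimate must satisfy.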

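Gaussian Mixture
With unknown mean vectors, $\theta_i = \mu_i$, the MLE condition yields
$$\hat{\mu}_i = \frac{\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat{\mu})\, x_k}{\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat{\mu})}, \quad \text{where } \hat{\mu} = (\hat{\mu}_1, \dots, \hat{\mu}_c)^t$$
Leading to an iterative scheme for improving the estimates:
$$\hat{\mu}_i(j+1) = \frac{\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat{\mu}(j))\, x_k}{\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat{\mu}(j))}$$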
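A minimal sketch of this iterative scheme follows, assuming one-dimensional samples, known priors, and unit variance for every component (none of which the slide fixes). The names `update_means` and `estimate_means` and the data are illustrative.

```python
import math

def gaussian_pdf(x, mean):
    """Univariate normal density with unit variance (an assumption)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def update_means(samples, means, priors):
    """One step: each new mean is the posterior-weighted average of the samples."""
    c = len(means)
    weighted_sum = [0.0] * c
    weight_total = [0.0] * c
    for x in samples:
        joint = [priors[i] * gaussian_pdf(x, means[i]) for i in range(c)]
        evidence = sum(joint)
        for i in range(c):
            posterior = joint[i] / evidence  # P(omega_i | x_k, mu_hat(j))
            weighted_sum[i] += posterior * x
            weight_total[i] += posterior
    return [weighted_sum[i] / weight_total[i] for i in range(c)]

def estimate_means(samples, means, priors, iterations=50):
    """Repeat the update a fixed number of times (a simple stopping rule)."""
    for _ in range(iterations):
        means = update_means(samples, means, priors)
    return means

# Illustrative data: two well-separated groups around -2 and +2.
samples = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
print(estimate_means(samples, [-1.0, 1.0], [0.5, 0.5]))
```

The result depends on the starting means; a different initialization can converge to a different stationary point.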

k-means clustering
The Gaussian case with all parameters unknown leads to the formulation:
begin initialize n, c, µ_1, µ_2, ..., µ_c
    do classify n samples according to nearest µ_i
       recompute µ_i
    until no change in µ_i
end
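The loop above can be sketched in plain Python. This version assumes squared Euclidean distance for "nearest", takes the initial means as an argument, and leaves the mean of an empty cluster unchanged; these choices, and the name `k_means`, are assumptions rather than part of the slide.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def k_means(samples, means, max_iterations=100):
    """Alternate classification and mean recomputation until the means stop changing."""
    means = [tuple(m) for m in means]
    clusters = [[] for _ in means]
    for _ in range(max_iterations):
        # Classify each sample according to its nearest mean.
        clusters = [[] for _ in means]
        for x in samples:
            nearest = min(range(len(means)),
                          key=lambda i: squared_distance(x, means[i]))
            clusters[nearest].append(x)
        # Recompute each mean; an empty cluster keeps its old mean (assumption).
        new_means = []
        for old, members in zip(means, clusters):
            if members:
                dim = len(old)
                new_means.append(tuple(
                    sum(x[j] for x in members) / len(members) for j in range(dim)))
            else:
                new_means.append(old)
        if new_means == means:  # no change in the means: stop
            break
        means = new_means
    return means, clusters

# Illustrative data only.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
means, clusters = k_means(samples, [(0.0, 0.0), (10.0, 10.0)])
print(means)
```

The `max_iterations` cap is only a safeguard; on a finite sample set the loop normally stops on its own once the assignments stabilize.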

k-means clustering with one feature
One-dimensional example.
Six starting points lead to local maxima, whereas two starting points, for both of which $\mu_1(0) = \mu_2(0)$, lead to a saddle point.

k-means clustering with two features
Two-dimensional example.
There are three means and three steps in the iteration.
Voronoi tessellations based on the means are shown.

Data Description and Clustering

Data Description
Learning the structure of multidimensional patterns from a set of unlabeled samples.
The samples form clouds of points in d-dimensional space.
If the data were from a single normal distribution, the mean and covariance matrix would suffice as a description.

Data sets can have identical statistics up to second order, i.e., the same mean $\mu$ and covariance $\Sigma$.

Mixture of c normal distributions approach
Estimating the parameters is non-trivial.
Assuming particular parametric forms can lead to poor or meaningless results.
Alternatively, use a nonparametric approach: peaks or modes can indicate clusters.
If the goal is to find subclasses, use clustering procedures.

Similarity Measures
Two issues:
1. How to measure similarity between samples?
2. How to evaluate a partitioning?
If distance is a good measure of dissimilarity, then the distance between samples in the same cluster must be smaller than the distance between samples in different clusters.
Two samples belong to the same cluster if the distance between them is less than a threshold $d_0$.
The distance threshold affects the number and size of clusters.
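The effect of the threshold $d_0$ can be illustrated with a small sketch. It assumes Euclidean distance and treats "same cluster" transitively, so a cluster is a connected group of samples linked by distances below $d_0$; both choices and the name `threshold_clusters` are assumptions made for illustration.

```python
import math

def threshold_clusters(samples, d0):
    """Group samples so that any pair closer than d0 ends up in the same cluster."""
    n = len(samples)
    parent = list(range(n))  # union-find forest over sample indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(samples[i], samples[j]) < d0:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(samples[i])
    return list(groups.values())

# Illustrative data: the number of clusters shrinks as d0 grows.
samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
for d0 in (0.5, 1.5, 4.5):
    print(d0, len(threshold_clusters(samples, d0)))
```

A very small threshold leaves every sample in its own cluster, while a very large one merges everything into a single cluster.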

Similarity Measures for Clustering
Minkowski metric:
$$d(x, x') = \left( \sum_{k=1}^{d} |x_k - x'_k|^q \right)^{1/q}$$
$q = 2$ is Euclidean; $q = 1$ is the Manhattan or city block metric.
Metric based on the data itself: Mahalanobis distance.
Angle between vectors as similarity:
$$s(x, x') = \frac{x^t x'}{\|x\|\, \|x'\|}$$
The cosine of the angle between vectors is invariant to rotation and dilation, but not to translation and general linear transformations.
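Both measures are short to write out. The sketch below uses plain Python tuples as vectors; the function names and example values are illustrative.

```python
import math

def minkowski(x, y, q):
    """Minkowski metric: (sum_k |x_k - y_k|^q)^(1/q)."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

def cosine_similarity(x, y):
    """Cosine of the angle between x and y: x^t y / (||x|| ||y||)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

x, y = (1.0, 0.0), (0.0, 1.0)
print(minkowski((0.0, 0.0), (3.0, 4.0), 2))   # Euclidean distance
print(minkowski((0.0, 0.0), (3.0, 4.0), 1))   # city block distance
print(cosine_similarity(x, y))                 # orthogonal vectors
# Dilation leaves the cosine unchanged; translating both vectors does not.
print(cosine_similarity((3.0, 0.0), y))
print(cosine_similarity((2.0, 1.0), (1.0, 2.0)))
```

The last two lines illustrate the invariance statement: scaling `x` by 3 keeps the cosine the same, while shifting both vectors by (1, 1) changes it.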

Binary Feature Similarity Measures
Normalized inner product:
$$s(x, x') = \frac{x^t x'}{\|x\|\, \|x'\|}$$
The numerator is the number of attributes possessed by both $x$ and $x'$.
The denominator $(x^t x \; x'^t x')^{1/2}$ is the geometric mean of the number of attributes possessed by $x$ and by $x'$.
Fraction of attributes shared:
$$s(x, x') = \frac{x^t x'}{d}$$
Tanimoto coefficient, the ratio of the number of shared attributes to the number possessed by $x$ or $x'$:
$$s(x, x') = \frac{x^t x'}{x^t x + x'^t x' - x^t x'}$$
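For 0/1 feature vectors the three measures reduce to simple counts. A minimal sketch, with illustrative function names and example vectors:

```python
import math

def inner(x, y):
    """x^t y; for binary vectors this counts the shared attributes."""
    return sum(a * b for a, b in zip(x, y))

def normalized_inner_product(x, y):
    """Shared attributes over the geometric mean of the attribute counts."""
    return inner(x, y) / math.sqrt(inner(x, x) * inner(y, y))

def fraction_shared(x, y):
    """Shared attributes over the total number of attributes d."""
    return inner(x, y) / len(x)

def tanimoto(x, y):
    """Shared attributes over attributes possessed by x or y."""
    shared = inner(x, y)
    return shared / (inner(x, x) + inner(y, y) - shared)

# Illustrative binary vectors: 2 shared attributes, 3 attributes each, d = 5.
x = [1, 1, 0, 1, 0]
y = [1, 0, 0, 1, 1]
print(normalized_inner_product(x, y), fraction_shared(x, y), tanimoto(x, y))
```

In this example the Tanimoto coefficient is 2 / (3 + 3 - 2) = 0.5 and the fraction of shared attributes is 2 / 5 = 0.4.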

Issues in Choice of Similarity Function
The Tanimoto coefficient is used in Information Retrieval and Taxonomy.
There are fundamental issues in Measurement Theory.
Combining features is tricky: inches versus meters.
Nominal, ordinal, interval and ratio scales.

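Criterion Functions for Clustering
Sum of squared errors criterion.
Mean of the $n_i$ samples in $D_i$:
$$m_i = \frac{1}{n_i} \sum_{x \in D_i} x$$
$$J_e = \sum_{i=1}^{c} \sum_{x \in D_i} \|x - m_i\|^2$$
The criterion is not best when two clusters are of unequal size.
It is suitable when the clusters are compact clouds.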
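The criterion is straightforward to compute for a given partition. A minimal sketch, representing each cluster $D_i$ as a list of tuples; the function names and data are illustrative.

```python
def cluster_mean(cluster):
    """m_i = (1 / n_i) * sum of the samples in D_i."""
    n = len(cluster)
    dim = len(cluster[0])
    return tuple(sum(x[j] for x in cluster) / n for j in range(dim))

def sum_of_squared_error(clusters):
    """J_e = sum_i sum_{x in D_i} ||x - m_i||^2."""
    total = 0.0
    for cluster in clusters:
        m = cluster_mean(cluster)
        for x in cluster:
            total += sum((xj - mj) ** 2 for xj, mj in zip(x, m))
    return total

# Illustrative partition: the first cluster has mean (1, 0) and contributes
# 1 + 1; the single-sample cluster contributes 0.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0)]]
print(sum_of_squared_error(clusters))
```

Comparing this value across candidate partitions of the same data is one way to rank them under this criterion.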

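Related Minimum Variance Criteria
$$J_e = \frac{1}{2} \sum_{i=1}^{c} n_i \bar{s}_i, \quad \text{where } \bar{s}_i = \frac{1}{n_i^2} \sum_{x \in D_i} \sum_{x' \in D_i} \|x - x'\|^2$$
The squared distance can be replaced by another similarity function $s(x, x')$.
The optimal partition extremizes the criterion function.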
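The pairwise form can be checked against the sum-of-squared-error definition on a small partition. The sketch below computes both; the function names and data are illustrative.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def sum_of_squared_error(clusters):
    """J_e computed from the cluster means."""
    total = 0.0
    for cluster in clusters:
        n = len(cluster)
        mean = tuple(sum(x[j] for x in cluster) / n
                     for j in range(len(cluster[0])))
        total += sum(squared_distance(x, mean) for x in cluster)
    return total

def minimum_variance_form(clusters):
    """J_e computed as (1/2) * sum_i n_i * s_bar_i, with s_bar_i the
    average squared distance over all ordered pairs in D_i."""
    total = 0.0
    for cluster in clusters:
        n = len(cluster)
        s_bar = sum(squared_distance(x, y)
                    for x in cluster for y in cluster) / (n * n)
        total += 0.5 * n * s_bar
    return total

# Illustrative partition.
clusters = [[(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)], [(5.0, 5.0), (7.0, 5.0)]]
print(sum_of_squared_error(clusters), minimum_variance_form(clusters))
```

The two printed values agree up to floating-point rounding, which is what the identity states.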

Scatter Criteria
Derived from scatter matrices:
Trace criterion
Determinant criterion
Invariant criteria

Hierarchical Clustering

Dendrogram

Agglomerative Algorithm

Nearest Neighbor Algorithm

Farthest Neighbor Algorithm

How to determine nearest clusters