Self-Organizing Maps (SOM)


Univ.-Ass. Dr. Markus Schedl
Department of Computational Perception, Johannes Kepler University Linz
markus.schedl@jku.at
2010 Markus Schedl, partly based on material by Gerhard Widmer and Peter Knees

Overview
- SOM Basics
- Sequential Training (On-Line Learning)
- Batch SOM
- Visualizing the SOM
  - SOM Grid
  - Music Description Map (MDM)
  - Bar Plots and Chernoff's Faces
  - U-Matrix and Distance Matrix
  - Smoothed Data Histogram (SDH)
  - Component Planes
- Growing Hierarchical SOM
- Aligned SOM

Self-Organizing Map (SOM): Basics
- SOM: "neural network" [Kohonen, 1982], [Kohonen, 2001]
- SOM ~ k-means clustering + topology preservation (preservation of non-linear relationships between data items)

Basic Architecture:
- Map: 2-dimensional array of interconnected units ("neurons")
- connections define a fixed topology ("neighborhood")
- units represent cluster centers (prototypes, "model vectors", "weight vectors", "reference vectors")

Interpretation:
- clustering with topology constraints (similar data items should be placed close to each other on the map)
- mapping from data/feature/input space to a low-dimensional visualization space

Different Topologies / Grid Structures
Hexagonal grid:
+ tighter relationship between clusters
+ more connections
+ grid structure fits the Gaussian structure in the neighborhood kernel calculation (centroids of neighboring map units are equidistant)
Rectangular grid:
+ easier to implement
− diagonally neighboring map units do not perfectly fit the Gaussian neighborhood function
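The equidistance argument for the two grid structures can be checked numerically. The following is a minimal sketch, not taken from the slides: it assumes unit spacing, a hexagonal layout in which odd rows are shifted by 0.5 and rows are sqrt(3)/2 apart, and the hypothetical helper names `rect_grid`, `hex_grid` and `neighbor_distances`.

```python
import numpy as np

def rect_grid(rows, cols):
    """Map coordinates (x, y) of units on a rectangular grid with spacing 1."""
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    return np.column_stack([c.ravel(), r.ravel()]).astype(float)

def hex_grid(rows, cols):
    """Map coordinates of units on a hexagonal grid (assumed layout:
    odd rows shifted by 0.5, rows sqrt(3)/2 apart)."""
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    x = c + 0.5 * (r % 2)
    y = r * np.sqrt(3) / 2
    return np.column_stack([x.ravel(), y.ravel()])

def neighbor_distances(pos, center, max_dist):
    """Sorted map distances from unit `center` to all units within max_dist."""
    d = np.linalg.norm(pos - pos[center], axis=1)
    return np.sort(d[(d > 0) & (d <= max_dist)])

center = 2 * 5 + 2                       # middle unit of a 5x5 map
rect_nb = neighbor_distances(rect_grid(5, 5), center, 1.5)
hex_nb = neighbor_distances(hex_grid(5, 5), center, 1.5)
print("rectangular:", np.round(rect_nb, 3))
print("hexagonal:  ", np.round(hex_nb, 3))
```

On the hexagonal grid all six direct neighbors sit at the same map distance, whereas the rectangular grid mixes straight neighbors with diagonal ones that are farther away.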

An Application of the Self-Organizing Map: The neptune Interface
[Figure slide]

On-line Learning: The Online Training Algorithm

Input:
- map of units $u_i$ with model vectors $m_i$ ("codebook")
- training instances $X = \{x_i\}$
- a similarity measure sim(.,.) between data items (e.g., based on Euclidean distance)
- parameters: learning rate $\alpha(t) \in [0, 1]$ and a neighborhood kernel function with parameter $r(t)$ ("neighborhood radius"), e.g., pseudo-Gaussian
  $u_{ij}(t) = \exp\left(-d_{ij}^2 / r(t)^2\right)$, where $d_{ij}$ is the map distance between $u_i$ and $u_j$

Online SOM Training Algorithm (one possible variant):
Initialize each unit (model vector) $m_i$ to represent a randomly selected data item.
Loop over time steps t, until convergence:
1. Randomly select an example x.
2. Find the winning unit (best matching unit, BMU) $u_c$ with $c = \arg\max_i \mathrm{sim}(m_i, x)$; for Euclidean distance this is the unit whose model vector is closest to x.
3. Adapt the model vectors of all units as $m_i(t+1) = m_i(t) + \alpha(t)\, u_{ic}(t)\, [x - m_i(t)]$.
4. Update (decrease) the training parameters $\alpha(t)$, $r(t)$.

Off-line Learning: The Batch SOM Algorithm

Input:
- map of units $u_i$ with model vectors $m_i$
- training instances $X = \{x_i\}$
- a similarity measure between examples (e.g., based on Euclidean distance)
- a neighborhood kernel function with parameter $r(t)$ ("neighborhood radius"), e.g., pseudo-Gaussian
  $u_{ij}(t) = \exp\left(-d_{ij}^2 / r(t)^2\right)$, where $d_{ij}$ is the map distance between $u_i$ and $u_j$

Batch SOM Training Algorithm (one possible variant):
Initialize each unit (model vector) $m_i$ to represent a randomly selected data item.
Loop over time steps t, until convergence:
1. Determine the best matching unit $u_{c(i)}$ for each data item $x_i$ (i.e., assign each instance to its most similar model vector); this yields the Voronoi sets.
2. Update each model vector $m_i$ to better fit the data items assigned to it and the data in its neighborhood:
   $m_i(t+1) = \frac{\sum_k u_{i\,c(k)}(t)\, x_k}{\sum_k u_{i\,c(k)}(t)}$
3. Update (decrease) the neighborhood radius $r(t)$.

SOM: Illustration
[Figure slide]
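The two training variants above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the implementation used for the slides: it assumes a rectangular grid, Euclidean distance for the BMU search, a linear decay of $\alpha(t)$ and $r(t)$ with a floor of 0.5 on the radius, and a fixed number of steps in place of a convergence test. All function names and default parameter values are hypothetical.

```python
import numpy as np

def grid_positions(rows, cols):
    """Map coordinates of the units on a rectangular grid."""
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    return np.column_stack([r.ravel(), c.ravel()]).astype(float)

def train_online(X, rows, cols, steps=2000, alpha0=0.5, r0=2.0, seed=0):
    rng = np.random.default_rng(seed)
    pos = grid_positions(rows, cols)
    # Initialize each model vector with a randomly selected data item.
    m = X[rng.choice(len(X), size=rows * cols)].astype(float)
    for t in range(steps):
        frac = 1.0 - t / steps                      # assumed linear decay
        alpha, radius = alpha0 * frac, max(r0 * frac, 0.5)
        x = X[rng.integers(len(X))]                 # 1. random example
        c = np.argmin(((m - x) ** 2).sum(axis=1))   # 2. best matching unit
        # Pseudo-Gaussian kernel u_ic = exp(-d_ic^2 / r^2) on map distances.
        u = np.exp(-((pos - pos[c]) ** 2).sum(axis=1) / radius ** 2)
        m += alpha * u[:, None] * (x - m)           # 3. adapt all units
    return m, pos

def train_batch(X, rows, cols, epochs=20, r0=2.0, seed=0):
    rng = np.random.default_rng(seed)
    pos = grid_positions(rows, cols)
    m = X[rng.choice(len(X), size=rows * cols)].astype(float)
    d2_map = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=2)
    for t in range(epochs):
        radius = max(r0 * (1.0 - t / epochs), 0.5)
        u = np.exp(-d2_map / radius ** 2)           # u_ij for all unit pairs
        # 1. BMU c(k) for every data item x_k.
        d2 = ((X[:, None, :] - m[None, :, :]) ** 2).sum(axis=2)
        c = d2.argmin(axis=1)
        # 2. m_i = sum_k u_{i c(k)} x_k / sum_k u_{i c(k)}
        w = u[:, c]                                 # shape (units, items)
        m = (w @ X) / w.sum(axis=1, keepdims=True)
    return m, pos

def quantization_error(X, m):
    """Mean distance of each data item to its best matching unit."""
    d = np.linalg.norm(X[:, None, :] - m[None, :, :], axis=2)
    return d.min(axis=1).mean()

# Toy data: two well separated 2-D clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(5.0, 0.3, (100, 2))])
m_online, _ = train_online(X, 3, 3)
m_batch, _ = train_batch(X, 3, 3)
print(quantization_error(X, m_online), quantization_error(X, m_batch))
```

Both updates only form convex combinations of data items, so the model vectors stay inside the range of the data. For very large maps and small radii the kernel weights can underflow, which this sketch does not guard against.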

SOM: Illustration
- each data point (example) x uniquely belongs to a unit (the BMU of x)
- relationship between units: neighboring units cover similar data items
- non-uniform distances between model vectors, uniform distances in the visualization
- "interpolation units" (units with no data associated) are possible

Initialization of the Model Vectors
Random Initialization:
- random values in the same range as X (between min and max of each dimension), or
- randomly select data items from X and assign them to the model vectors $m_i$
+ fast
− mapping not consistent for different runs
Linear Initialization:
- perform an eigendecomposition of the autocorrelation matrix of X (PCA)
- the top 2 eigenvectors (with the largest eigenvalues) span a 2-dimensional subspace
- initialize the model vectors along these eigenvectors, which gives a predefined linear mapping to start with
+ mapping consistent for different runs (up to rotation / mirroring)
− computationally more complex

Example: WebSOM Project
Support in Browsing (Potentially Huge) Data Sets [Kaski et al., 1998]

Example: Browsing Music Collections
ViSMuC by Schedl
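The linear initialization of the model vectors described above can be sketched as follows. This is one possible reading, with assumptions labeled: the eigendecomposition is applied to the covariance matrix of the mean-centered data (the slides say autocorrelation matrix), the grid is stretched by one standard deviation along each of the two leading eigenvectors, and the function name `linear_init` is hypothetical.

```python
import numpy as np

def linear_init(X, rows, cols):
    """Place rows*cols model vectors on a regular grid in the plane spanned
    by the two leading principal axes of X (requires at least 2 features)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:2]        # indices of the top 2
    axes = eigvecs[:, order]                     # shape (dim, 2)
    scale = np.sqrt(eigvals[order])              # std. dev. along each axis
    a = np.linspace(-1.0, 1.0, rows)
    b = np.linspace(-1.0, 1.0, cols)
    A, B = np.meshgrid(a, b, indexing="ij")
    coeffs = np.column_stack([A.ravel() * scale[0], B.ravel() * scale[1]])
    return mean + coeffs @ axes.T, axes

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2])
codebook, axes = linear_init(X, 4, 5)
print(codebook.shape)
```

Because no random numbers enter the initialization, repeated runs on the same data give the same starting codebook; the sign of each eigenvector is arbitrary, which corresponds to the "up to rotation / mirroring" remark.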

Example: Browsing Music Collections (II)
PlaySOM, TU Wien

Visualizing the SOM
Visualizing attribute distributions on top of a learned SOM:
- Component Planes: visualize feature values of model vectors associated with the map units (or averaged feature values over all instances covered by a unit)
- Bar Charts or Chernoff's Faces: visualize all dimensions of model vectors for each map unit in one plot [Vesanto, 1999], [Vesanto, 2002]

Visualizing the data distribution on top of a learned SOM:
- SOM-Grid: each data item is displayed within its BMU
- Music Description Map (MDM): aggregates similar map units and adds descriptive labels [Knees et al., 2006]
- U-Matrix: visualizes distances between units (via color)
- Distance Matrix: visualizes aggregated distances of model vectors to all neighboring units [Vesanto, 1999], [Vesanto, 2002]
- Smoothed Data Histogram (SDH): visualizes (smoothed) density of data items in an area [Pampalk et al., 2002]

Visualizing Attribute Distributions: Component Planes
[Figure: learned map of animals (Horse, Zebra, Cow, Tiger, Lion, Fox, Dog, Wolf, Cat, Duck, Goose, Dove, Chicken, Owl, Hawk, Eagle) next to component planes for the attributes Small, Medium, Big, 2-Legs, 4-Legs, Hair, Hooves, Mane, Feathers, Hunt, Run, Fly, Swim]
- explain mapping (labeling)
- make correlations between attributes visible
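Component planes are simply per-feature slices of the codebook laid out on the map grid. A minimal sketch, assuming the model vectors are stored row by row (unit index = row * cols + column); the function name `component_planes` and the toy codebook are made up for illustration.

```python
import numpy as np

def component_planes(codebook, rows, cols):
    """Return an array of shape (dim, rows, cols): plane j holds the value
    of feature j in every unit's model vector."""
    dim = codebook.shape[1]
    return codebook.reshape(rows, cols, dim).transpose(2, 0, 1)

# Toy codebook for a 2x3 map with 2 features per model vector.
codebook = np.arange(12, dtype=float).reshape(6, 2)
planes = component_planes(codebook, 2, 3)
print(planes[0])   # feature 0 across the map
print(planes[1])   # feature 1 across the map

# Correlation between two attributes across the map units.
corr = np.corrcoef(planes.reshape(len(planes), -1))
print(corr[0, 1])
```

Each plane can then be drawn as a heat map; planes that look alike point to correlated attributes.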

Visualizing Attribute Distributions: Bar Plots
- each attribute value (dimension in data space) is displayed via a bar in a d-dimensional bar chart visualization for each map unit

Visualizing Attribute Distributions: Chernoff's Faces
- psychologically motivated visualization method (people can quickly grasp a face's expression)
- each attribute value (dimension in data space) is mapped to a specific property of the Chernoff face (e.g., mouth's contour, height/width of face, ear's slope, ...)

SOM Grid for data set C103a: co-occurrences of artist names
[Figure slide]

SOM Grid for data set C103a: co-occurrences of artist names (II)
[Figure slide]

SOM Grid for larger data set
2572 songs, 7 genres, features: MFCCs

SOM Grid for larger data set: Aggregate data items using metadata
- metadata available: summarize items w.r.t. properties (e.g., genre)
- loss of information: Dance?

Music Description Map (MDM) [Knees et al., 2006]
- extension of the simple SOM grid
- describes regions of the map by metadata
- aggregates "similar" neighboring map units via a region growing algorithm

MDM (II): Labeling Map Units
- determine the goodness $G^2_{t,u}$ of a term t for map unit u according to [Lagus, Kaski, 1999]:
  $G^2_{t,u} = \frac{\left(\sum_{k \in A_0^u} F_{t,k}\right)^2}{\sum_{i \in A_1^u} F_{t,i}}$
  where $k \in A_0^u$ if the Manhattan distance between units u and k is below a threshold $r_0$, and $i \in A_1^u$ if the Manhattan distance between units u and i satisfies $r_0 < d(u,i) < r_1$
- $F_{t,u} = \frac{\sum_a f_{a,u}\, tf_{t,a}}{\sum_v \sum_a f_{a,u}\, tf_{v,a}}$
  where $f_{a,u}$ is the number of tracks of artist a on unit u and $tf_{t,a}$ is the term frequency of term t for artist a
- filter all terms t with $G^2_{t,u} < 0.01$
- cut-off of 30 keywords per map unit [Lagus, Kaski, 1999]

MDM (II): Connecting Similar Map Units
1. sort all units w.r.t. the $G^2$-values of their contained terms, giving a list U
2. remove the highest ranked unit $u \in U$ and find similarly labeled units among u's neighbors: if cosine similarity between label vectors of map unit u and its neighbors i < threshold θ, aggregate u and i
3. goto 2

Visualizing Data Distributions: U-Matrix and Distance Matrix
- U-Matrix: visualizes distances between units (model vectors)
- Distance Matrix: visualizes the difference of a unit's model vector to all neighboring units' model vectors

Visualizing Data Distributions: U-Matrix and SDH
Two methods for visualizing data on top of a learned SOM:
- U-Matrix: visualizes distances between units (via color)
- Smoothed Data Histogram (SDH): visualizes (smoothed) density of examples in an area of the map
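The distance matrix described above can be computed directly from the codebook. The sketch below is an assumption-laden illustration: it uses a rectangular grid with the four straight neighbors, Euclidean distance between model vectors, and the mean as the aggregation; the function name `distance_matrix` is hypothetical. A full U-Matrix additionally shows the individual pairwise distances between neighboring units, which is not reproduced here.

```python
import numpy as np

def distance_matrix(codebook, rows, cols):
    """Mean distance of each unit's model vector to those of its
    4-neighbors on a rectangular grid (row-major unit order assumed)."""
    m = codebook.reshape(rows, cols, -1)
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    # Distances between vertically adjacent units.
    dv = np.linalg.norm(m[1:] - m[:-1], axis=2)
    total[1:] += dv
    count[1:] += 1
    total[:-1] += dv
    count[:-1] += 1
    # Distances between horizontally adjacent units.
    dh = np.linalg.norm(m[:, 1:] - m[:, :-1], axis=2)
    total[:, 1:] += dh
    count[:, 1:] += 1
    total[:, :-1] += dh
    count[:, :-1] += 1
    return total / np.maximum(count, 1)

# Toy 2x3 map with one feature: the right column is far from the rest.
codebook = np.array([[0.0], [0.0], [10.0], [0.0], [0.0], [10.0]])
dm = distance_matrix(codebook, 2, 3)
print(dm)
```

High values mark units that sit on a boundary between clusters; in the toy map the boundary runs between the middle and the right column.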

Smoothed Data Histograms (SDH) [Pampalk et al., 2002]
- display the smoothed density of data items associated with areas of the map
- reveal clusters in the data
- many pieces associated with a unit indicate a cluster center

Idea for smoothing / density estimation:
- voting matrix whose size equals the size of the SOM
- data items vote for a number N of best-matching units
- the best-matching unit gets N points, the 2nd best gets N-1 points, ..., the N-th best gets 1 point, all others get 0 points (N is a parameter, the "spread")
- the distribution of votes is visualized over the entire map, e.g., via a color map (interpolated voting matrix for smoothing)

SOM and SDH: An Example
[Figure: data space and visualization space for N=1, N=2, N=5, N=7, N=10]

Smoothed Data Histograms
Be aware of the influence of the color scale on perception!

SOM and SDH: A Sample Application: neptune
- Input: music collection (digital audio files)
- calculate audio features for each track, e.g.
  - rhythmic [Pampalk, Islands of Music: Analysis, Organization, and Visualization of Music Archives, Diploma Thesis 2001]
  - timbral [Mandel & Ellis, Song-Level Features and Support Vector Machines for Music Classification, ISMIR 2005]
- train a SOM on the audio features
- calculate an SDH on the SOM
- visualize the SDH in 3D using the smoothed voting matrix of the SDH as height values
- build a game-like user interface to explore the user's (or someone else's) music collection

Matlab implementations of SOMs and SDHs (Toolboxes):
(Google: SOM Toolbox)
(Google: SDH Toolbox)
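The SDH voting scheme described above (N points for the best matching unit down to 1 point for the N-th best) can be sketched as follows. This is a minimal sketch, assuming Euclidean distance and row-major unit order; the function name `sdh_votes` is hypothetical, and the interpolation step that produces the smooth picture is left out.

```python
import numpy as np

def sdh_votes(X, codebook, rows, cols, spread):
    """Voting matrix of a Smoothed Data Histogram with spread N = `spread`
    (spread must not exceed the number of units)."""
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    ranked = np.argsort(d2, axis=1)[:, :spread]   # N best units per data item
    points = np.arange(spread, 0, -1)             # N, N-1, ..., 1
    votes = np.zeros(rows * cols)
    np.add.at(votes, ranked, points)              # accumulate all votes
    return votes.reshape(rows, cols)

# Toy 2x2 map with one feature and two data items.
codebook = np.array([[0.0], [1.0], [2.0], [3.0]])
X = np.array([[0.1], [2.9]])
votes = sdh_votes(X, codebook, 2, 2, spread=2)
print(votes)
```

With spread 1 this reduces to a plain hit histogram of the best matching units; larger values spread each item's vote over neighboring regions of the data space.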

neptune (2), neptune (3), neptune (4), neptune (5)
[Figure slides]

Hierarchical Structuring: The Growing Hierarchical Self-Organizing Map (GHSOM) [Dittenbach et al., 2002]
[Figure: flat SOM vs. hierarchical SOM]

The GHSOM Algorithm
Start with 1 unit to expand (= mean of data), level 0.
Loop until no more units to expand:
1. For each unit to expand, create a new 2x2 SOM (initialize orientation).
2. Train the SOM on the data assigned to the parent unit.
3. Decision 1: Insert new row or column? If yes: insert new row/column and goto 2.
4. Decision 2: Hierarchically expand units of the map? If yes: add units to the expand list.

Decision 1: Insert a new row or column if the mean quantization error > threshold (i.e., the map does not represent the data well); insert the new row or column between the unit with the highest quantization error and the adjacent unit with the largest distance.
Decision 2: Expand a unit if the quantization error of the unit > threshold (i.e., the unit does not represent its associated data items well).
Parameters: same as SOM (except no. of units) + 2 thresholds $\tau_1$, $\tau_2$

The GHSOM Algorithm: Decisions 1 and 2
Voronoi set $V_i$ of unit $u_i$: all data items whose BMU is $u_i$, i.e., $V_i = \{k \mid u_{c(k)} = u_i\}$
Mean quantization error of unit $u_i$:
  $MQE_i = \frac{1}{|V_i|} \sum_{k \in V_i} \lVert x_k - m_i \rVert$
  quantifies how well unit i approximates its data items
Mean quantization error of a SOM:
  $MMQE = \sum_i \frac{|V_i|}{n}\, MQE_i$
  quantifies how well a SOM approximates the data items

The GHSOM Algorithm: Decisions 1 (enlarge map) and 2 (insert new map)
Decision 1: Insert a new row or column if $MMQE > \tau_1 \cdot MQE_0$, where $MQE_0$ is the MQE of a virtual unit $m_0$ representing the mean of all instances covered by the parent unit: $m_0 = \frac{1}{n} \sum_i x_i$
Decision 2: Expand a unit if $MQE_i > \tau_2 \cdot MQE_0^*$, where $MQE_0^*$ is the mean quantization error of the whole dataset with respect to the virtual unit located in the center of the whole dataset (in contrast to $MQE_0$, which is the mean quantization error of the data items in the respective sub-branch of the GHSOM)
Generally: $\tau_1$, $\tau_2$ are chosen such that $1 > \tau_1 \gg \tau_2 > 0$
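The quantities behind Decisions 1 and 2 can be computed as in the following sketch. It assumes Euclidean distance; all function names are hypothetical, and the growth step itself (inserting a row or column, training the child maps) is not shown.

```python
import numpy as np

def unit_mqe(X, codebook):
    """MQE_i and Voronoi set size |V_i| for every unit."""
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    c = d.argmin(axis=1)                             # BMU index c(k) per item
    n_units = len(codebook)
    sizes = np.bincount(c, minlength=n_units)
    sums = np.bincount(c, weights=d[np.arange(len(X)), c], minlength=n_units)
    mqe = np.divide(sums, sizes, out=np.zeros(n_units), where=sizes > 0)
    return mqe, sizes

def map_mmqe(mqe, sizes):
    """MMQE = sum_i |V_i| / n * MQE_i."""
    return float((sizes / sizes.sum() * mqe).sum())

def mqe_of_mean(X):
    """MQE of a virtual unit placed at the mean of X (MQE_0 or MQE_0*)."""
    return float(np.linalg.norm(X - X.mean(axis=0), axis=1).mean())

def should_grow(mmqe, tau1, mqe0):
    """Decision 1: insert a new row or column?"""
    return mmqe > tau1 * mqe0

def units_to_expand(mqe, tau2, mqe0_star):
    """Decision 2: indices of units to expand hierarchically."""
    return np.flatnonzero(mqe > tau2 * mqe0_star)

# Toy example: four 1-D data items and a map with two units.
X = np.array([[0.0], [2.0], [10.0], [14.0]])
codebook = np.array([[1.0], [12.0]])
mqe, sizes = unit_mqe(X, codebook)
mmqe = map_mmqe(mqe, sizes)
mqe0 = mqe_of_mean(X)
print(mqe, sizes, mmqe, mqe0)
print(should_grow(mmqe, 0.5, mqe0), units_to_expand(mqe, 0.3, mqe0))
```

In the toy example the map covers all data, so $MQE_0$ and $MQE_0^*$ coincide; in a deeper GHSOM level they would be computed on the sub-branch data and on the whole dataset, respectively.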

The GHSOM Algorithm: Preservation of Orientation
Problem:
- maps of descendants of a unit $u_i$ could have arbitrary orientation
- no visible relationship between different sub-branches (other than the common parent map)
Solution: enforce/encourage a specific orientation of the sub-layer SOMs via initialization
- initialize the model vectors of the 2x2 SOMs such that they correspond to the orientation of the parent map
- for example: initialize the 4 model vectors with the means of the parent vector and each of its 4 immediate neighbors
- for border units: extrapolate "virtual" units. Example: if $u_i$ is located on the left border and the unit to its right is $u_r$, create a virtual left neighbor $u_l$ with $m_l = m_i + (m_i - m_r)$
Exercise: What could the initialization function for the codebook of a new sublevel SOM look like when expressed as a weighted combination of the parent unit and its neighbors?

Hierarchical Map: GHSOM on Animals
[Figure slide]

Hierarchical Component Planes
[Figure slide]

GHSOM + SDH: deeptune
[Figure slide]

GHSOM + SDH: deeptune (II): Different Hierarchy Levels
[Figure slide]
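The orientation-preserving initialization and the border extrapolation $m_l = m_i + (m_i - m_r)$ described above can be sketched as follows. This is only one possible reading: the parent map is assumed to be stored as an array of shape (rows, cols, dim) with at least 2 rows and 2 columns, the function names are hypothetical, and the sketch deliberately leaves open how the four resulting vectors are assigned to the positions of the 2x2 child map.

```python
import numpy as np

def neighbor_or_virtual(parent_map, r, c, dr, dc):
    """Model vector of the neighbor in direction (dr, dc); on the border a
    virtual neighbor is extrapolated as m_i + (m_i - m_opposite)."""
    rows, cols, _ = parent_map.shape
    rr, cc = r + dr, c + dc
    if 0 <= rr < rows and 0 <= cc < cols:
        return parent_map[rr, cc]
    return 2.0 * parent_map[r, c] - parent_map[r - dr, c - dc]

def sublayer_init(parent_map, r, c):
    """Means of the parent vector and each of its 4 immediate neighbors."""
    m_i = parent_map[r, c]
    directions = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
    return {
        name: 0.5 * (m_i + neighbor_or_virtual(parent_map, r, c, dr, dc))
        for name, (dr, dc) in directions.items()
    }

# Toy 2x2 parent map with one feature per model vector.
parent_map = np.array([[[0.0], [2.0]], [[4.0], [6.0]]])
init = sublayer_init(parent_map, 0, 0)
for name, vec in init.items():
    print(name, vec)
```

For the corner unit in the example, the "up" and "left" vectors come from extrapolated virtual neighbors, while "down" and "right" use real neighbors of the parent map.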

Visualizing Effects of Changes in Data Definition: Aligned SOMs [Pampalk et al., 2003]
Goal: understand the relationship between different ways of representing the same data
Basic concepts:
- layers of mutually constrained SOMs (i.e., a stack of SOMs)
- each layer is trained on a slightly different data space / view of the data (i.e., different dimensions or distance definitions), but on the same data items
- trained so that all layers have the same orientation
- constraints between layers to enforce smooth transitions between views
Use:
- exploratory analysis of alternative data representations
- visualize changes in the inherent structure of the data in response to changes in features, relative feature weights, different ways of normalizing features, different similarity functions, ...
- navigation through alternative data spaces

Aligned SOMs: The Basic Architecture
[Figure: stack of SOMs; parameter values from $p_{min}$ to $p_{max}$ define different views of the data]
Distance between layers (relative to the distance between units in the same layer), e.g.:
- intra-SOM distance between neighboring units = 1
- inter-SOM distance "between" same map unit = 1/

Aligned SOM: Training (Online version, simplified)
Initialize all layers
Loop:
- Randomly select a training instance x and a layer l
- Find the best matching unit for x in l
- Adapt the neighborhood of the best matching unit (intra- and inter-layer neighborhood)
Neighborhood: within layer, between layers

Aligned SOM: On-line Learning
Input:
- map of units $u_{li}$ with model vectors $m_{li}$ ("codebook"), l = layer
- training instances $X = \{x_i\}$
- a similarity measure sim(.,.) between data items (e.g., based on Euclidean distance)
- parameters: learning rate $\alpha(t) \in [0, 1]$ and a neighborhood kernel function with parameter $r(t)$ ("neighborhood radius"), e.g., pseudo-Gaussian
  $u_{ij}(t) = \exp\left(-d_{ij}^2 / r(t)^2\right)$, where $d_{ij}$ is the map distance between $u_{li}$ and $u_{kj}$

Online SOM Training Algorithm:
Initialize each unit (model vector) $m_{li}$ to represent a randomly selected data item (features weighted according to layer-specific weights, e.g., from 1:0 to 0:1).
Loop over time steps t, until convergence:
1. Randomly select an example x and a layer l; apply the weighting according to the view/data space of l, giving $x_l$.
2. Find the winning unit (best matching unit) $u_c$ with $c = \arg\max_i \mathrm{sim}(m_{li}, x_l)$.
3. Adapt the model vectors of all units in all layers as $m_{li}(t+1) = m_{li}(t) + \alpha(t)\, u_{ic}(t)\, [x_l - m_{li}(t)]$.
4. Update (decrease) the training parameters $\alpha(t)$, $r(t)$.
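One update step of the Aligned SOM training above might look like the sketch below. Several points are assumptions rather than statements from the slides: the distance between a unit in another layer and the BMU is taken as the in-layer map distance plus `layer_dist` times the layer offset, each layer is adapted toward the data item as seen under that layer's own feature weights, and the name `aligned_som_step` together with all parameter values is hypothetical.

```python
import numpy as np

def aligned_som_step(M, pos, weights, x, layer, alpha, radius, layer_dist):
    """One online update of a stack of SOMs (modifies M in place).

    M:       float array (layers, units, dim) with all model vectors
    pos:     (units, 2) map coordinates, identical for every layer
    weights: (layers, dim) feature weights that define each layer's view
    """
    x_l = x * weights[layer]                         # x under the view of layer l
    c = int(np.argmin(((M[layer] - x_l) ** 2).sum(axis=1)))   # BMU in layer l
    d_map = np.linalg.norm(pos - pos[c], axis=1)     # distance within a layer
    d_layer = layer_dist * np.abs(np.arange(len(M)) - layer)  # between layers
    d = d_map[None, :] + d_layer[:, None]            # assumed: additive combination
    u = np.exp(-d ** 2 / radius ** 2)                # pseudo-Gaussian kernel
    targets = x[None, :] * weights                   # x under every layer's view
    M += alpha * u[:, :, None] * (targets[:, None, :] - M)
    return c

# Toy stack: 2 layers, 2 units per layer, 2 features.
M = np.zeros((2, 2, 2))
pos = np.array([[0.0, 0.0], [0.0, 1.0]])
weights = np.array([[1.0, 0.0], [0.0, 1.0]])         # feature weights 1:0 and 0:1
c = aligned_som_step(M, pos, weights, np.array([2.0, 4.0]), layer=0,
                     alpha=0.5, radius=1.0, layer_dist=0.5)
print(c)
print(M)
```

The same unit in the neighboring layer is pulled slightly less strongly than the BMU itself, which is what keeps the layers aligned while still letting each one follow its own view of the data.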

Aligned SOM on Animals
[Figure slide]

Aligned SOM Demos
[Figure slide]

Literature
SOM:
[Kohonen, 1982]: Kohonen, T. Self-Organizing Formation of Topologically Correct Feature Maps. Biological Cybernetics, 43.
[Kohonen, 2001]: Kohonen, T. Self-Organizing Maps, volume 30 of Springer Series in Information Sciences. Springer, Berlin, Germany, 3rd edition.
[Vesanto, 1999]: Vesanto, J. SOM-Based Data Visualization Methods. Intelligent Data Analysis 3(2).
[Vesanto, 2002]: Vesanto, J. Data Exploration Process Based on the Self-Organizing Map. PhD thesis, Helsinki University of Technology, Espoo, Finland.
[Pampalk et al., 2002]: Pampalk, E., Rauber, A., and Merkl, D. Using Smoothed Data Histograms for Cluster Visualization in Self-Organizing Maps. In Proceedings of the International Conference on Artificial Neural Networks (ICANN 2002), Madrid, Spain. Springer.
[Knees et al., 2006]: Knees, P., Pohle, T., Schedl, M., and Widmer, G. Automatically Describing Music on a Map. In Proceedings of the 2nd Workshop on Learning the Semantics of Audio Signals (LSAS 2008), Paris, France, June.
[Kaski et al., 1998]: WEBSOM: Self-Organizing Maps of Document Collections. Neurocomputing 21.

Literature (II)
GHSOM:
[Dittenbach et al., 2002]: Dittenbach, M., Rauber, A., and Merkl, D. Uncovering Hierarchical Structure in Data Using the Growing Hierarchical Self-Organizing Map. Neurocomputing, 48(1-4).
Aligned SOM:
[Pampalk et al., 2003]: Pampalk, E., Goebl, W., and Widmer, G. Visualizing Changes in the Structure of Data for Exploratory Feature Selection. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003).
[Lagus, Kaski, 1999]: Keyword Selection Method for Characterising Text Document Maps. In Proceedings of the International Conference on Artificial Neural Networks (ICANN 1999), London, UK.


More information

Nonlinear dimensionality reduction of large datasets for data exploration

Nonlinear dimensionality reduction of large datasets for data exploration Data Mining VII: Data, Text and Web Mining and their Business Applications 3 Nonlinear dimensionality reduction of large datasets for data exploration V. Tomenko & V. Popov Wessex Institute of Technology,

More information
