Self-Organizing Maps (SOM)
Overview
- Basics
- Sequential Training (On-Line Learning)
- Batch SOM
- Visualizing the SOM
  - SOM Grid
  - Music Description Map (MDM)
  - Bar Plots and Chernoff's Faces
  - U-Matrix and Distance Matrix
  - Smoothed Data Histogram (SDH)
  - Component Planes
- Growing Hierarchical SOM
- Aligned SOM

Univ.-Ass. Dr. Markus Schedl
Department of Computational Perception, Johannes Kepler University Linz
markus.schedl@jku.at

2010 Markus Schedl, partly based on material by Gerhard Widmer and Peter Knees

Self-Organizing Map (SOM): Basics
- SOM: "neural network" [Kohonen, 1982], [Kohonen, 2001]
- SOM ~ k-means clustering + topology preservation (preservation of non-linear relationships between data items)

Basic Architecture:
- Map: 2-dimensional array of interconnected units ("neurons")
- connections define a fixed topology ("neighborhood")
- units represent cluster centers (prototypes, "model vectors", "weight vectors", "reference vectors")

Interpretation:
- clustering with topology constraints (similar data items should be placed close to each other on the map)
- mapping from data/feature/input space to a low-dimensional visualization space

Different Topologies / Grid Structures
Grid in which the centroids of neighboring map units are equidistant (i.e., a hexagonal grid):
(+) tighter relationship between clusters
(+) more connections
(+) grid structure fits the Gaussian structure of the neighborhood kernel calculation
Grid with diagonal neighbors (i.e., a rectangular grid):
(+) easier to implement
(-) diagonally neighboring map units do not perfectly fit the Gaussian neighborhood function
An Application of the Self-Organizing Map: The neptune Interface

On-line Learning: The Online Training Algorithm
Input:
- map of units $u_i$ with model vectors $m_i$ ("codebook")
- training instances $X = \{x_i\}$
- a similarity measure sim(.,.) between data items (e.g., Euclidean distance)
- parameters: learning rate $\alpha(t) \in [0, 1]$ and a neighborhood kernel function with parameter $r(t)$ ("neighborhood radius"), e.g., pseudo-Gaussian $u_{ij}(t) = \exp\left(-d_{ij}^2 / r(t)^2\right)$, where $d_{ij}$ is the map distance between $u_i$ and $u_j$

Online SOM Training Algorithm (one possible variant):
Initialize each unit (model vector) $m_i$ to represent a randomly selected data item.
Loop over time steps t, until convergence:
1. Randomly select an example $x$
2. Find the winning unit (best matching unit, BMU) $u_c$ with $c = \arg\max_i \, \mathrm{sim}(m_i, x)$ (for a distance measure such as the Euclidean distance: the unit with the smallest distance)
3. Adapt the model vectors of all units as $m_i(t+1) = m_i(t) + \alpha(t)\, u_{ic}(t)\, [x - m_i(t)]$
4. Update (decrease) the training parameters $\alpha(t)$, $r(t)$

Off-line Learning: The Batch SOM Algorithm
Input:
- map of units $u_i$ with model vectors $m_i$
- training instances $X = \{x_i\}$
- a similarity measure between examples (e.g., Euclidean distance)
- a neighborhood kernel function with parameter $r(t)$ ("neighborhood radius"), e.g., pseudo-Gaussian $u_{ij}(t) = \exp\left(-d_{ij}^2 / r(t)^2\right)$, where $d_{ij}$ is the map distance between $u_i$ and $u_j$

Batch SOM Training Algorithm (one possible variant):
Initialize each unit (model vector) $m_i$ to represent a randomly selected data item.
Loop over time steps t, until convergence:
1. Determine the best matching unit $u_{c(i)}$ for each data item $x_i$ (i.e., assign each instance to its most similar model vector); the items assigned to a unit form its Voronoi set
2. Update each model vector $m_i$ to better fit the data items assigned to it and the data in its neighborhood:
   $$m_i(t+1) = \frac{\sum_k u_{i\,c(k)}(t)\, x_k}{\sum_k u_{i\,c(k)}(t)}$$
3. Update (decrease) the neighborhood radius $r(t)$

SOM: Illustration
[Figure]
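The two training variants above can be sketched in plain Python as follows. This is only an illustrative sketch, not the implementation behind the slides: the function names, the rectangular row-major grid, the linear decay schedules for $\alpha(t)$ and $r(t)$, and the fixed number of steps (instead of a convergence test) are all assumptions made for the example.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def best_matching_unit(codebook, x):
    """Index c of the model vector closest to x (highest similarity)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))

def kernel(coords, i, c, radius):
    """Pseudo-Gaussian neighborhood kernel u_ic = exp(-d_ic^2 / r^2) on the map."""
    return math.exp(-sq_dist(coords[i], coords[c]) / radius ** 2)

def train_online(data, rows, cols, steps=2000, alpha0=0.5, r0=2.0, seed=0):
    """Online SOM training on a rectangular rows x cols grid (assumed layout)."""
    rng = random.Random(seed)
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialize each model vector with a randomly selected data item.
    codebook = [list(rng.choice(data)) for _ in coords]
    for t in range(steps):
        frac = 1.0 - t / steps
        alpha = alpha0 * frac            # assumed: linearly decreasing learning rate
        radius = max(r0 * frac, 0.1)     # assumed: linearly decreasing radius, kept > 0
        x = rng.choice(data)                      # step 1: random example
        c = best_matching_unit(codebook, x)       # step 2: winning unit
        for i, m in enumerate(codebook):          # step 3: adapt all units
            h = alpha * kernel(coords, i, c, radius)
            codebook[i] = [mj + h * (xj - mj) for mj, xj in zip(m, x)]
    return codebook, coords

def batch_step(data, codebook, coords, radius):
    """One batch update: m_i = sum_k u_{i,c(k)} x_k / sum_k u_{i,c(k)}."""
    bmus = [best_matching_unit(codebook, x) for x in data]
    dim = len(data[0])
    updated = []
    for i, old in enumerate(codebook):
        weights = [kernel(coords, i, c, radius) for c in bmus]
        total = sum(weights)
        if total == 0.0:                 # kernel underflow: keep the old vector
            updated.append(list(old))
            continue
        updated.append([sum(w * x[j] for w, x in zip(weights, data)) / total
                        for j in range(dim)])
    return updated
```

A batch run would call `batch_step` repeatedly while shrinking `radius`; because no learning rate is involved, the result does not depend on the order in which data items are presented.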
SOM: Illustration
- each data point (example) x uniquely belongs to a unit (the BMU of x)
- relationship between units: neighboring units cover similar data items
- non-uniform distances between model vectors, uniform distances in the visualization
- "interpolation units" (units with no data associated) are possible

Initialization of the Model Vectors
Random Initialization:
- random values in the same range as X (between min and max of each dimension), or
- randomly select data items from X and assign them to the model vectors $m_i$
(+) fast
(-) mapping not consistent for different runs

Linear Initialization:
- perform an Eigendecomposition of the autocorrelation matrix of X (PCA)
- the top 2 Eigenvectors (with largest Eigenvalues) span a 2-dimensional subspace
- initialize the model vectors along these Eigenvectors (a predefined linear mapping to start with)
(+) mapping consistent for different runs (up to rotation / mirroring)
(-) computationally more complex

Example: WebSOM Project
Support in Browsing (Potentially Huge) Data Sets [Kaski et al., 1998]

Example: Browsing Music Collections
ViSMuC by Schedl
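As a rough illustration of linear initialization, the sketch below finds the two leading principal directions with power iteration and deflation and spreads the model vectors over the plane they span. Several details are assumptions made for the example, not taken from the slides: the covariance matrix of the centered data is used, the grid is scaled to one standard deviation along each direction, and all function names are hypothetical.

```python
import math
import random

def mean_vector(data):
    n = len(data)
    return [sum(x[j] for x in data) / n for j in range(len(data[0]))]

def covariance(data, mean):
    """Covariance matrix of the data (centered on the mean)."""
    n, d = len(data), len(mean)
    centered = [[x[j] - mean[j] for j in range(d)] for x in data]
    return [[sum(row[a] * row[b] for row in centered) / n for b in range(d)]
            for a in range(d)]

def mat_vec(matrix, v):
    return [sum(m_ab * vb for m_ab, vb in zip(row, v)) for row in matrix]

def top_eigenpair(matrix, rng, iters=500):
    """Largest eigenvalue and eigenvector of a symmetric matrix via power iteration."""
    v = [rng.uniform(0.5, 1.0) for _ in matrix]
    norm = math.sqrt(sum(vi * vi for vi in v))
    v = [vi / norm for vi in v]
    for _ in range(iters):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm == 0.0:
            break
        v = [wi / norm for wi in w]
    lam = sum(vi * wi for vi, wi in zip(v, mat_vec(matrix, v)))
    return lam, v

def linear_init(data, rows, cols, seed=0):
    """Place rows x cols model vectors on the plane of the top 2 principal directions."""
    rng = random.Random(seed)
    mean = mean_vector(data)
    cov = covariance(data, mean)
    d = len(mean)
    lam1, v1 = top_eigenpair(cov, rng)
    # Deflation: remove the first component before searching for the second.
    deflated = [[cov[a][b] - lam1 * v1[a] * v1[b] for b in range(d)] for a in range(d)]
    lam2, v2 = top_eigenpair(deflated, rng)

    def spread(n):
        """n evenly spaced offsets in [-1, 1]."""
        return [0.0] if n == 1 else [-1.0 + 2.0 * k / (n - 1) for k in range(n)]

    s1, s2 = math.sqrt(max(lam1, 0.0)), math.sqrt(max(lam2, 0.0))
    codebook = [[mean[j] + a * s1 * v1[j] + b * s2 * v2[j] for j in range(d)]
                for a in spread(rows) for b in spread(cols)]
    return codebook, (v1, v2)
```

Apart from the sign of each eigenvector, the result is the same on every run for a given data set, which is the consistency property noted above.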
Example: Browsing Music Collections (II)
PlaySOM, TU Wien

Visualizing the SOM
Visualizing attribute distributions on top of a learned SOM:
- Component Planes: visualize feature values of the model vectors associated with the map units (or averaged feature values over all instances covered by a unit)
- Bar Charts or Chernoff's Faces: visualize all dimensions of the model vectors for each map unit in one plot
[Vesanto, 1999], [Vesanto, 2002]

Visualizing the data distribution on top of a learned SOM:
- SOM-Grid: each data item is displayed within its BMU
- Music Description Map (MDM): aggregates similar map units and adds descriptive labels [Knees et al., 2006]
- U-Matrix: visualizes distances between units (via color)
- Distance Matrix: visualizes aggregated distances of model vectors to all neighboring units [Vesanto, 1999], [Vesanto, 2002]
- Smoothed Data Histogram (SDH): visualizes (smoothed) density of data items in an area [Pampalk et al., 2002]

Visualizing Attribute Distributions: Component Planes
[Figure: learned map and component planes for an animals data set. Animals: Horse, Zebra, Cow, Tiger, Lion, Fox, Dog, Wolf, Cat, Duck, Goose, Dove, Chicken, Owl, Hawk, Eagle. Attributes: Small, Medium, Big, 2-Legs, 4-Legs, Hair, Hooves, Mane, Feathers, Hunt, Run, Fly, Swim.]
- explain mapping (labeling)
- make correlations between attributes visible
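A component plane is simply one feature of the codebook laid out on the map grid. A minimal sketch, assuming the codebook is stored as a flat list of model vectors in row-major order (the function name and layout are assumptions for illustration):

```python
def component_plane(codebook, rows, cols, feature):
    """Return a rows x cols grid holding one feature value per map unit.

    Assumes codebook[r * cols + c] is the model vector of the unit at (r, c).
    """
    return [[codebook[r * cols + c][feature] for c in range(cols)]
            for r in range(rows)]

# Example: a 2 x 2 map with two features per model vector.
example_codebook = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
plane = component_plane(example_codebook, 2, 2, feature=1)
```

Rendering one such grid per feature with the same color map gives a set of planes that can be compared side by side to spot correlated attributes.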
Visualizing Attribute Distributions: Bar Plots
- each attribute value (dimension in data space) is displayed via a bar in a d-dimensional bar chart
- one such visualization is drawn for each map unit

Visualizing Attribute Distributions: Chernoff's Faces
- psychologically motivated visualization method (people can quickly grasp a face's expression)
- each attribute value (dimension in data space) is mapped to a specific property of the Chernoff face (e.g., mouth's contour, height/width of face, ear's slope, ...)

SOM Grid for data set C103a: co-occurrences of artist names
[Figure]
SOM Grid for data set C103a: co-occurrences of artist names (II)
[Figure]

SOM Grid for larger data set
2572 songs, 7 genres, features: MFCCs

SOM Grid for larger data set: Aggregate data items using metadata
- metadata available: summarize items w.r.t. properties (e.g., genre)
- loss of information: Dance?

Music Description Map (MDM) [Knees et al., 2006]
- extension of the simple SOM grid
- describes regions of the map by metadata
- aggregates "similar" neighboring map units via a region growing algorithm
MDM (II): Labeling Map Units
Determining the goodness $G^2_{t,u}$ of a term t for map unit u according to [Lagus, Kaski, 1999]:
- $k \in A_0^u$ if the Manhattan distance between units u and k is below the threshold $r_0$
- $i \in A_1^u$ if the Manhattan distance between units u and i satisfies $r_0 < d(u,i) < r_1$

$$G^2_{t,u} = \frac{\left(\sum_{k \in A_0^u} F_{t,k}\right)^2}{\sum_{i \notin A_1^u} F_{t,i}}, \qquad F_{t,u} = \frac{\sum_a f_{a,u}\, tf_{t,a}}{\sum_v \sum_a f_{a,u}\, tf_{v,a}}$$

where $f_{a,u}$ is the number of tracks of artist a on unit u and $tf_{t,a}$ is the term frequency of term t for artist a.
- filter all terms t with $G^2_{t,u} < 0.01$
- cut-off of 30 keywords per map unit [Lagus, Kaski, 1999]

MDM (II): Connecting Similar Map Units
1. sort all units w.r.t. the $G^2$-values of the contained terms, giving a ranked list U
2. remove the highest ranked unit u from U and find similarly labeled units among u's neighbors: if cosine similarity between label vectors of map unit u and its neighbors i < threshold θ, aggregate u and i
3. goto 2

Visualizing Data Distributions: U-Matrix and Distance Matrix
- U-Matrix: visualizes distances between units (model vectors)
- Distance Matrix: visualizes the difference of a unit's model vector to all neighboring units' model vectors

Visualizing Data Distributions: U-Matrix and SDH
Two methods for visualizing data on top of a learned SOM:
- U-Matrix: visualizes distances between units (via color)
- Smoothed Data Histogram (SDH): visualizes (smoothed) density of examples in an area of the map
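The distance matrix described above can be sketched as follows. The example assumes a rectangular grid stored in row-major order, 4-connected neighbors, and the mean as aggregation function; these choices and the function name are illustrative assumptions, since the slides do not fix them.

```python
import math

def distance_matrix(codebook, rows, cols):
    """Mean Euclidean distance of each unit's model vector to its grid neighbors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            here = codebook[r * cols + c]
            # 4-connected neighbors that lie inside the map.
            neighbors = [(r + dr, c + dc)
                         for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                         if 0 <= r + dr < rows and 0 <= c + dc < cols]
            dists = [dist(here, codebook[nr * cols + nc]) for nr, nc in neighbors]
            row.append(sum(dists) / len(dists) if dists else 0.0)
        result.append(row)
    return result
```

High values mark units whose model vectors differ strongly from their neighbors, which typically indicates a cluster border; a U-Matrix shows the same information per pair of adjacent units instead of aggregated per unit.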
Smoothed Data Histograms (SDH) [Pampalk et al., 2002]
- display the smoothed density of data items associated with areas of the map
- reveal clusters in the data: many pieces associated with a unit indicate a cluster center

Idea for smoothing / density estimation:
- voting matrix whose size equals the size of the SOM
- data items vote for a number N of best-matching units
- the best-matching unit gets N points, the 2nd best gets N-1 points, ..., the N-th best gets 1 point, all others get 0 points (N is a parameter, the "spread")
- the distribution of votes is visualized over the entire map, e.g., via a color map (interpolated voting matrix for smoothing)

SOM and SDH: An Example
[Figure: data space and visualization space; SDH panels for N = 1, 2, 5, 7, 10, and one further setting]

Smoothed Data Histograms
Be aware of the influence of the color scale on perception!

SOM and SDH: A Sample Application: neptune
- Input: music collection (digital audio files)
- calculate audio features for each track, e.g.
  - rhythmic [Pampalk, Islands of Music: Analysis, Organization, and Visualization of Music Archives, Diploma Thesis 2001]
  - timbral [Mandel & Ellis, Song-Level Features and Support Vector Machines for Music Classification, ISMIR 2005]
- train a SOM on the audio features
- calculate an SDH on the SOM
- visualize the SDH in 3D using the smoothed voting matrix of the SDH as height values
- build a game-like user interface to explore the user's (or someone else's) music collection

Matlab implementations of SOMs and SDHs (Toolboxes): (Google: SOM Toolbox) (Google: SDH Toolbox)
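The SDH voting scheme can be sketched in a few lines. This is a minimal illustration under assumptions: the codebook is a flat row-major list, Euclidean distance ranks the units, and the interpolation step used for display is left out. The names `sdh_votes` and `spread` are chosen for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def sdh_votes(data, codebook, rows, cols, spread):
    """Voting matrix of an SDH: each item gives N, N-1, ..., 1 points
    to its N = spread best-matching units; all other units get 0."""
    votes = [[0] * cols for _ in range(rows)]
    for x in data:
        # Rank all units by distance to x, closest first.
        ranked = sorted(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))
        for rank, unit in enumerate(ranked[:spread]):
            votes[unit // cols][unit % cols] += spread - rank
    return votes
```

With `spread=1` this reduces to a plain hit histogram (the number of items per BMU); larger values smear each item over several similar units, which is what smooths the density estimate.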
[Figures: neptune (2), neptune (3), neptune (4), neptune (5)]
Hierarchical Structuring: The Growing Hierarchical Self-Organizing Map (GHSOM) [Dittenbach et al., 2002]
[Figure: flat SOM vs. hierarchical SOM]

The GHSOM Algorithm
Start with 1 unit to expand (= mean of data), level 0.
Loop until there are no more units to expand:
1. For each unit to expand, create a new 2x2 SOM (initialize orientation)
2. Train the SOM on the data assigned to the parent unit
3. Decision 1: Insert new row or column? If yes: insert new row/column and goto 2
4. Decision 2: Hierarchically expand units of the map? If yes: add units to the expand list

Decision 1: Insert a new row or column if the mean quantization error > threshold (i.e., the map does not represent the data well); insert the new row or column between the unit with the highest quantization error and its adjacent unit with the largest distance.
Decision 2: Expand a unit if the quantization error of the unit > threshold (i.e., the unit does not represent its associated data items well).
Parameters: same as SOM (except no. of units) + 2 thresholds $\tau_1$, $\tau_2$

The GHSOM Algorithm: Decisions 1 and 2
Mean quantization error of unit $u_i$:
- Voronoi set $V_i$ of unit $u_i$: all data items whose BMU is $u_i$, i.e., $V_i = \{k \mid u_{c(k)} = u_i\}$
- $$\mathrm{MQE}_i = \frac{1}{|V_i|} \sum_{k \in V_i} \lVert x_k - m_i \rVert$$
- quantifies how well unit i approximates its data items

Mean quantization error of a SOM:
- $$\mathrm{MMQE} = \sum_i \frac{|V_i|}{n}\, \mathrm{MQE}_i$$
- quantifies how well a SOM approximates the data items

The GHSOM Algorithm: Decisions 1 (enlarge map) and 2 (insert new map)
Decision 1: Insert a new row or column if $\mathrm{MMQE} > \tau_1 \cdot \mathrm{MQE}_0$, where $\mathrm{MQE}_0$ is the MQE of a virtual unit $m_0$ representing the mean of all instances covered by the parent unit: $m_0 = \frac{1}{n} \sum_i x_i$
Decision 2: Expand a unit if $\mathrm{MQE}_i > \tau_2 \cdot \mathrm{MQE}_0^*$, where $\mathrm{MQE}_0^*$ is the mean quantization error of the whole dataset with respect to the virtual unit located in the center of the whole dataset (in contrast to $\mathrm{MQE}_0$, which is the mean quantization error of the data items in the respective sub-branch of the GHSOM).
Generally: $\tau_1$, $\tau_2$ are chosen such that $1 > \tau_1 \gg \tau_2 > 0$
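The quantities behind the two decisions can be computed as in the sketch below. It covers only the error measures and the threshold tests, not map growth or training; all function names are hypothetical, Euclidean distance is assumed, and a unit with an empty Voronoi set is given an MQE of 0 by convention for this example.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def unit_errors(data, codebook):
    """Per-unit mean quantization error MQE_i and Voronoi set sizes |V_i|."""
    sums = [0.0] * len(codebook)
    sizes = [0] * len(codebook)
    for x in data:
        c = min(range(len(codebook)), key=lambda i: dist(codebook[i], x))
        sums[c] += dist(codebook[c], x)
        sizes[c] += 1
    mqes = [s / n if n else 0.0 for s, n in zip(sums, sizes)]
    return mqes, sizes

def map_error(mqes, sizes):
    """MMQE = sum_i |V_i| / n * MQE_i."""
    n = sum(sizes)
    return sum(size / n * mqe for mqe, size in zip(mqes, sizes))

def virtual_unit_error(data):
    """MQE of a virtual unit m_0 placed at the mean of the given data items."""
    n = len(data)
    m0 = [sum(x[j] for x in data) / n for j in range(len(data[0]))]
    return sum(dist(x, m0) for x in data) / n

def should_grow(mmqe, tau1, mqe0_parent):
    """Decision 1: insert a new row or column?"""
    return mmqe > tau1 * mqe0_parent

def units_to_expand(mqes, tau2, mqe0_global):
    """Decision 2: indices of units that get their own sub-map."""
    return [i for i, mqe in enumerate(mqes) if mqe > tau2 * mqe0_global]
```

For a sub-map, `virtual_unit_error` would be called once on the data of the parent unit (for Decision 1) and once on the whole data set (for Decision 2), matching the distinction between $\mathrm{MQE}_0$ and $\mathrm{MQE}_0^*$ made above.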
The GHSOM Algorithm: Preservation of Orientation
Problem:
- maps of descendants of a unit $u_i$ could have arbitrary orientation
- no visible relationship between different sub-branches (other than the common parent map)

Solution: enforce/encourage a specific orientation of the sub-layer SOMs via initialization
- initialize the model vectors of the 2x2 SOMs such that they correspond to the orientation of the parent map
- for example: initialize the 4 model vectors with the means of the parent vector and each of its 4 immediate neighbors
- for border units: extrapolate "virtual" units. Example: if $u_i$ is located on the left border and the unit to its right is $u_r$, create a virtual left neighbor $u_l$ with $m_l = m_i + (m_i - m_r)$

Exercise: What could the initialization function for the codebook of a new sublevel SOM look like, expressed as a weighted combination of the parent unit and its neighbors?

GHSOM on Animals
[Figure: hierarchical map and hierarchical component planes]

GHSOM + SDH: deeptune
[Figure]

GHSOM + SDH: deeptune (II)
[Figure: different hierarchy levels]
Visualizing Effects of Changes in Data Definition: Aligned SOMs [Pampalk et al., 2003]
Goal: understand the relationship between different ways of representing the same data

Basic concepts:
- layers of mutually constrained SOMs (i.e., a stack of SOMs)
- each layer is trained on a slightly different data space / view of the data (i.e., different dimensions or distance definitions), but on the same data items
- trained so that all layers have the same orientation
- constraints between layers enforce smooth transitions between views

Use:
- exploratory analysis of alternative data representations
- visualize changes in the inherent structure of the data in response to changes in features, relative feature weights, different ways of normalizing features, different similarity functions, ...
- navigation through alternative data spaces

Aligned SOMs: The Basic Architecture
[Figure: stack of SOMs; parameter values from $p_{min}$ to $p_{max}$ define the different views of the data]
- distance between layers is defined relative to the distance between units in the same layer
- e.g., intra-SOM distance between neighboring units = 1, inter-SOM distance "between" the same map unit = 1/

Aligned SOM: Training (Online version, simplified)
Initialize all layers.
Loop:
- randomly select a training instance x and a layer l
- find the best matching unit for x in l
- adapt the neighborhood of the best matching unit (intra- and inter-layer neighborhood)
[Figure: neighborhood within a layer and between layers]

Aligned SOM: On-line Learning
Input:
- map of units $u_{li}$ with model vectors $m_{li}$ ("codebook"), where l denotes the layer
- training instances $X = \{x_i\}$
- a similarity measure sim(.,.) between data items (e.g., Euclidean distance)
- parameters: learning rate $\alpha(t) \in [0, 1]$ and a neighborhood kernel function with parameter $r(t)$ ("neighborhood radius"), e.g., pseudo-Gaussian $u_{ij}(t) = \exp\left(-d_{ij}^2 / r(t)^2\right)$, where $d_{ij}$ is the map distance between $u_{li}$ and $u_{kj}$

Online SOM Training Algorithm:
Initialize each unit (model vector) $m_{li}$ to represent a randomly selected data item (features weighted according to layer-specific weights, e.g., from 1:0 to 0:1).
Loop over time steps t, until convergence:
1. Randomly select an example x and a layer l; apply the weighting according to the view/data space of l, giving $x_l$
2. Find the winning unit (best matching unit) $u_c$ with $c = \arg\max_i \, \mathrm{sim}(m_{li}, x_l)$
3. Adapt the model vectors of all units in all layers as $m_{li}(t+1) = m_{li}(t) + \alpha(t)\, u_{ic}(t)\, [x_l - m_{li}(t)]$
4. Update (decrease) the training parameters $\alpha(t)$, $r(t)$
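A compact sketch of the online Aligned SOM procedure is given below. It is one possible reading of the steps above, with several assumptions that are not in the slides: features are split into two groups weighted p : (1 - p) per layer, every layer is updated with its own weighted view of x, the unit-to-unit distance across layers combines the map distance and the layer gap in Euclidean fashion, `layer_dist` is a hypothetical value for the inter-layer distance, and the decay schedules are linear.

```python
import math
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def weighted_view(x, split, p):
    """View of x in a layer: first `split` features scaled by p, the rest by 1 - p."""
    return [v * (p if j < split else 1.0 - p) for j, v in enumerate(x)]

def train_aligned(data, split, n_layers, rows, cols, steps=3000,
                  alpha0=0.5, r0=2.0, layer_dist=0.5, seed=0):
    """Train a stack of n_layers (>= 2) SOMs that share one orientation."""
    rng = random.Random(seed)
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Layer weights run from 1:0 (first layer) to 0:1 (last layer).
    ps = [1.0 - l / (n_layers - 1) for l in range(n_layers)]
    # Assumed: the same data item seeds a unit in every layer, in that layer's view.
    seeds = [rng.choice(data) for _ in coords]
    layers = [[weighted_view(s, split, p) for s in seeds] for p in ps]
    for t in range(steps):
        frac = 1.0 - t / steps
        alpha, radius = alpha0 * frac, max(r0 * frac, 0.1)
        x = rng.choice(data)                      # step 1: example and layer
        l = rng.randrange(n_layers)
        x_l = weighted_view(x, split, ps[l])
        c = min(range(len(coords)),               # step 2: BMU within layer l
                key=lambda i: sq_dist(layers[l][i], x_l))
        for k in range(n_layers):                 # step 3: adapt all layers
            x_k = weighted_view(x, split, ps[k])
            for i, m in enumerate(layers[k]):
                d2 = sq_dist(coords[i], coords[c]) + (layer_dist * (k - l)) ** 2
                h = alpha * math.exp(-d2 / radius ** 2)
                layers[k][i] = [mj + h * (xj - mj) for mj, xj in zip(m, x_k)]
    return layers, ps
```

A smaller `layer_dist` couples neighboring layers more tightly, so adjacent views stay similar; a large value lets each layer organize almost independently.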
Aligned SOM on Animals
[Figure]

Aligned SOM Demos

Literature
SOM:
- [Kohonen, 1982]: Kohonen, T. Self-Organizing Formation of Topologically Correct Feature Maps. Biological Cybernetics, 43.
- [Kohonen, 2001]: Kohonen, T. Self-Organizing Maps, volume 30 of Springer Series in Information Sciences. Springer, Berlin, Germany, 3rd edition.
- [Vesanto, 1999]: Vesanto, J. SOM-Based Data Visualization Methods. Intelligent Data Analysis 3(2).
- [Vesanto, 2002]: Vesanto, J. Data Exploration Process Based on the Self-Organizing Map. PhD thesis, Helsinki University of Technology, Espoo, Finland.
- [Pampalk et al., 2002]: Pampalk, E., Rauber, A., and Merkl, D. Using Smoothed Data Histograms for Cluster Visualization in Self-Organizing Maps. In Proceedings of the International Conference on Artificial Neural Networks (ICANN 2002), Madrid, Spain. Springer.
- [Knees et al., 2006]: Knees, P., Pohle, T., Schedl, M., and Widmer, G. Automatically Describing Music on a Map. In Proceedings of the 2nd Workshop on Learning the Semantics of Audio Signals (LSAS 2008), Paris, France, June.
- [Kaski et al., 1998]: WEBSOM: Self-Organizing Maps of Document Collections. Neurocomputing 21.
- [Lagus, Kaski, 1999]: Keyword Selection Method for Characterising Text Document Maps. In Proceedings of the International Conference on Artificial Neural Networks (ICANN 1999), London, UK.

Literature (II)
GHSOM:
- [Dittenbach et al., 2002]: Dittenbach, M., Rauber, A., and Merkl, D. Uncovering Hierarchical Structure in Data Using the Growing Hierarchical Self-Organizing Map. Neurocomputing, 48(1-4).
Aligned SOM:
- [Pampalk et al., 2003]: Pampalk, E., Goebl, W., and Widmer, G. Visualizing Changes in the Structure of Data for Exploratory Feature Selection. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003).
Artificial Neural Networks Unsupervised learning: SOM 01001110 01100101 01110101 01110010 01101111 01101110 01101111 01110110 01100001 00100000 01110011 01101011 01110101 01110000 01101001 01101110 01100001
More informationChapter 7: Competitive learning, clustering, and self-organizing maps
Chapter 7: Competitive learning, clustering, and self-organizing maps António R. C. Paiva EEL 6814 Spring 2008 Outline Competitive learning Clustering Self-Organizing Maps What is competition in neural
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Introduction Often, we have only a set of features x = x 1, x 2,, x n, but no associated response y. Therefore we are not interested in prediction nor classification,
More informationData Mining Chapter 3: Visualizing and Exploring Data Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University
Data Mining Chapter 3: Visualizing and Exploring Data Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Exploratory data analysis tasks Examine the data, in search of structures
More informationCHAPTER FOUR NEURAL NETWORK SELF- ORGANIZING MAP
96 CHAPTER FOUR NEURAL NETWORK SELF- ORGANIZING MAP 97 4.1 INTRODUCTION Neural networks have been successfully applied by many authors in solving pattern recognition problems. Unsupervised classification
More informationBringing Mobile Map Based Access to Digital Audio to the End User
Bringing Mobile Map Based Access to Digital Audio to the End User Robert Neumayer, Jakob Frank, Peter Hlavac, Thomas Lidy and Andreas Rauber Vienna University of Technology Department of Software Technology
More informationUnsupervised learning
Unsupervised learning Enrique Muñoz Ballester Dipartimento di Informatica via Bramante 65, 26013 Crema (CR), Italy enrique.munoz@unimi.it Enrique Muñoz Ballester 2017 1 Download slides data and scripts:
More information/00/$10.00 (C) 2000 IEEE
A SOM based cluster visualization and its application for false coloring Johan Himberg Helsinki University of Technology Laboratory of Computer and Information Science P.O. Box 54, FIN-215 HUT, Finland
More informationClustering Algorithms for general similarity measures
Types of general clustering methods Clustering Algorithms for general similarity measures general similarity measure: specified by object X object similarity matrix 1 constructive algorithms agglomerative
More informationSeismic facies analysis using generative topographic mapping
Satinder Chopra + * and Kurt J. Marfurt + Arcis Seismic Solutions, Calgary; The University of Oklahoma, Norman Summary Seismic facies analysis is commonly carried out by classifying seismic waveforms based
More informationInnovative User Interfaces for Accessing Music Libraries on Mobile Devices
Innovative User Interfaces for Accessing Music Libraries on Mobile Devices A SOM Based Music Browser for Mobile Devices Peter Hlavac Department of Software Technology, Vienna University of Technology Favoritenstrasse
More informationMethods for Intelligent Systems
Methods for Intelligent Systems Lecture Notes on Clustering (II) Davide Eynard eynard@elet.polimi.it Department of Electronics and Information Politecnico di Milano Davide Eynard - Lecture Notes on Clustering
More informationMUSI-6201 Computational Music Analysis
MUSI-6201 Computational Music Analysis Part 9.2: Music Similarity and Mood Recognition alexander lerch November 11, 2015 temporal analysis overview text book Chapter 8: Musical Genre, Similarity, and Mood
More informationProject Participants
Annual Report for Period:10/2004-10/2005 Submitted on: 06/21/2005 Principal Investigator: Yang, Li. Award ID: 0414857 Organization: Western Michigan Univ Title: Projection and Interactive Exploration of
More informationKnowledge Based Document Management System for Free-Text Documents Discovery
Knowledge Based Document Management System for Free-Text Documents Discovery 1 Paul D Manuel 2, Mostafa Ibrahim Abd-El Barr 3, S. Thamarai Selvi 4 2 Department of Information Science, College for Women
More informationFrom Improved Auto-taggers to Improved Music Similarity Measures
From Improved Auto-taggers to Improved Music Similarity Measures Klaus Seyerlehner 1, Markus Schedl 1, Reinhard Sonnleitner 1, David Hauger 1, and Bogdan Ionescu 2 1 Johannes Kepler University Department
More informationSOM+EOF for Finding Missing Values
SOM+EOF for Finding Missing Values Antti Sorjamaa 1, Paul Merlin 2, Bertrand Maillet 2 and Amaury Lendasse 1 1- Helsinki University of Technology - CIS P.O. Box 5400, 02015 HUT - Finland 2- Variances and
More informationUnsupervised Learning
Unsupervised Learning Unsupervised learning Until now, we have assumed our training samples are labeled by their category membership. Methods that use labeled samples are said to be supervised. However,
More informationProcess. Measurement vector (Feature vector) Map training and labeling. Self-Organizing Map. Input measurements 4. Output measurements.
Analysis of Complex Systems using the Self-Organizing Map Esa Alhoniemi, Olli Simula and Juha Vesanto Helsinki University of Technology Laboratory of Computer and Information Science P.O. Box 2200, FIN-02015
More informationLecture Topic Projects
Lecture Topic Projects 1 Intro, schedule, and logistics 2 Applications of visual analytics, basic tasks, data types 3 Introduction to D3, basic vis techniques for non-spatial data Project #1 out 4 Data
More informationData analysis and inference for an industrial deethanizer
Data analysis and inference for an industrial deethanizer Francesco Corona a, Michela Mulas b, Roberto Baratti c and Jose Romagnoli d a Dept. of Information and Computer Science, Helsinki University of
More informationData Exploration with PCA and Unsupervised Learning with Clustering Paul Rodriguez, PhD PACE SDSC
Data Exploration with PCA and Unsupervised Learning with Clustering Paul Rodriguez, PhD PACE SDSC Clustering Idea Given a set of data can we find a natural grouping? Essential R commands: D =rnorm(12,0,1)
More informationTime Series Prediction as a Problem of Missing Values: Application to ESTSP2007 and NN3 Competition Benchmarks
Series Prediction as a Problem of Missing Values: Application to ESTSP7 and NN3 Competition Benchmarks Antti Sorjamaa and Amaury Lendasse Abstract In this paper, time series prediction is considered as
More informationDecision Manifolds: Classification Inspired by Self-Organization
Decision Manifolds: Classification Inspired by Self-Organization Georg Pölzlbauer, Thomas Lidy, Andreas Rauber Institute of Software Technology and Interactive Systems Vienna University of Technology Favoritenstr.
More informationCluster analysis of 3D seismic data for oil and gas exploration
Data Mining VII: Data, Text and Web Mining and their Business Applications 63 Cluster analysis of 3D seismic data for oil and gas exploration D. R. S. Moraes, R. P. Espíndola, A. G. Evsukoff & N. F. F.
More informationCS570: Introduction to Data Mining
CS570: Introduction to Data Mining Scalable Clustering Methods: BIRCH and Others Reading: Chapter 10.3 Han, Chapter 9.5 Tan Cengiz Gunay, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei.
More informationMACHINE LEARNING: CLUSTERING, AND CLASSIFICATION. Steve Tjoa June 25, 2014
MACHINE LEARNING: CLUSTERING, AND CLASSIFICATION Steve Tjoa kiemyang@gmail.com June 25, 2014 Review from Day 2 Supervised vs. Unsupervised Unsupervised - clustering Supervised binary classifiers (2 classes)
More informationOrganizing and Visualizing Software Repositories Using the Growing Hierarchical Self-Organizing Map
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 22, 283-295 (2006) Organizing and Visualizing Software Repositories Using the Growing Hierarchical Self-Organizing Map SONGSRI TANGSRIPAIROJ AND M. H. SAMADZADEH
More informationPoints Lines Connected points X-Y Scatter. X-Y Matrix Star Plot Histogram Box Plot. Bar Group Bar Stacked H-Bar Grouped H-Bar Stacked
Plotting Menu: QCExpert Plotting Module graphs offers various tools for visualization of uni- and multivariate data. Settings and options in different types of graphs allow for modifications and customizations
More informationClustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin
Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014
More informationCluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1
Cluster Analysis Mu-Chun Su Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Introduction Cluster analysis is the formal study of algorithms and methods
More informationDATA MINING LECTURE 7. Hierarchical Clustering, DBSCAN The EM Algorithm
DATA MINING LECTURE 7 Hierarchical Clustering, DBSCAN The EM Algorithm CLUSTERING What is a Clustering? In general a grouping of objects such that the objects in a group (cluster) are similar (or related)
More informationMeasure of Distance. We wish to define the distance between two objects Distance metric between points:
Measure of Distance We wish to define the distance between two objects Distance metric between points: Euclidean distance (EUC) Manhattan distance (MAN) Pearson sample correlation (COR) Angle distance
More informationUsing Self-Organizing Maps for Sentiment Analysis. Keywords Sentiment Analysis, Self-Organizing Map, Machine Learning, Text Mining.
Using Self-Organizing Maps for Sentiment Analysis Anuj Sharma Indian Institute of Management Indore 453331, INDIA Email: f09anujs@iimidr.ac.in Shubhamoy Dey Indian Institute of Management Indore 453331,
More informationSelf-organization of very large document collections
Chapter 10 Self-organization of very large document collections Teuvo Kohonen, Samuel Kaski, Krista Lagus, Jarkko Salojärvi, Jukka Honkela, Vesa Paatero, Antti Saarela Text mining systems are developed
More informationComponent Selection for the Metro Visualisation of the Self-Organising Map
Component Selection for the Metro Visualisation of the Self-Organising Map Robert Neumayer, Rudolf Mayer, and Andreas Rauber Vienna University of Technology, Department of Software Technology and Interactive
More informationClustering & Classification (chapter 15)
Clustering & Classification (chapter 5) Kai Goebel Bill Cheetham RPI/GE Global Research goebel@cs.rpi.edu cheetham@cs.rpi.edu Outline k-means Fuzzy c-means Mountain Clustering knn Fuzzy knn Hierarchical
More informationClustering. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
Clustering CE-717: Machine Learning Sharif University of Technology Spring 2016 Soleymani Outline Clustering Definition Clustering main approaches Partitional (flat) Hierarchical Clustering validation
More informationUnsupervised Learning
Unsupervised Learning Learning without Class Labels (or correct outputs) Density Estimation Learn P(X) given training data for X Clustering Partition data into clusters Dimensionality Reduction Discover
More informationClustering CS 550: Machine Learning
Clustering CS 550: Machine Learning This slide set mainly uses the slides given in the following links: http://www-users.cs.umn.edu/~kumar/dmbook/ch8.pdf http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap8_basic_cluster_analysis.pdf
More informationCOMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS
COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS Toomas Kirt Supervisor: Leo Võhandu Tallinn Technical University Toomas.Kirt@mail.ee Abstract: Key words: For the visualisation
More informationCS 664 Slides #11 Image Segmentation. Prof. Dan Huttenlocher Fall 2003
CS 664 Slides #11 Image Segmentation Prof. Dan Huttenlocher Fall 2003 Image Segmentation Find regions of image that are coherent Dual of edge detection Regions vs. boundaries Related to clustering problems
More informationBackground. Parallel Coordinates. Basics. Good Example
Background Parallel Coordinates Shengying Li CSE591 Visual Analytics Professor Klaus Mueller March 20, 2007 Proposed in 80 s by Alfred Insellberg Good for multi-dimensional data exploration Widely used
More informationTypes of general clustering methods. Clustering Algorithms for general similarity measures. Similarity between clusters
Types of general clustering methods Clustering Algorithms for general similarity measures agglomerative versus divisive algorithms agglomerative = bottom-up build up clusters from single objects divisive
More informationClustering. Lecture 6, 1/24/03 ECS289A
Clustering Lecture 6, 1/24/03 What is Clustering? Given n objects, assign them to groups (clusters) based on their similarity Unsupervised Machine Learning Class Discovery Difficult, and maybe ill-posed
More informationMineral Exploation Using Neural Netowrks
ABSTRACT I S S N 2277-3061 Mineral Exploation Using Neural Netowrks Aysar A. Abdulrahman University of Sulaimani, Computer Science, Kurdistan Region of Iraq aysser.abdulrahman@univsul.edu.iq Establishing
More informationLine Simplification Using Self-Organizing Maps
Line Simplification Using Self-Organizing Maps Bin Jiang Division of Geomatics, Dept. of Technology and Built Environment, University of Gävle, Sweden. Byron Nakos School of Rural and Surveying Engineering,
More informationClustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search
Informal goal Clustering Given set of objects and measure of similarity between them, group similar objects together What mean by similar? What is good grouping? Computation time / quality tradeoff 1 2
More informationECS 234: Data Analysis: Clustering ECS 234
: Data Analysis: Clustering What is Clustering? Given n objects, assign them to groups (clusters) based on their similarity Unsupervised Machine Learning Class Discovery Difficult, and maybe ill-posed
More informationClustering algorithms and introduction to persistent homology
Foundations of Geometric Methods in Data Analysis 2017-18 Clustering algorithms and introduction to persistent homology Frédéric Chazal INRIA Saclay - Ile-de-France frederic.chazal@inria.fr Introduction
More informationClustering Algorithms for Data Stream
Clustering Algorithms for Data Stream Karishma Nadhe 1, Prof. P. M. Chawan 2 1Student, Dept of CS & IT, VJTI Mumbai, Maharashtra, India 2Professor, Dept of CS & IT, VJTI Mumbai, Maharashtra, India Abstract:
More informationClustering and Visualisation of Data
Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some
More informationCluster Analysis and Visualization. Workshop on Statistics and Machine Learning 2004/2/6
Cluster Analysis and Visualization Workshop on Statistics and Machine Learning 2004/2/6 Outlines Introduction Stages in Clustering Clustering Analysis and Visualization One/two-dimensional Data Histogram,
More informationAn Approach to Automatically Tracking Music Preference on Mobile Players
An Approach to Automatically Tracking Music Preference on Mobile Players Tim Pohle, 1 Klaus Seyerlehner 1 and Gerhard Widmer 1,2 1 Department of Computational Perception Johannes Kepler University Linz,
More informationMachine Learning for Signal Processing Clustering. Bhiksha Raj Class Oct 2016
Machine Learning for Signal Processing Clustering Bhiksha Raj Class 11. 13 Oct 2016 1 Statistical Modelling and Latent Structure Much of statistical modelling attempts to identify latent structure in the
More information