Package mmtsne. July 28, 2017

Type Package
Title Multiple Maps t-SNE
Author Benjamin J. Radford
Maintainer Benjamin J. Radford <benjamin.radford@gmail.com>
Version 0.1.0
Description An implementation of multiple maps t-distributed stochastic neighbor embedding (t-SNE). Multiple maps t-SNE is a method for projecting high-dimensional data into several low-dimensional maps such that non-metric space properties are better preserved than they would be by a single map. Multiple maps t-SNE with only one map is equivalent to standard t-SNE. When projecting onto more than one map, multiple maps t-SNE estimates a set of latent weights that allow each point to contribute to one or more maps depending on similarity relationships in the original data. This implementation is a port of the original 'Matlab' library by Laurens van der Maaten. See Van der Maaten and Hinton (2012) <doi:10.1007/s10994-011-5273-4>. This material is based upon work supported by the United States Air Force and Defense Advanced Research Projects Agency (DARPA) under Contract No. FA8750-17-C-0020. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force and Defense Advanced Research Projects Agency. Distribution Statement A: Approved for Public Release; Distribution Unlimited.
License FreeBSD file LICENSE
LazyData TRUE
RoxygenNote 6.0.1
NeedsCompilation no
Repository CRAN
Date/Publication 2017-07-28 09:09:42 UTC

R topics documented:
hbeta
mmtsne
mmtsnep
p2sp
x2p

hbeta    Estimate perplexity and probability values

Description
hbeta returns the perplexity and probability values for a row of data D.

Usage
    hbeta(D, beta = 1)

Arguments
D       A distance vector.
beta    A constant scalar.


mmtsne    Multiple maps t-SNE

Description
mmtsne estimates a multiple maps t-distributed stochastic neighbor embedding (multiple maps t-SNE) model.

Usage
    mmtsne(X, no_maps = 1, no_dims = 2, perplexity = 30, max_iter = 500,
           momentum = 0.5, final_momentum = 0.8, mom_switch_iter = 250,
           eps = 1e-07)

Arguments
X                 A dataframe or matrix of N rows and D columns.
no_maps           The number of maps (positive whole number) to be estimated.
no_dims           The number of dimensions per map. Typical values are 2 or 3.
perplexity        The target perplexity for probability matrix construction. Commonly recommended values range from 5 to 30. Perplexity roughly corresponds to the expected number of neighbors per data point.
max_iter          The number of iterations to run.
momentum          Constant scaling factor for update momentum in gradient descent algorithm.
final_momentum    Constant scaling factor for update momentum in gradient descent algorithm after the momentum switch point.
mom_switch_iter   The iteration at which momentum switches from momentum to final_momentum.
eps               A small positive value near zero.

Details
mmtsne is a wrapper that performs multiple maps t-SNE on an input dataset, X. The function will pre-process X, an N by D matrix or dataframe, then call mmtsnep. The pre-processing steps include calls to x2p and p2sp to convert X into an N by N symmetrical joint probability matrix.

The mmtsnep code is an almost direct port of the original multiple maps t-SNE Matlab code by van der Maaten and Hinton (2012). mmtsne estimates a multidimensional array of N x no_dims x no_maps. Each map is an N x no_dims matrix of estimated t-SNE coordinates. When no_maps=1, multiple maps t-SNE reduces to standard t-SNE.

Value
A list that includes the following objects:
Y             An N x no_dims x no_maps array of predicted coordinates.
weights       An N x no_maps matrix of unscaled weights. A high weight on entry i, j indicates a
proportions   An N x no_maps matrix of scaled weights. A high weight on entry i, j indicates a

References
L.J.P. van der Maaten and G.E. Hinton. Visualizing Non-Metric Similarities in Multiple Maps. Machine Learning 87(1):33-55, 2012.

Examples
    # Load the iris dataset
    data("iris")

    # Estimate a mmtsne model with 2 maps, 2 dimensions each
    model <- mmtsne(iris[,1:4], no_maps=2, max_iter=100)

    # Plot the results side-by-side for inspection
    # Points scaled by map proportion weights plus constant factor
    par(mfrow=c(1,2))
    plot(model$Y[,,1], col=iris$Species, cex=model$proportions[,1] + 0.2)
    plot(model$Y[,,2], col=iris$Species, cex=model$proportions[,2] + 0.2)
    par(mfrow=c(1,1))
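The momentum, final_momentum and mom_switch_iter arguments of mmtsne describe a two-phase momentum schedule. The following is a minimal sketch of such a schedule in base R, applied to a toy quadratic objective rather than to the t-SNE cost. The function name momentum_descent, the learning rate lr, the exact switch condition and the toy objective are assumptions made for illustration; they are not part of mmtsne and may differ from what the package does internally.

```r
# Hypothetical helper (not part of mmtsne): gradient descent with a momentum
# term that switches from `momentum` to `final_momentum` at `mom_switch_iter`.
momentum_descent <- function(grad, y0, max_iter = 500, momentum = 0.5,
                             final_momentum = 0.8, mom_switch_iter = 250,
                             lr = 0.1) {
  y <- y0
  step <- rep(0, length(y0))  # previous update, initially zero
  for (iter in seq_len(max_iter)) {
    # Initial momentum before the switch point, final momentum from it onward
    mom <- if (iter < mom_switch_iter) momentum else final_momentum
    # Blend the previous update with the new gradient step
    step <- mom * step - lr * grad(y)
    y <- y + step
  }
  y
}

# Toy objective f(y) = sum((y - target)^2), whose gradient is 2 * (y - target)
target <- c(1, -2)
y_hat <- momentum_descent(function(y) 2 * (y - target), y0 = c(0, 0))
```

A lower momentum early on keeps the first updates cautious while the layout is still forming; the higher value afterwards lets consistent gradient directions accumulate.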

mmtsnep    Multiple maps t-SNE with symmetric probability matrix

Description
mmtsnep estimates a multiple maps t-distributed stochastic neighbor embedding (multiple maps t-SNE) model.

Usage
    mmtsnep(P, no_maps, no_dims = 2, max_iter = 500, momentum = 0.5,
            final_momentum = 0.8, mom_switch_iter = 250, eps = 1e-07)

Arguments
P                 An N x N symmetric joint probability distribution matrix. These can be constructed from an N by D matrix with x2p and p2sp. Alternatively, the wrapper function mmtsne will wrap the matrix construction and multiple maps t-SNE model estimation into a single step.
no_maps           The number of maps (positive whole number) to be estimated.
no_dims           The number of dimensions per map. Typical values are 2 or 3.
max_iter          The number of iterations to run.
momentum          Constant scaling factor for update momentum in gradient descent algorithm.
final_momentum    Constant scaling factor for update momentum in gradient descent algorithm after the momentum switch point.
mom_switch_iter   The iteration at which momentum switches from momentum to final_momentum.
eps               A small positive value near zero.

Details
This code is an almost direct port of the original multiple maps t-SNE Matlab code by van der Maaten and Hinton (2012). mmtsne estimates a multidimensional array of N x no_dims x no_maps. Each map is an N x no_dims matrix of estimated t-SNE coordinates. When no_maps=1, multiple maps t-SNE reduces to standard t-SNE.

Value
A list that includes the following objects:
Y             An N x no_dims x no_maps array of predicted coordinates.
weights       An N x no_maps matrix of unscaled weights. A high weight on entry i, j indicates a
proportions   An N x no_maps matrix of scaled weights. A high weight on entry i, j indicates a

References
L.J.P. van der Maaten and G.E. Hinton. Visualizing Non-Metric Similarities in Multiple Maps. Machine Learning 87(1):33-55, 2012.

Examples
    # Load the iris dataset
    data("iris")

    # Produce a symmetric joint probability matrix
    prob_matrix <- p2sp(x2p(as.matrix(iris[,1:4])))

    # Estimate a mmtsne model with 2 maps, 2 dimensions each
    model <- mmtsnep(prob_matrix, no_maps=2, max_iter=100)

    # Plot the results side-by-side for inspection
    # Points scaled by map proportion weights plus constant factor
    par(mfrow=c(1,2))
    plot(model$Y[,,1], col=iris$Species, cex=model$proportions[,1] + 0.2)
    plot(model$Y[,,2], col=iris$Species, cex=model$proportions[,2] + 0.2)
    par(mfrow=c(1,1))


p2sp    Probability matrix to symmetric probability matrix

Description
p2sp returns a symmetrical pair-wise joint probability matrix given an input probability matrix P.

Usage
    p2sp(P)

Arguments
P    An N x N probability matrix, like those produced by x2p.

Value
An N x N symmetrical matrix of pair-wise probabilities.
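To illustrate what turning a conditional probability matrix into a symmetric joint one involves, here is a minimal base R sketch following the standard t-SNE construction: average the matrix with its transpose, then rescale so all entries sum to one. The name symmetrize_probs and the toy matrix are hypothetical, and p2sp itself may differ in detail.

```r
# Hypothetical helper (not part of mmtsne): symmetrize a square matrix of
# conditional probabilities and normalize it into a joint distribution.
symmetrize_probs <- function(P) {
  stopifnot(is.matrix(P), nrow(P) == ncol(P))
  S <- (P + t(P)) / 2  # average entries (i, j) and (j, i)
  S / sum(S)           # rescale so the joint probabilities sum to 1
}

# Toy 3 x 3 conditional probability matrix: rows sum to 1, zero diagonal
P <- matrix(c(0.0, 0.7, 0.3,
              0.5, 0.0, 0.5,
              0.2, 0.8, 0.0), nrow = 3, byrow = TRUE)
S <- symmetrize_probs(P)
```

The input rows are conditional distributions, so entry (i, j) generally differs from entry (j, i); the result treats each pair of points symmetrically, which is the form mmtsnep expects for P.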

x2p    Data to probability matrix

Description
x2p returns a pair-wise conditional probability matrix given an input matrix X.

Usage
    x2p(X, perplexity = 30, tol = 1e-05)

Arguments
X             A data matrix with N rows.
perplexity    The target perplexity. Values between 5 and 50 are generally considered appropriate. Loosely translates into the expected number of neighbors per point.
tol           A small positive value.

Details
This function is an almost direct port of the original Python implementation by van der Maaten and Hinton (2008). It uses a binary search to estimate probability values for all pairwise-elements of X. The conditional Gaussian distributions should all be of equal perplexity.

Value
An N x N matrix of pair-wise probabilities.

References
L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research 9(Nov):2579-2605, 2008.
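The binary search described for x2p can be sketched for a single row of distances as follows. This is an illustrative base R version modeled on the commonly circulated reference t-SNE code, not the package's own source: the helper names entropy_and_probs and search_beta, the max_tries cap and the toy distances are assumptions. The first helper returns the Shannon entropy H in nats, so exp(H) is the perplexity.

```r
# Hypothetical helper: entropy H and normalized Gaussian probabilities for one
# row of squared distances D at precision beta.
entropy_and_probs <- function(D, beta = 1) {
  P <- exp(-D * beta)
  sum_P <- sum(P)
  H <- log(sum_P) + beta * sum(D * P) / sum_P
  list(H = H, P = P / sum_P)
}

# Hypothetical helper: bisect on beta until exp(H) matches the target perplexity.
search_beta <- function(D, perplexity = 30, tol = 1e-5, max_tries = 50) {
  beta <- 1
  beta_min <- -Inf
  beta_max <- Inf
  log_perp <- log(perplexity)
  res <- entropy_and_probs(D, beta)
  tries <- 0
  while (abs(res$H - log_perp) > tol && tries < max_tries) {
    if (res$H > log_perp) {
      # Entropy too high (distribution too flat): raise beta
      beta_min <- beta
      beta <- if (is.finite(beta_max)) (beta + beta_max) / 2 else beta * 2
    } else {
      # Entropy too low (distribution too peaked): lower beta
      beta_max <- beta
      beta <- if (is.finite(beta_min)) (beta + beta_min) / 2 else beta / 2
    }
    res <- entropy_and_probs(D, beta)
    tries <- tries + 1
  }
  list(beta = beta, H = res$H, P = res$P)
}

# Toy squared distances from one point to nine others, target perplexity 3
D <- (1:9)^2
fit <- search_beta(D, perplexity = 3)
```

Repeating this search for every row, each with its own beta, is what gives all the conditional distributions the same perplexity.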
