The som Package. September 19, Description Self-Organizing Map (with application in gene clustering)

Similar documents
Contents 1 General 3 2 The principle of the SOM 4 3 Practical advices for the construction of good maps 6 4 Installation of the program package

Package compeir. February 19, 2015

Package sspline. R topics documented: February 20, 2015

The nor1mix Package. August 3, 2006

The wccsom Package. February 10, 2006

Package Rramas. November 25, 2017

How to convert a numeric value into English words in Excel

The nor1mix Package. June 12, 2007

The bnlearn Package. September 21, 2007

Package FlowSOM. June 19, 2018

Supervised vs.unsupervised Learning

Package gains. September 12, 2017

Package JBTools. R topics documented: June 2, 2015

Package Daim. February 15, 2013

A Comparison of Word Frequency and N-Gram Based Vulnerability Categorization Using SOM

Package rknn. June 9, 2015

Package Rsomoclu. January 3, 2019

Package qrfactor. February 20, 2015

Ratios can be written in several different ways. Using a colon. Using the word to. Or as a fraction.

Package FlowSOM. April 14, 2017

Package dyncorr. R topics documented: December 10, Title Dynamic Correlation Package Version Date

Package MixSim. April 29, 2017

Package Numero. November 24, 2018

The sspline Package. October 11, 2007

Package macorrplot. R topics documented: October 2, Title Visualize artificial correlation in microarray data. Version 1.50.

Statistics 251: Statistical Methods

Package munfold. R topics documented: February 8, Type Package. Title Metric Unfolding. Version Date Author Martin Elff

Package longclust. March 18, 2018

Stat 290: Lab 2. Introduction to R/S-Plus

Package ic10. September 23, 2015

Package listdtr. November 29, 2016

1 Case study of SVM (Rob)

Matrix algebra. Basics

Package DTRlearn. April 6, 2018

Package Mondrian. R topics documented: March 4, Type Package

Package tclust. May 24, 2018

Package cycle. March 30, 2019

Artificial Neural Networks Unsupervised learning: SOM

Package mlegp. April 15, 2018

Package MmgraphR. R topics documented: September 14, Type Package

Double Self-Organizing Maps to Cluster Gene Expression Data

Package svmpath. R topics documented: August 30, Title The SVM Path Algorithm Date Version Author Trevor Hastie

Package EnQuireR. R topics documented: February 19, Type Package Title A package dedicated to questionnaires Version 0.

Practice for Learning R and Learning Latex

Package ConvergenceClubs

Package fso. February 19, 2015

Package mmeta. R topics documented: March 28, 2017

Package generxcluster

Basic matrix math in R

Today. Gradient descent for minimization of functions of real variables. Multi-dimensional scaling. Self-organizing maps

Package bdots. March 12, 2018

Package feature. R topics documented: July 8, Version Date 2013/07/08

Package radiomics. March 30, 2018

The Hitchhiker s Guide to TensorFlow:

PROMO 2017a - Tutorial

Figure (5) Kohonen Self-Organized Map

Package intccr. September 12, 2017

Place Value. Unit 1 Lesson 1

Package mpm. February 20, 2015

Package pvclust. October 23, 2015

Package CausalImpact

Package panelview. April 24, 2018

Package qvcalc. R topics documented: September 19, 2017

Package feature. R topics documented: October 26, Version Date

The supclust Package

17. [Exploring Numbers]

Package ipft. January 4, 2018

Package replicatedpp2w

Package edci. May 16, 2018

Package NGScopy. April 10, 2015

Package corclass. R topics documented: January 20, 2016

Package CINID. February 19, 2015

Package gppm. July 5, 2018

Package splicegear. R topics documented: December 4, Title splicegear Version Author Laurent Gautier

CSE 6242 A / CS 4803 DVA. Feb 12, Dimension Reduction. Guest Lecturer: Jaegul Choo

Clustering and Visualisation of Data

Package areaplot. October 18, 2017

Package pomdp. January 3, 2019

> glucose = c(81, 85, 93, 93, 99, 76, 75, 84, 78, 84, 81, 82, 89, + 81, 96, 82, 74, 70, 84, 86, 80, 70, 131, 75, 88, 102, 115, + 89, 82, 79, 106)

Package GLDreg. February 28, 2017

Package mixphm. July 23, 2015

Package GateFinder. March 17, 2019

Package spikes. September 22, 2016

Package Mfuzz. R topics documented: March 26, Version Date Title Soft clustering of time series gene expression data

Package binmto. February 19, 2015

Exploring and Understanding Data Using R.

Package beast. March 16, 2018

Package restlos. June 18, 2013

Package beanplot. R topics documented: February 19, Type Package

limma: A brief introduction to R

Package bayesdp. July 10, 2018

Package sparsereg. R topics documented: March 10, Type Package

Package complmrob. October 18, 2015

The binmto Package. August 25, 2007

Package uclaboot. June 18, 2003

Package tpr. R topics documented: February 20, Type Package. Title Temporal Process Regression. Version

Package esabcv. May 29, 2015

Package bayescl. April 14, 2017

Clustering with Reinforcement Learning

Package r.jive. R topics documented: April 12, Type Package

Transcription:

The som Package September 19, 2004 Version 0.3-4 Date 2004-09-18 Title Self-Organizing Map Author Maintainer Depends R (>= 1.9.0) Self-Organizing Map (with application in gene clustering) License GPL version 2 or later R topics documented: filtering......................................... 1 normalize........................................ 2 plot.som......................................... 3 qerror.......................................... 4 som-internal....................................... 4 som........................................... 5 summary.som...................................... 7 yeast........................................... 7 Index 9 filtering Filter data before feeding som algorithm for gene expression data Filtering data by certain floor, ceiling, max/min ratio, and max - min difference. filtering(x, lt=20, ut=16000, mmr=3, mmd=200) 1

2 normalize x lt ut mmr mmd a data frame or matrix of input data. floor value replaces those less than it with the value ceiling value replaced those greater than it with the value the max/min ratio, rows with max/min < mmr will be removed the max - min difference, rows with (max - min) < mmd will be removed Value An dataframe or matrix after the filtering See Also normalize. normalize normalize data before feeding som algorithm Normalize the data so that each row has mean 0 and variance 1. normalize(x, byrow=true) x byrow a data frame or matrix of input data. whether normalizing by row or by column, default is byrow. Value An dataframe or matrix after the normalizing. See Also filtering.

plot.som 3 plot.som Visualizing a SOM Plot the SOM in a 2-dim map with means and sd bars. ## S3 method for class 'som': plot(x, sdbar=1, ylim=c(-3, 3), color=true, ntik=3, yadj=0.1, xlab="", ylab="",...) x a som object sdbar the length of sdbar in sd, no sdbar if sdbar=0 ylim the range of y axies in each cell of the map color whether or not use color plotting ntik the number of tiks of the vertical axis yadj the proportion used to put the number of obs xlab x label ylab y label... other options to plot Note This function is not cleanly written. The original purpose was to mimic what GENECLUS- TER does. The ylim is hardcoded so that only standardized data could be properly plotted. There are visualization methods like umat and sammon in SOM PAK3.1, but not implemented here. Examples foo <- som(matrix(rnorm(1000), 250), 3, 5) plot(foo, ylim=c(-1, 1))

4 som-internal qerror quantization accuracy get the average distortion measure qerror(obj, err.radius=1) obj err.radius a som object radius used calculating qerror Value An average of the following quantity (weighted distance measure) over all x in the sample, x mi h ci where h ci is the neighbourhood kernel for the ith code. Examples foo <- som(matrix(rnorm(1000), 100), 2, 4) qerror(foo, 3) som-internal Internal som functions Internal som Functions. ciplot(x, y, se, n=1, d, ywindow) inrange(x, xlim) plotcell(x, y, dat, code, n, sdbar=1, ylim, yadj) somgrids(xdim, ydim, color, yadj=0.1, hexa, ntik, ylim) somsum(obj) som.par(obj) Details These are not to be called directly by the user.

som 5 som Function to train a Self-Organizing Map Produces an object of class som which is a Self-Organizing Map fit of the data. som.init(data, xdim, ydim, init="linear") som(data, xdim, ydim, init="linear", alpha=null, alphatype="inverse", neigh="gaussian", topol="rect", radius=null, rlen=null, err.radius=1, inv.alp.c=null) som.train(data, code, xdim, ydim, alpha=null, alphatype="inverse", neigh="gaussian", topol="rect", radius=null, rlen=null, err.radius=1, inv.alp.c=null) som.update(obj, alpha = NULL, radius = NULL, rlen = NULL, err.radius = 1, inv.alp.c = NULL) som.project(obj, newdat) obj newdat code data xdim ydim init alpha alphatype neigh topol radius rlen err.radius a som object. a new dataset needs to be projected onto the map. a matrix of initial code vector in the map. a data frame or matrix of input data. an integer specifying the x-dimension of the map. an integer specifying the y-dimension of the map. a character string specifying the initializing method. The following are permitted: "sample" uses a radom sample from the data; "random" uses random draws from N(0,1); "linear" uses the linear grids upon the first two principle components directin. a vector of initial learning rate parameter for the two training phases. Decreases linearly to zero during training. a character string specifying learning rate funciton type. Possible choices are linear function ("linear") and inverse-time type function ("inverse"). a character string specifying the neighborhood function type. The following are permitted: "bubble" "gaussian" a character string specifying the topology type when measuring distance in the map. The following are permitted: "hexa" "rect" a vector of initial radius of the training area in som-algorithm for the two training phases. Decreases linearly to one during training. a vector of running length (number of steps) in the two training phases. a numeric value specifying the radius when calculating average distortion measure. inv.alp.c the constant C in the inverse learning rate function: alpha0 * C / (C + t);

6 som Value som.init initializes a map and returns the code matrix. som does the two-step som training in a batch fashion and return a som object. som.train takes data, code, and traing parameters and perform the requested som training. som.update takes a som object and further train it with updated paramters. som.project projects new data onto the map. An object of class "som" representing the fit, which is a list containing the following components: data init xdim ydim code visual alpha0 alpha neigh topol radius0 rlen qerror code.sum the dataset on which som was applied. a character string indicating the initializing method. an integer specifying the x-dimension of the map. an integer specifying the y-dimension of the map. a metrix with nrow = xdim*ydim, each row corresponding to a code vector of a cell in the map. The mapping from cell coordinate (x, y) to the row index in the code matrix is: rownumber = x + y * xdim a data frame of three columns, with the same number of rows as in data: x and y are the coordinate of the corresponding observation in the map, and qerror is the quantization error computed as the squared distance (depends topol) between the observation vector and its coding vector. a vector of initial learning rate parameter for the two training phases. a character string specifying learning rate funciton type. a character string specifying the neighborhood function type. a character string specifying the topology type when measuring distance in the map. a vector of initial radius of the training area in som-algorithm for the two training phases. a vector of running length in the two training phases. a numeric value of average distortion measure. a dataframe summaries the number of observations in each map cell. References Kohonen, Hynninen, Kangas, and Laaksonen (1995), SOM-PAK, the Self-Organizing Map Program Package (version 3.1). http://www.cis.hut.fi/research/papers/som tr96.ps.z Examples data(yeast) yeast <- yeast[, -c(1, 11)] yeast.f <- filtering(yeast) yeast.f.n <- normalize(yeast.f) foo <- som(yeast.f.n, xdim=5, ydim=6) foo <- som(yeast.f.n, xdim=5, ydim=6, topol="hexa", neigh="gaussian") plot(foo)

summary.som 7 summary.som summarize a som object print out the configuration parameters of a som object ## S3 method for class 'som': summary(object,...) ## S3 method for class 'som': print(x,...) object, x a som object... nothing yet yeast yeast cell cycle The yeast data frame has 6601 rows and 18 columns, i.e., 6601 genes, measured at 18 time points. data(yeast) Format This data frame contains the following columns: Gene a character vector of gene names zero a numeric vector ten a numeric vector twenty a numeric vector thirty a numeric vector fourty a numeric vector fifty a numeric vector sixty a numeric vector seventy a numeric vector

8 yeast Source eighty a numeric vector ninety a numeric vector hundred a numeric vector one.ten a numeric vector one.twenty a numeric vector one.thirty a numeric vector one.fourty a numeric vector one.fifty a numeric vector one.sixty a numeric vector http://genomics.stanford.edu References Tamayo et. al. (1999), Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation, PNAS V96, pp2907-2912, March 1999.

Index Topic arith qerror, 3 Topic cluster som, 4 Topic datasets yeast, 7 Topic hplot plot.som, 2 Topic internal som-internal, 4 Topic manip filtering, 1 normalize, 2 Topic print summary.som, 6 ciplot (som-internal), 4 filtering, 1, 2 inrange (som-internal), 4 normalize, 2, 2 plot.som, 2 plotcell (som-internal), 4 print.som (summary.som), 6 qerror, 3 som, 4 som-internal, 4 som.par (som-internal), 4 somgrids (som-internal), 4 somsum (som-internal), 4 summary.som, 6 yeast, 7 9