Package Linnorm. October 12, 2016

Type Package
Title Linear model and normality based transformation method (Linnorm)
Version 1.0.6
Date 2016-08-27
Author Shun Hang Yip <shunyip@bu.edu>, Panwen Wang <pwwang@pwwang.com>, Jean-Pierre Kocher <Kocher.JeanPierre@mayo.edu>, Pak Chung Sham <pcsham@hku.hk>, Junwen Wang <junwen@uw.edu>
Maintainer Ken Shun Hang Yip <shunyip@bu.edu>
Description Linnorm is an R package for the analysis of RNA-seq, scRNA-seq, ChIP-seq count data or any large scale count data. Its main function is to normalize and transform these datasets for parametric tests. Examples of parametric tests include using limma for differential expression analysis or differential peak detection, or calculating Pearson correlation coefficient for gene correlation study. Linnorm can work with raw count, CPM, RPKM, FPKM and TPM. Additionally, Linnorm provides the RnaXSim function for the simulation of RNA-seq raw counts for the evaluation of differential expression analysis methods. RnaXSim can simulate RNA-seq dataset in Gamma, Log Normal, Negative Binomial or Poisson distributions.
License MIT + file LICENSE
Imports Rcpp (>= 0.12.2), RcppArmadillo, MASS, limma, stats, utils, statmod
LinkingTo Rcpp, RcppArmadillo
Suggests BiocStyle, knitr, rmarkdown, seqc, gplots, RColorBrewer
VignetteBuilder knitr
biocViews Sequencing, ChIPSeq, RNASeq, DifferentialExpression, GeneExpression, Genetics, Normalization, Software, Transcription, BatchEffect
NeedsCompilation yes
URL http://www.jjwanglab.org/linnorm/
RoxygenNote 5.0.1

R topics documented:

    Linnorm
    Linnorm.limma
    RnaXSim
    Index

Linnorm                 Linnorm Transformation Function

Description

    This function performs the Linear model and normality based transformation method (Linnorm) for RNA-seq expression data or large scale count data.

Usage

    Linnorm(datamatrix, showinfo = FALSE, method = "default",
            perturbation = 10, minZeroPortion = 2/3)

Arguments

    datamatrix
        The matrix or data frame that contains your dataset. Each row is a feature (or gene) and each column is a sample (or replicate). Undefined values such as NA are not supported.

    showinfo
        Logical. Show the calculated lambda value. Defaults to FALSE.

    method
        "default" or "lambda". If the method is "default", the program outputs the transformed matrix. If the method is "lambda", the program outputs a lambda value.

    perturbation
        Integer >= 2. To search for an optimal minimal deviation parameter (please see the article), Linnorm uses the iterated local search algorithm, which perturbs away from the initial local minimum. The range of the area searched in each perturbation increases exponentially as the area gets farther from the initial local minimum, as determined by its index. This range is calculated as 10 * (perturbation ^ index).

    minZeroPortion
        Double >= 0, <= 1. Features without at least this portion of non-zero values are not used in the calculation of the normalizing parameter. Defaults to 2/3.

Details

    If method is "default", Linnorm outputs a transformed expression matrix. If method is "lambda", the output is a single lambda value. Users who already have a lambda value can obtain the Linnorm transformed dataset with log1p(lambda * datamatrix), so there is no need to rerun the program once lambda has been calculated.
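The argument descriptions and the Details above can be made concrete with a few lines of base R. This is only a sketch under stated assumptions: the count matrix is invented, the lambda value 0.5 is a hypothetical stand-in for a value returned by Linnorm(..., method = "lambda"), and the index values 1 to 3 are chosen purely for illustration. It does not reproduce Linnorm's internal search.

```r
# Toy count matrix: rows are features, columns are samples (values invented).
counts <- matrix(
  c( 10, 12,   9, 11,
      0,  0,   5,  0,
      3,  0,   4,  2,
    100, 90, 110, 95),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:4))
)

# minZeroPortion: a feature is used only if at least this portion
# of its values is non-zero.
minZeroPortion <- 2/3
nonzero_portion <- rowMeans(counts != 0)
used <- nonzero_portion >= minZeroPortion
print(used)

# perturbation: search range 10 * (perturbation ^ index).
# The index values below are illustrative assumptions.
perturbation <- 10
index <- 1:3
search_range <- 10 * (perturbation ^ index)
print(search_range)

# Reusing a known lambda: log1p(lambda * datamatrix).
# lambda = 0.5 is hypothetical, not a value computed by Linnorm.
lambda <- 0.5
transformed <- log1p(lambda * counts)
print(round(transformed, 3))
```

Because log1p(0) is 0, zero counts stay at zero after the transformation, and the result keeps the dimensions and names of the input matrix.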

Value

    This function returns either a transformed data matrix or a lambda value.

Examples

    # Obtain example matrix:
    library(seqc)
    SampleA <- ILM_aceview_gene_BGI[, grepl("A_", colnames(ILM_aceview_gene_BGI))]
    rownames(SampleA) <- ILM_aceview_gene_BGI[, 2]
    # Extract a portion of the matrix for an example
    expMatrix <- SampleA[, 1:3]
    # Transformed expression matrix
    transformedExp <- Linnorm(expMatrix)
    # Lambda value only
    transformedExp <- Linnorm(expMatrix, method = "lambda")

Linnorm.limma           Linnorm-limma pipeline for Differential Expression Analysis

Description

    This function first performs the Linnorm transformation on the dataset. It then runs limma for DEG analysis. Finally, it corrects the fold change output of limma, which would otherwise be wrong. Please cite both Linnorm and limma when you use this function for publications.

Usage

    Linnorm.limma(datamatrix, design = NULL, output = "DEResults",
                  noINF = TRUE, showinfo = FALSE, perturbation = 10,
                  minZeroPortion = 2/3, robust = TRUE)

Arguments

    datamatrix
        The matrix or data frame that contains your dataset. Each row is a feature (or gene) and each column is a sample (or replicate). Undefined values such as NA are not supported.

    design
        A design matrix required for limma. Please see limma's documentation or our vignettes for more detail.

    output
        Character. "DEResults" or "Both". Set to "DEResults" to output a matrix that contains Differential Expression Analysis Results. Set to "Both" to output a list that contains both Differential Expression Analysis Results and the transformed data matrix.

    noINF
        Logical. Prevent generating INF in the fold change column by using Linnorm's lambda and adding one. If it is set to FALSE, INF will be generated if one of the conditions has zero expression. Defaults to TRUE.

    showinfo
        Logical. Show the calculated lambda value. Defaults to FALSE.

    perturbation
        Integer >= 2. To search for an optimal minimal deviation parameter (please see the article), Linnorm uses the iterated local search algorithm, which perturbs away from the initial local minimum. The range of the area searched in each perturbation increases exponentially as the area gets farther from the initial local minimum, as determined by its index. This range is calculated as 10 * (perturbation ^ index).

    minZeroPortion
        Double >= 0, <= 1. Features without at least this portion of non-zero values are not used in the calculation of the normalizing parameter. Defaults to 2/3.

    robust
        Logical. Whether to run limma's eBayes function with the robust setting (TRUE or FALSE). Defaults to TRUE.

Details

    This function performs both Linnorm and limma for users who are interested in differential expression analysis. Please note that if you use a Linnorm normalized dataset directly with limma, the output fold change and average expression will be wrong. (p values and adjusted p values will be fine.) This is because the voom-limma pipeline assumes input to be in raw counts. This function is written to fix this problem.

Value

    If output is set to "DEResults", this function outputs a matrix of Differential Expression Analysis Results with the following columns:

    logFC      Log 2 fold change.
    XPM        Average expression. If the input is raw count or CPM, this column has the CPM unit. If the input is RPKM, FPKM or TPM, this column has the TPM unit.
    t          Moderated t-statistic.
    P.Value    p value.
    adj.P.Val  Adjusted p value. This is also called False Discovery Rate or q value.
    B          Log odds that the feature is differential.

    If output is set to "Both", this function outputs a list with the following objects:

    DEResults  Differential Expression Analysis Results as described above.
    TMatrix    A Linnorm transformed expression matrix.
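As a rough illustration of how the result matrix described above might be used, the following base R sketch filters and ranks a mock DEResults matrix. All numeric values are invented, the column names follow the Value section (written in limma's capitalisation, logFC and adj.P.Val), and the cut-offs of 0.05 and a two-fold change are assumptions chosen for the example, not recommendations from the package.

```r
# Mock DEResults matrix with the documented columns; all values are invented.
DEResults <- matrix(
  c( 2.1, 35.0,  6.2, 1e-5, 4e-4,  3.1,
    -0.2, 80.5, -0.7, 0.48, 0.60, -5.9,
    -1.6, 12.3, -4.9, 2e-4, 3e-3,  0.8,
     0.9,  5.4,  2.0, 0.06, 0.12, -3.7),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4),
                  c("logFC", "XPM", "t", "P.Value", "adj.P.Val", "B"))
)

# Keep features with adjusted p value < 0.05 and at least a two-fold
# change in either direction (|log2 fold change| >= 1).
significant <- DEResults[, "adj.P.Val"] < 0.05 &
  abs(DEResults[, "logFC"]) >= 1
hits <- DEResults[significant, , drop = FALSE]

# Rank the hits by adjusted p value, most significant first.
hits <- hits[order(hits[, "adj.P.Val"]), , drop = FALSE]
print(hits)
```

With output = "Both", the same filtering would apply to the DEResults element of the returned list.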
Examples

    # Obtain example matrix:
    library(seqc)
    SampleA <- ILM_aceview_gene_BGI[, grepl("A_", colnames(ILM_aceview_gene_BGI))]
    rownames(SampleA) <- ILM_aceview_gene_BGI[, 2]
    SampleB <- ILM_aceview_gene_BGI[, grepl("B_", colnames(ILM_aceview_gene_BGI))]
    rownames(SampleB) <- ILM_aceview_gene_BGI[, 2]
    # Extract a portion of the matrix for an example
    expMatrix <- cbind(SampleA[, 1:3], SampleB[, 1:3])
    # Build a two-group design matrix (three replicates per group)
    designmatrix <- c(1, 1, 1, 2, 2, 2)
    designmatrix <- model.matrix(~ 0 + factor(designmatrix))
    colnames(designmatrix) <- c("group1", "group2")
    rownames(designmatrix) <- colnames(expMatrix)
    # Example 1: differential expression results only
    DEGResults <- Linnorm.limma(expMatrix, designmatrix)
    # Example 2: results plus the transformed matrix
    DEGResults <- Linnorm.limma(expMatrix, designmatrix, output = "Both")

RnaXSim                 This function simulates an RNA-seq dataset based on a given distribution.

Description

    This function simulates an RNA-seq dataset based on a given distribution.

Usage

    RnaXSim(thisdata, distribution = "Poisson", NumRep = 3, NumDiff = 2000,
            NumFea = 20000, showinfo = FALSE, DEGlog2FC = "Auto",
            MaxLibSizelog2FC = 0.5)

Arguments

    thisdata
        Matrix: The matrix or data frame that contains your dataset. Each row is a gene and each column is a replicate. Undefined values such as NA are not supported. This program assumes that all columns are replicates of the same sample.

    distribution
        Character: Defaults to "Poisson". This parameter controls the output distribution of the simulated RNA-seq dataset. It can be one of "Gamma" (Gamma distribution), "Poisson" (Poisson distribution), "LogNorm" (Log Normal distribution) or "NB" (Negative Binomial distribution).

    NumRep
        Integer: The number of replicates. This is half of the number of output samples. Defaults to 3.

    NumDiff
        Integer: The number of differentially changed features. Defaults to 2000.

    NumFea
        Integer: The total number of features. Defaults to 20000.

    showinfo
        Logical: Whether to show data information on the console. Defaults to FALSE.

    DEGlog2FC
        "Auto" or Double: The log 2 fold change threshold that defines differentially expressed genes. If set to "Auto", DEGlog2FC is defined at the level where ANOVA can get a q value of 0.05 with the median mean, where the data values are log1p transformed. Defaults to "Auto".

    MaxLibSizelog2FC
        Double: The maximum library size difference from the mean that is allowed, in terms of log 2 fold change. Set to 0 to prevent the program from generating library size differences. Defaults to 0.5.

Value

    This function returns a list that contains a matrix of integer raw counts and a vector that shows which genes are differentially expressed. In the matrix, each row is a gene and each column is a replicate. The first NumRep columns belong to sample 1, and the last NumRep columns belong to sample 2. The matrix has NumFea rows.

Examples

    # Obtain example matrix:
    library(seqc)
    SampleA <- ILM_aceview_gene_BGI[, grepl("A_", colnames(ILM_aceview_gene_BGI))]
    rownames(SampleA) <- ILM_aceview_gene_BGI[, 2]
    # Extract a portion of the matrix for an example
    expMatrix <- SampleA[, 1:10]
    # Example for Negative Binomial distribution
    simulateddata <- RnaXSim(expMatrix, distribution = "NB", NumRep = 3,
                             NumDiff = 200, NumFea = 2000)
    # Example for Poisson distribution
    simulateddata <- RnaXSim(expMatrix, distribution = "Poisson", NumRep = 3,
                             NumDiff = 200, NumFea = 2000)
    # Example for Log Normal distribution
    simulateddata <- RnaXSim(expMatrix, distribution = "LogNorm", NumRep = 3,
                             NumDiff = 200, NumFea = 2000)
    # Example for Gamma distribution
    simulateddata <- RnaXSim(expMatrix, distribution = "Gamma", NumRep = 3,
                             NumDiff = 200, NumFea = 2000)
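To visualise the output layout described in the Value section, the base R sketch below builds a mock object of the same shape. It is not RnaXSim's algorithm: the baseline means, the fixed two-fold change, the dispersion (size = 10) and the list element names data and DE are all assumptions made for illustration only.

```r
set.seed(1)

NumRep  <- 3     # replicates per sample
NumFea  <- 2000  # total number of features (rows)
NumDiff <- 200   # number of differentially expressed features

# Invented baseline mean expression for each feature.
base_mean <- rgamma(NumFea, shape = 2, rate = 0.02)

# Pick the differential features and give them a two-fold change in
# sample 2 (the fold change value is an assumption of this sketch).
de_index <- sort(sample(NumFea, NumDiff))
fold <- rep(1, NumFea)
fold[de_index] <- 2

# Negative Binomial counts; each column reuses the per-feature means.
sample1 <- matrix(rnbinom(NumFea * NumRep, size = 10,
                          mu = rep(base_mean, NumRep)), nrow = NumFea)
sample2 <- matrix(rnbinom(NumFea * NumRep, size = 10,
                          mu = rep(base_mean * fold, NumRep)), nrow = NumFea)

# First NumRep columns are sample 1, last NumRep columns are sample 2.
counts <- cbind(sample1, sample2)
colnames(counts) <- c(paste0("sample1_rep", 1:NumRep),
                      paste0("sample2_rep", 1:NumRep))

# Hypothetical container mirroring "a matrix plus a vector of DE genes".
simulated <- list(data = counts, DE = de_index)
print(dim(simulated$data))
print(length(simulated$DE))
```

A matrix laid out this way pairs naturally with a two-group design matrix such as model.matrix(~ 0 + factor(rep(1:2, each = NumRep))).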

Index

    Topic CPM
    Topic Count
    Topic Expression
    Topic FPKM
    Topic Gamma
    Topic Linnorm
    Topic Log
    Topic Negative
    Topic Parametric
    Topic Poisson
    Topic RNA-seq
    Topic RPKM
    Topic Raw
    Topic Simulate
    Topic Simulation
    Topic TPM
    Topic distribution
    Topic limma
    Topic normalization
    Topic transformation