Package sigqc. June 13, 2018

Similar documents
Package cattonum. R topics documented: May 2, Type Package Version Title Encode Categorical Features

Package bisect. April 16, 2018

Package ECctmc. May 1, 2018

Package SEMrushR. November 3, 2018

Package gggenes. R topics documented: November 7, Title Draw Gene Arrow Maps in 'ggplot2' Version 0.3.2

Package fastdummies. January 8, 2018

Package ggimage. R topics documented: December 5, Title Use Image in 'ggplot2' Version 0.1.0

Package mmpa. March 22, 2017

Package ggimage. R topics documented: November 1, Title Use Image in 'ggplot2' Version 0.0.7

Package datasets.load

Package fusionclust. September 19, 2017

Package epitab. July 4, 2018

Package Numero. November 24, 2018

Package linkspotter. July 22, Type Package

Package meme. November 2, 2017

Package validara. October 19, 2017

Package canvasxpress

Package jdx. R topics documented: January 9, Type Package Title 'Java' Data Exchange for 'R' and 'rjava'

Package spark. July 21, 2017

Package kirby21.base

Package geojsonsf. R topics documented: January 11, Type Package Title GeoJSON to Simple Feature Converter Version 1.3.

Package meme. December 6, 2017

Package gridgraphics

Package weco. May 4, 2018

Package PCADSC. April 19, 2017

Package RNAseqNet. May 16, 2017

Package mgc. April 13, 2018

Package kdtools. April 26, 2018

Package BASiNET. October 2, 2018

Package NFP. November 21, 2016

Package gtrendsr. October 19, 2017

Package projector. February 27, 2018

Package farver. November 20, 2018

Package auctestr. November 13, 2017

Package pairsd3. R topics documented: August 29, Title D3 Scatterplot Matrices Version 0.1.0

Package IATScore. January 10, 2018

Package filematrix. R topics documented: February 27, Type Package

Package catenary. May 4, 2018

Package shinyfeedback

Package gtrendsr. August 4, 2018

Package gifti. February 1, 2018

Package splithalf. March 17, 2018

Package WordR. September 7, 2017

Package pwrab. R topics documented: June 6, Type Package Title Power Analysis for AB Testing Version 0.1.0

Package barcoder. October 26, 2018

Package areaplot. October 18, 2017

Package modmarg. R topics documented:

Package infer. July 11, Type Package Title Tidy Statistical Inference Version 0.3.0

Package vinereg. August 10, 2018

Package SCORPIUS. June 29, 2018

Package EFS. R topics documented:

Package clusternomics

Package optimus. March 24, 2017

Package calpassapi. August 25, 2018

Package facerec. May 14, 2018

Package autocogs. September 22, Title Automatic Cognostic Summaries Version 0.1.1

Package balance. October 12, 2018

Package ClusterBootstrap

Package estprod. May 2, 2018

Package milr. June 8, 2017

Package data.world. April 5, 2018

Package TrafficBDE. March 1, 2018

Package EvolutionaryGames

Package PDN. November 4, 2017

Package omicplotr. March 15, 2019

Package rzeit2. January 7, 2019

Package aop. December 5, 2016

Package editdata. October 7, 2017

Package fcr. March 13, 2018

Package ClusterSignificance

Package TANDEM. R topics documented: June 15, Type Package

Package rmi. R topics documented: August 2, Title Mutual Information Estimators Version Author Isaac Michaud [cre, aut]

Package EMDomics. November 11, 2018

Package dyncorr. R topics documented: December 10, Title Dynamic Correlation Package Version Date

Package tabulizer. June 7, 2018

Package RTNduals. R topics documented: March 7, Type Package

Package phantasus. March 8, 2019

Package TMixClust. July 13, 2018

Package textrank. December 18, 2017

Package quickreg. R topics documented:

Package qualmap. R topics documented: September 12, Type Package

Package queuecomputer

Package pcr. November 20, 2017

Package biosigner. March 6, 2019

Package effectr. January 17, 2018

Package docxtools. July 6, 2018

Package semisup. March 10, Version Title Semi-Supervised Mixture Model

Package pdfsearch. July 10, 2018

Package hypercube. December 15, 2017

Package essurvey. August 23, 2018

Package widyr. August 14, 2017

Package mapsapi. March 12, 2018

Package metaheur. June 30, 2016

Package robotstxt. November 12, 2017

Package cregulome. September 13, 2018

Package internetarchive

Package GFD. January 4, 2018

Package tiler. June 9, 2018

Package zebu. R topics documented: October 24, 2017

Package multihiccompare

Transcription:

Title Quality Control Metrics for Gene Signatures Version 0.1.20 Package sigqc June 13, 2018 Description Provides gene signature quality control metrics in publication ready plots. Namely, enables the visualization of properties such as expression, variability, correlation, and comparison of methods of standardisation and scoring metrics. Depends R (>= 3.3.0) License file LICENSE License_restricts_use yes Encoding UTF-8 LazyData true RoxygenNote 6.0.1 Imports MASS, lattice, KernSmooth, cluster, nnet, class, gridgraphics, biclust, gplots, ComplexHeatmap, RankProd, fmsb, moments, grdevices, graphics, stats, utils, mclust, GSVA, circlize Suggests knitr, rmarkdown VignetteBuilder knitr NeedsCompilation no Author Andrew Dhawan [aut], Alessandro Barberis [aut], Wei-Chen Cheng [aut], Francesca Buffa [aut, cre] Maintainer Francesca Buffa <francesca.buffa@oncology.ox.ac.uk> Repository CRAN Date/Publication 2018-06-13 12:43:24 UTC R topics documented: make_all_plots....................................... 2 Index 5 1

2 make_all_plots make_all_plots make_all_plots.r Description Usage Makes all the plots for the quality control of the list(s) of genes in a specified output directory (out_dir). Plots (PDFs) are made for all combinations of gene expression datasets and gene signatures inputted. For the purposes of this protocol, gene signatures are defined as sets of genes for which there is a coherent pattern of expression, in conjunction with a biological process or clinical outcome. The methodology is based on the sigqc protocol defined in the manuscript by Dhawan et al. at: https://www.biorxiv.org/content/early/2017/11/13/203729. make_all_plots(gene_sigs_list, mrna_expr_matrix, names_sigs = NULL, names_datasets = NULL, covariates = NULL, thresholds = NULL, out_dir = tempdir(), showresults = TRUE, origin = NULL, donegativecontrol = TRUE, numresampling = 50) Arguments gene_sigs_list A list object, for the gene signatures. The name reference for each list element should correspond to the name of the gene signature. This list consists of k-by-1 character matrices of k gene names, which comprise the gene signature. Genes must be annotated in the same manner as the rows of the data matrix; at least one gene name must be present in the rownames of the gene expression matrices for the signature to be evaluated on that dataset. mrna_expr_matrix A list of matrices of expression values for the datasets to be considered, which must contain at least 2 samples per dataset. One numeric matrix entry per dataset. Name reference of each list entry should correspond to the name of the dataset. The rows are to be labelled as the genes, all annotated in the same way, and columns are sample IDs. Expression values should be normalised, batchcorrected, standardised, and log-transformed if needed, prior to use in sigqc. We recommend normalisation, batch correction, and log-transformation prior to use. Care must be taken to remove samples displaying a high proportion of NA values, especially for signature genes. names_sigs The names of the gene signatures (e.g. Hypoxia, Invasiveness), one name per each signature in gene_sigs_list. Corresponds to the names of the entries of the list. names_datasets The names of the different datasets contained in mrna_expr_matrix. Corresponds to the names of the entries of the list. covariates A list containing a sub-list of annotations and colors which contains the annotation matrix for the given dataset and the associated colours with which to plot in the expression heatmap. This is in the same form as used by the ComplexHeatmap package. One sub-list per dataset is used, referenced by the same name as given by the dataset in the mrna_expr_matrix list.

make_all_plots 3 thresholds out_dir showresults A list of expression thresholds to be considered for each data set, default is median of the data set. A gene is considered expressed if above the threshold, non-expressed otherwise. One threshold per dataset, in the same order as the dataset list. Note that this is only used for the reporting of the genes showing supra-threshold expression across each dataset. Genes are not removed from computation based on expression; but proportion above this threshold is reported to the user. This is defaulted to the median level of all genes across all samples for a given dataset. A path to the directory where the resulting output files are written. Default is R temporary directory, given by tempdir(). Tells if R should open plot windows showing the computed results. Default is TRUE. Regardless of value, all plots are saved to PDF files in the output directory. origin Tells if datasets have come from different labs/experiments/machines. This is a vector of characters, with same character representing same origin. Default is assumption that all datasets come from the same source. Used in the correction of batch effects during the RankProduct computation for poorly auto-correlated signature genes. Only to be used if multiple datasets are present. donegativecontrol Logical, tells the function if negative and permutation controls should be computed. TRUE by default. Note that depending on the number of resamplings, setting this parameter to TRUE may result in much longer runtimes. Negative controls in this context refers to resampling based on random genes selected with the same length as the gene signatures in question. Permutation controls are generated by considering the same genes as each signature in each dataset, but with labels of the genes permuted for each sample. numresampling Examples Number of bootstrap re-samplings of random gene signatures of the same length as the signature from which to compute null distribution of each metric, for each dataset and gene signature combination. This is the same value used for the nubmer of permutations of dataset values to consider in the permutation controls as described above, where in each dataset the labels of the signature genes are permuted for each sample. The default value for this parameter is set to 50. library(sigqc) names = c("dataset1") data.matrix = replicate(10, rnorm(50))#random matrix - 50 genes x 10 samples mrna_expr_matrix = list() mrna_expr_matrix[["dataset1"]] = data.matrix row.names(mrna_expr_matrix$dataset1) <- as.character(1:(dim(mrna_expr_matrix$dataset1)[1])) colnames(mrna_expr_matrix$dataset1) <- as.character(1:(dim(mrna_expr_matrix$dataset1)[2])) #Define the signature gene_sigs_list = list() signature = "hypoxiasig" gene_sig = c('1', '4', '5')#gene ids gene_sigs_list[[signature]] = as.matrix(gene_sig) names_sigs = c(signature) make_all_plots(gene_sigs_list = gene_sigs_list, mrna_expr_matrix = mrna_expr_matrix,

4 make_all_plots donegativecontrol=false, out_dir = tempdir(), showresults=false)

Index Topic make_all_plots make_all_plots, 2 make_all_plots, 2 5