Main Results. Kevin R. Coombes. 10 September 2011


Contents
1 Executive Summary
1.1 Introduction
1.1.1 Aims/Objectives
1.2 Methods
1.2.1 Description of the Data
1.2.2 Statistical Methods
1.3 Results
1.4 Conclusions
2 Details
3 Deletion 11q
4 Deletion 13q
5 Deletion 17p
6 Trisomy 12
7 Appendix

1 Executive Summary

1.1 Introduction
This report describes the analysis of a data set from Carmen Schweighofer, a member of the laboratory of Lynne V. Abruzzo. This dataset was acquired using Illumina 610 SNP chips.

1.1.1 Aims/Objectives
To see how well our new algorithm works to identify clonal (or fractional) copy number change.
To see how well the SNP calls match the FISH data for the standard set of abnormalities.

108-clones

1.2 Methods
1.2.1 Description of the Data
1.2.2 Statistical Methods
1.3 Results
1.4 Conclusions

2 Details
Load the segmented data.
> source("00rnw/snp-utils.r")
> load("rewind.rda")
Load some clinical data, and give simpler names to the FISH results.
> load("o:/private/abruzzo/finaldatacopies/readytogo.rda")
> fish12 <- clin$x..positive.cells.for.trisomy.12
> fish13 <- clin$x..positive.cells.for.del13q14.3..d13s319.
> fish17 <- clin$x..positive.cells.for.del.17p13.1..p53.
> fish11 <- clin$x..positive.cells.for.del.11q22.3..atm.
> fish13lamp <- clin$x..positive.cells.for.del.13q34.lamp1.
> names(fish12) <- names(fish11) <- names(fish13) <- names(fish13lamp) <- names(fish17) <- rown
hex code:
> hexagon <- function(x, y, radius, aspect = 1, ...) {
+     circ <- seq(0, 2 * pi, length = 7)
+     X <- x + radius * cos(circ)
+     Y <- y + radius * aspect * sin(circ)
+     polygon(X, Y, ...)
+ }
> loser <- function(y) {
+     list(beta = (1 - y)/(2 - y), lambda = log10((2 - y)/2))
+ }
> lossf <- function(beta) -log10(2 * (1 - beta))
> lossp <- function(beta) (1 - 2 * beta)/(1 - beta)
> winner <- function(z) {
+     list(beta = 1/(2 + z), lambda = -log10((2 + z)/2))
+ }
> gainf <- function(beta) -log10(2 * beta)
Recall how we got here:
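The helper functions above are consistent with a simple two-population mixture model. The derivation below is only a sketch. It assumes (the report does not say so explicitly) that a fraction y of the cells has lost one copy at a heterozygous locus, that a fraction z has gained one copy, that beta is the lower BAF band, and that the LRR (lambda) is the base-10 logarithm of the average copy number relative to two copies.

```latex
\begin{align*}
\text{Loss in a fraction } y \text{ of cells:}\quad
  \text{average copies} &= 2(1-y) + y = 2 - y, \\
  \beta &= \frac{1-y}{2-y}, \qquad
  \lambda = \log_{10}\frac{2-y}{2}, \\
  1-\beta &= \frac{1}{2-y}
  \;\Longrightarrow\;
  \lambda = -\log_{10}\bigl(2(1-\beta)\bigr), \qquad
  y = \frac{1-2\beta}{1-\beta}. \\[6pt]
\text{Gain in a fraction } z \text{ of cells:}\quad
  \text{average copies} &= 2 + z, \\
  \beta &= \frac{1}{2+z}
  \;\Longrightarrow\;
  \log_{10}\frac{2+z}{2} = -\log_{10}(2\beta).
\end{align*}
```

Under these assumptions the first group of identities corresponds to loser, lossf, and lossp, and the last identity corresponds to gainf.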

Now we handle a very special case: samples with copy number two but four BAF bands. These always represent a mixture of two (clonal) cell populations. If they are actually copy number two, then it is a mixture of normal cells and a clone with LOH. However, a mixture of normal cells and a clone with a single-copy loss can look similar, but with a (slightly) negative LRR. If we know the percentage of cells with a single-copy loss, we can compute the expected value of the BAF component and of the LRR. Figure 1 shows the theoretical curve (in red) and the boundary between the fractional loss and fractional LOH components.
> bb <- seq(0, 0.5, length = 300)
> ll <- -log10(2 * (1 - bb))
> gg <- -log10(2 * bb)
> w0 <- rewind$Call == "N02.UnbalHet"
> w0 <- rewind$Call != "N02.BalHet"
> w0 <- rewind$nbafcomp == 3
> w0 <- rewind$nbafcomp == 4 & !is.na(rewind$nbafcomp)
> special <- rewind
> special <- rewind[w0, ]
> bet <- special$Mix
> lam <- special$seg.median
> l2 <- -log10(2 * (1 - bet))/2
> g2 <- -log10(2 * bet)/2
> bog <- data.frame(special[, c("chrom", "SamID", "seg.median", "Mix",
+     "Call")], l2 = l2, clonal = lam < l2 & lam > -0.15, g2 = g2)
> showme <- function(position, logrr, bafr, sam, why, chr, start, end,
+     gene) {
+     chunk <- which(rewind$chrom == chr & rewind$SamID == sam & rewind$loc.start <=
+         end + M & rewind$loc.end >= start - M)
+     ptcol <- "#888888"
+     genecol <- "purple"
+     opar <- par(mfrow = c(2, 1))
+     top <- max(c(0, logrr), na.rm = TRUE)
+     n <- mean(logrr < -1, na.rm = TRUE)
+     bot <- ifelse(n > 0.05, min(logrr, na.rm = TRUE), -1)
+     plot(position, logrr, ylim = c(bot, top), main = paste(sam, "; Chr ",
+         chr, sep = ""), xlab = "Base position", ylab = "Log R Ratio",
+         pch = 16, col = ptcol)
+     abline(h = log10(1/2), col = "orange")
+     abline(h = log10(2/2), col = "blue")
+     abline(h = log10(3/2), col = "green")
+     for (i in chunk) {

[Plot; axes: BAF Component, Log R Ratio]
Figure 1: Relation between B allele frequency and log R ratio for segments initially called N02.UnbalHet. Blue points retain this call; pink points are re-called as N01.UnbalHet, representing a fractional loss. The red curve is the theoretical location of the (BAF, LRR) value for fractional losses. Vertical green bars mark the locations corresponding (from right to left) to losses in 5, 10, 15, and 20% of the cells.
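As a rough illustration of the theoretical curve in Figure 1, the sketch below recomputes the expected (BAF, LRR) pair for single-copy losses in 5, 10, 15, and 20% of the cells, using the same formulas as the report's loser, lossf, and lossp helpers. The names expected_loss and is_fractional_loss and the toy segment values are hypothetical. The halfway boundary and the -0.15 floor mirror the clonal column computed for bog, but this is not necessarily the exact re-calling rule used in the analysis.

```r
# Expected BAF and LRR when a fraction y of the cells carries a
# single-copy loss (same formulas as the report's loser()).
expected_loss <- function(y) {
    data.frame(fraction = y,
               beta = (1 - y)/(2 - y),
               lambda = log10((2 - y)/2))
}

# Theoretical LRR on the loss curve for a given BAF (the report's lossf()).
lossf <- function(beta) -log10(2 * (1 - beta))

# Fraction of cells with the loss, recovered from the BAF (the report's lossp()).
lossp <- function(beta) (1 - 2 * beta)/(1 - beta)

# Hypothetical helper: treat a segment as a fractional loss when its median
# LRR lies below the midpoint between 0 (pure LOH) and the loss curve,
# while staying above a floor of -0.15.
is_fractional_loss <- function(beta, lrr, lrr_floor = -0.15) {
    boundary <- lossf(beta)/2
    lrr < boundary & lrr > lrr_floor
}

# Points on the theoretical curve for losses in 5, 10, 15, and 20% of cells.
curve_points <- expected_loss(c(0.05, 0.10, 0.15, 0.20))
print(round(curve_points, 4))

# Toy segments (made-up values, not taken from the data set).
toy <- data.frame(beta = c(0.45, 0.45), lrr = c(-0.002, -0.03))
toy$clonal <- is_fractional_loss(toy$beta, toy$lrr)
print(toy)
```

Because lossp inverts the BAF formula, applying it to the beta column returns the original fractions, which gives a quick consistency check on the three helpers.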

+         me <- rewind[i, "seg.median"]
+         lines(c(rewind[i, "loc.start"], rewind[i, "loc.end"]), c(me,
+             me), lwd = 3)
+     }
+     abline(v = c(start, end), col = genecol)
+     mtext(gene, side = 3, at = (start + end)/2, col = genecol, line = 0.5)
+     plot(position, bafr, main = why, xlab = "Base position", ylab = "B Allele Frequency",
+         pch = 16, col = ptcol)
+     abline(v = c(start, end), col = genecol)
+     abline(h = 0.5)
+     mtext(gene, side = 3, at = (start + end)/2, col = genecol, line = 0.5)
+     par(opar)
+ }

3 Deletion 11q
> M <- 1e+06
> st11 <
> en11 <
> w <- which(special$chrom == 11 & special$loc.start < en11 & special$loc.end >
+     st11)
> length(w)
[1] 5
> who <- as.character(special$SamID)[w]
> table(SNP = clin$stat11, FISH = fish11 > 0)
          FISH
SNP        FALSE TRUE
  Abnormal     2   16
  Normal      94    3
> w1 <- names(which(clin$stat11 == "Abnormal" & fish11 == 0))
> w2 <- names(which(clin$stat11 == "Normal" & fish11 > 0))
> if (!file.exists("usualsuspects")) dir.create("usualsuspects")
> seeme <- unique(c(who, w1, w2))
> for (sam in seeme) {
+     why <- ""
+     if (sam %in% w1)
+         why <- paste(why, "SNP+, FISH-")
+     if (sam %in% w2)
+         why <- paste(why, "SNP-, FISH+")

+     if (sam %in% who)
+         why <- paste(why, "clonal?")
+     lsd <- loadsnpdata(sam, "11")
+     posn <- lsd$position
+     lrr <- lsd$log.r.ratio
+     baf <- lsd$b.allele.freq
+     locus <- posn >= st11 - M & posn <= en11 + M
+     png(file = file.path("usualsuspects", paste("chr11-", sam, ".png",
+         sep = "")), height = 480, width = 960, bg = "white")
+     par(bg = "white")
+     showme(posn[locus], lrr[locus], baf[locus], sam, why, 11, st11,
+         en11, "ATM")
+     dev.off()
+ }
> data.frame(bog[w, ], fish11 = fish11[who], pcell = round(100 * lossp(bog[w,
+     "Mix"]), 2))
chrom SamID seg.median Mix Call l2 clonal g2
CL N02.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet TRUE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
fish11 pcell
NA NA
> ww <- (rewind$chrom == 11 & rewind$loc.start < en11 & rewind$loc.end >
+     st11)
> summary(ww)
   Mode   FALSE    TRUE    NA's
logical

4 Deletion 13q
> st13 <
> en13 <
> w <- which(special$chrom == 13 & special$loc.start < en13 & special$loc.end >

+     st13)
> length(w)
[1] 14
> who <- as.character(special$SamID)[w]
> table(SNP = clin$stat13, FISH = fish13 > 0)
          FISH
SNP        FALSE TRUE
  Abnormal     6   50
  Normal      53    6
> w1 <- names(which(clin$stat13 == "Abnormal" & fish13 == 0))
> w1 <- w1[-which(w1 == "CL113")]
> w2 <- names(which(clin$stat13 == "Normal" & fish13 > 0))
> if (!file.exists("usualsuspects")) dir.create("usualsuspects")
> seeme <- unique(c(who, w1, w2))
> for (sam in seeme) {
+     why <- ""
+     if (sam %in% w1)
+         why <- paste(why, "SNP+, FISH-")
+     if (sam %in% w2)
+         why <- paste(why, "SNP-, FISH+")
+     if (sam %in% who)
+         why <- paste(why, "clonal?")
+     lsd <- loadsnpdata(sam, "13")
+     posn <- lsd$position
+     lrr <- lsd$log.r.ratio
+     baf <- lsd$b.allele.freq
+     locus <- posn >= st13 - M & posn <= en13 + M
+     png(file = file.path("usualsuspects", paste("chr13-", sam, ".png",
+         sep = "")), height = 480, width = 960, bg = "white")
+     par(bg = "white")
+     showme(posn[locus], lrr[locus], baf[locus], sam, why, 13, st13,
+         en13, "MIR16-1 through DLEU7")
+     dev.off()
+ }
> data.frame(bog[w, ], fish13 = fish13[who], pcell = round(100 * lossp(bog[w,
+     "Mix"]), 2))
chrom SamID seg.median Mix Call l2 clonal g2
CL N01.UnbalHet TRUE

CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet TRUE
CL N01.UnbalHet TRUE
CL N01.UnbalHet TRUE
CLZ N01.UnbalHet FALSE
CLZ N01.UnbalHet TRUE
fish13 pcell
NA NA NA NA NA NA NA
> ww <- (rewind$chrom == 13 & rewind$loc.start < en13 & rewind$loc.end >
+     st13)
> summary(ww)
   Mode   FALSE    TRUE    NA's
logical

5 Deletion 17p
> st17 <
> en17 <
> w <- which(special$chrom == 17 & special$loc.start < en17 & special$loc.end >
+     st17)
> length(w)

[1] 7
> who <- as.character(special$SamID)[w]
> table(SNP = clin$stat17, FISH = fish17 > 0)
          FISH
SNP        FALSE TRUE
  Abnormal     1    6
  Normal
> w1 <- names(which(clin$stat17 == "Abnormal" & fish17 == 0))
> w2 <- names(which(clin$stat17 == "Normal" & fish17 > 0))
> if (!file.exists("usualsuspects")) dir.create("usualsuspects")
> seeme <- unique(c(who, w1, w2))
> for (sam in seeme) {
+     why <- ""
+     if (sam %in% w1)
+         why <- paste(why, "SNP+, FISH-")
+     if (sam %in% w2)
+         why <- paste(why, "SNP-, FISH+")
+     if (sam %in% who)
+         why <- paste(why, "clonal?")
+     lsd <- loadsnpdata(sam, "17")
+     posn <- lsd$position
+     lrr <- lsd$log.r.ratio
+     baf <- lsd$b.allele.freq
+     locus <- posn >= st17 - M & posn <= en17 + M
+     png(file = file.path("usualsuspects", paste("chr17-", sam, ".png",
+         sep = "")), height = 480, width = 960, bg = "white")
+     par(bg = "white")
+     showme(posn[locus], lrr[locus], baf[locus], sam, why, 17, st17,
+         en17, "TP53")
+     dev.off()
+ }
> data.frame(bog[w, ], fish17 = fish17[who], pcell = round(100 * lossp(bog[w,
+     "Mix"]), 2))
chrom SamID seg.median Mix Call l2 clonal g2
CL N01.UnbalHet TRUE
CL N02.UnbalHet FALSE
CL N02.UnbalHet FALSE
CL N01.UnbalHet TRUE
CL N01.UnbalHet TRUE

CL N01.UnbalHet TRUE
CL N01.UnbalHet TRUE
fish17 pcell
NA
> ww <- (rewind$chrom == 17 & rewind$loc.start < en17 & rewind$loc.end >
+     st17)
> summary(ww)
   Mode   FALSE    TRUE    NA's
logical

6 Trisomy 12
> st12 <- M
> en12 < M
> w <- which(special$chrom == 12 & special$loc.start < en12 & special$loc.end >
+     st12)
> length(w)
[1] 550
> who <- as.character(special$SamID)[w]
> table(SNP = clin$stat12, FISH = fish12 > 0)
          FISH
SNP        FALSE TRUE
  Abnormal     2   21
  Normal      91    1
> w1 <- names(which(clin$stat12 == "Abnormal" & fish12 == 0))
> w2 <- names(which(clin$stat12 == "Normal" & fish12 > 0))
> if (!file.exists("usualsuspects")) dir.create("usualsuspects")
> seeme <- unique(c(w1, w2))
> for (sam in seeme) {
+     why <- ""
+     if (sam %in% w1)

+         why <- paste(why, "SNP+, FISH-")
+     if (sam %in% w2)
+         why <- paste(why, "SNP-, FISH+")
+     lsd <- loadsnpdata(sam, "12")
+     posn <- lsd$position
+     lrr <- lsd$log.r.ratio
+     baf <- lsd$b.allele.freq
+     locus <- posn >= st12 - M & posn <= en12 + M
+     png(file = file.path("usualsuspects", paste("chr12-", sam, ".png",
+         sep = "")), height = 480, width = 960, bg = "white")
+     par(bg = "white")
+     showme(posn[locus], lrr[locus], baf[locus], sam, why, 12, st12,
+         en12, "")
+     dev.off()
+ }
> ww <- (rewind$chrom == 12 & rewind$loc.start < en12 & rewind$loc.end >
+     st12)
> summary(ww)
   Mode   FALSE    TRUE    NA's
logical
> w <- which(special$chrom == 13 & special$loc.start < en13 & special$loc.end >
+     st13)
> length(w)
[1] 14
> who <- as.character(special$SamID)[w]
> data.frame(bog[w, ], fish13 = fish13[who])
chrom SamID seg.median Mix Call l2 clonal g2
CL N01.UnbalHet TRUE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet FALSE
CL N01.UnbalHet TRUE
CL N01.UnbalHet TRUE

CL N01.UnbalHet TRUE
CLZ N01.UnbalHet FALSE
CLZ N01.UnbalHet TRUE
fish13
NA NA NA NA NA NA NA
> ww <- (rewind$chrom == 13 & rewind$loc.start < en13 & rewind$loc.end >
+     st13)
> summary(ww)
   Mode   FALSE    TRUE    NA's
logical
> chug <- fish13 > 0
> table(SNP = clin$stat12, FISH = fish12 > 0)
          FISH
SNP        FALSE TRUE
  Abnormal     2   21
  Normal      91    1
> w <- which(special$chrom == 12 & special$loc.start < en12 & special$loc.end >
+     st12)
> length(w)
[1] 550
> who <- as.character(special$SamID)[w]
> ww <- (rewind$chrom == 12 & rewind$loc.start < en12 & rewind$loc.end >
+     st12)
> summary(ww)

   Mode   FALSE    TRUE    NA's
logical
> plot(c(0, 0.5), c(-0.3, log10(8/2)), type = "n")
> a <- 0.9/0.5
> r <
> hexagon(0.5, 0, r, a, col = "gray")
> hexagon(0, , r, a, col = "gray")
> for (i in 3:8) hexagon(1/i, log10(i/2), r, a, col = "gray")
> lines(bb, gg, col = "green", lwd = 2)
> lines(bb, gg/2, col = "orange", lwd = 2)
> lines(bb, ll, col = "red", lwd = 2)
> abline(h = 0, col = "blue", lwd = 2)
> lines(bb, ll/2, col = "purple", lwd = 2)
> perc <- seq(0, 1, by = 0.1)
> bl <- (1 - perc)/(2 - perc)
> points(bl, lossf(bl), pch = 16, col = "black")
> bg <- 1/(2 + perc)
> points(bg, gainf(bg), pch = 16, col = "black")
> points(perc/2, 0 * perc, pch = 16, col = "black")
> lossf(0.1)
[1]
> whatever <- lam < l2 & lam >

7 Appendix
This analysis was run in the following directory:
> getwd()
[1] "c:/snp-cll/analysis03"
Note that \\mdadqsfs02\bioinfo2 is the standard institutional location for storing data and analyses; O: is the name given to that location on this machine.
This analysis was run in the following software environment:
> sessionInfo()
R version ( )
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
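To put a number on the second aim (agreement between SNP calls and FISH), the sketch below recomputes overall concordance from the SNP-by-FISH cross-tabulations printed in Sections 3, 4, and 6. The counts are copied from those tables. The names make_tab and concordance are hypothetical, and the 17p table is left out because its Normal row is incomplete here. This is a simple summary, not a statistic reported by the author.

```r
# Build a 2 x 2 SNP-by-FISH table from counts listed row by row.
make_tab <- function(counts) {
    matrix(counts, nrow = 2, byrow = TRUE,
           dimnames = list(SNP = c("Abnormal", "Normal"),
                           FISH = c("FALSE", "TRUE")))
}

# Counts as printed in the report (rows: Abnormal, Normal; columns: FALSE, TRUE).
tabs <- list(
    del11q = make_tab(c(2, 16, 94, 3)),
    del13q = make_tab(c(6, 50, 53, 6)),
    tri12  = make_tab(c(2, 21, 91, 1))
)

# Hypothetical helper: fraction of samples on which SNP and FISH agree
# (SNP abnormal with FISH positive, or SNP normal with FISH negative).
concordance <- function(tab) {
    (tab["Abnormal", "TRUE"] + tab["Normal", "FALSE"]) / sum(tab)
}

print(round(sapply(tabs, concordance), 3))
```

The off-diagonal cells of each table correspond to the w1 (SNP+, FISH-) and w2 (SNP-, FISH+) samples that the loops above plot for manual review.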


More information

Package evmix. February 9, 2018

Package evmix. February 9, 2018 Package evmix February 9, 2018 Title Extreme Value Mixture Modelling, Threshold Estimation and Boundary Corrected Kernel Density Estimation Version 2.9 Date 2018-02-08 Author Carl Scarrott, Yang Hu and

More information

USER S MANUAL FOR THE AMaCAID PROGRAM

USER S MANUAL FOR THE AMaCAID PROGRAM USER S MANUAL FOR THE AMaCAID PROGRAM TABLE OF CONTENTS Introduction How to download and install R Folder Data The three AMaCAID models - Model 1 - Model 2 - Model 3 - Processing times Changing directory

More information

Exploring cdna Data. Achim Tresch, Andreas Buness, Tim Beißbarth, Wolfgang Huber

Exploring cdna Data. Achim Tresch, Andreas Buness, Tim Beißbarth, Wolfgang Huber Exploring cdna Data Achim Tresch, Andreas Buness, Tim Beißbarth, Wolfgang Huber Practical DNA Microarray Analysis http://compdiag.molgen.mpg.de/ngfn/pma0nov.shtml The following exercise will guide you

More information

Statistical Software Camp: Introduction to R

Statistical Software Camp: Introduction to R Statistical Software Camp: Introduction to R Day 1 August 24, 2009 1 Introduction 1.1 Why Use R? ˆ Widely-used (ever-increasingly so in political science) ˆ Free ˆ Power and flexibility ˆ Graphical capabilities

More information

Calibration of Quinine Fluorescence Emission Vignette for the Data Set flu of the R package hyperspec

Calibration of Quinine Fluorescence Emission Vignette for the Data Set flu of the R package hyperspec Calibration of Quinine Fluorescence Emission Vignette for the Data Set flu of the R package hyperspec Claudia Beleites CENMAT and DI3, University of Trieste Spectroscopy Imaging,

More information

Exploring cdna Data. Achim Tresch, Andreas Buness, Wolfgang Huber, Tim Beißbarth

Exploring cdna Data. Achim Tresch, Andreas Buness, Wolfgang Huber, Tim Beißbarth Exploring cdna Data Achim Tresch, Andreas Buness, Wolfgang Huber, Tim Beißbarth Practical DNA Microarray Analysis http://compdiag.molgen.mpg.de/ngfn/pma0nov.shtml The following exercise will guide you

More information

Package icnv. R topics documented: March 8, Title Integrated Copy Number Variation detection Version Author Zilu Zhou, Nancy Zhang

Package icnv. R topics documented: March 8, Title Integrated Copy Number Variation detection Version Author Zilu Zhou, Nancy Zhang Title Integrated Copy Number Variation detection Version 1.2.1 Author Zilu Zhou, Nancy Zhang Package icnv March 8, 2019 Maintainer Zilu Zhou Integrative copy number variation

More information

Canadian Bioinforma,cs Workshops.

Canadian Bioinforma,cs Workshops. Canadian Bioinforma,cs Workshops www.bioinforma,cs.ca Module #: Title of Module 2 Modified from Richard De Borja, Cindy Yao and Florence Cavalli R Review Objectives To review the basic commands in R To

More information

Display Lists in grid

Display Lists in grid Display Lists in grid Paul Murrell April 13, 2004 A display list is a record of drawing operations. It is used to redraw graphics output when a graphics window is resized, when graphics output is copied

More information

Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice

Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice 2.b Step-by-step network construction and module detection Peter Langfelder and Steve

More information

Package visualizationtools

Package visualizationtools Package visualizationtools April 12, 2011 Type Package Title Package contains a few functions to visualize statistical circumstances. Version 0.2 Date 2011-04-06 Author Thomas Roth Etienne Stockhausen

More information

The first one will centre the data and ensure unit variance (i.e. sphere the data):

The first one will centre the data and ensure unit variance (i.e. sphere the data): SECTION 5 Exploratory Projection Pursuit We are now going to look at an exploratory tool called projection pursuit (Jerome Friedman, PROJECTION PURSUIT METHODS FOR DATA ANALYSIS, June. 1980, SLAC PUB-2768)

More information

3. Probability 51. probability A numerical value between 0 and 1 assigned to an event to indicate how often the event occurs (in the long run).

3. Probability 51. probability A numerical value between 0 and 1 assigned to an event to indicate how often the event occurs (in the long run). 3. Probability 51 3 Probability 3.1 Key Definitions and Ideas random process A repeatable process that has multiple unpredictable potential outcomes. Although we sometimes use language that suggests that

More information

Package BubbleTree. December 28, 2018

Package BubbleTree. December 28, 2018 Type Package Package BubbleTree December 28, 2018 Title BubbleTree: an intuitive visualization to elucidate tumoral aneuploidy and clonality in somatic mosaicism using next generation sequencing data Version

More information

Package EnQuireR. R topics documented: February 19, Type Package Title A package dedicated to questionnaires Version 0.

Package EnQuireR. R topics documented: February 19, Type Package Title A package dedicated to questionnaires Version 0. Type Package Title A package dedicated to questionnaires Version 0.10 Date 2009-06-10 Package EnQuireR February 19, 2015 Author Fournier Gwenaelle, Cadoret Marine, Fournier Olivier, Le Poder Francois,

More information

Bioinformatics - Homework 1 Q&A style

Bioinformatics - Homework 1 Q&A style Bioinformatics - Homework 1 Q&A style Instructions: in this assignment you will test your understanding of basic GWAS concepts and GenABEL functions. The materials needed for the homework (two datasets

More information

Package DPBBM. September 29, 2016

Package DPBBM. September 29, 2016 Type Package Title Dirichlet Process Beta-Binomial Mixture Version 0.2.5 Date 2016-09-21 Author Lin Zhang Package DPBBM September 29, 2016 Maintainer Lin Zhang Depends R (>= 3.1.0)

More information

Introduction to R. Biostatistics 615/815 Lecture 23

Introduction to R. Biostatistics 615/815 Lecture 23 Introduction to R Biostatistics 615/815 Lecture 23 So far We have been working with C Strongly typed language Variable and function types set explicitly Functional language Programs are a collection of

More information

Setup and analysis using a publicly available MLST scheme

Setup and analysis using a publicly available MLST scheme BioNumerics Tutorial: Setup and analysis using a publicly available MLST scheme 1 Introduction In this tutorial, we will illustrate the most common usage scenario of the MLST online plugin, i.e. when you

More information

Package SpecHelpers. July 26, 2017

Package SpecHelpers. July 26, 2017 Type Package Title Spectroscopy Related Utilities Version 0.2.7 Date 2017-07-26 Package SpecHelpers July 26, 2017 Author Bryan A. Hanson DePauw University, Greencastle Indiana USA Maintainer Bryan A. Hanson

More information

Working with ChIP-Seq Data in R/Bioconductor

Working with ChIP-Seq Data in R/Bioconductor Working with ChIP-Seq Data in R/Bioconductor Suraj Menon, Tom Carroll, Shamith Samarajiwa September 3, 2014 Contents 1 Introduction 1 2 Working with aligned data 1 2.1 Reading in data......................................

More information

Intersecting Frame (Photoshop)

Intersecting Frame (Photoshop) Intersecting Frame (Photoshop) Tip of the Week by Jen White on October 4, 2011 Sometimes you feel like a nut. Sometimes you don t. I ve got that little Almond Joy jingle stuck in my head! It was driving

More information

Bioconductor exercises 1. Exploring cdna data. June Wolfgang Huber and Andreas Buness

Bioconductor exercises 1. Exploring cdna data. June Wolfgang Huber and Andreas Buness Bioconductor exercises Exploring cdna data June 2004 Wolfgang Huber and Andreas Buness The following exercise will show you some possibilities to load data from spotted cdna microarrays into R, and to

More information

Package KEGGprofile. February 10, 2018

Package KEGGprofile. February 10, 2018 Type Package Package KEGGprofile February 10, 2018 Title An annotation and visualization package for multi-types and multi-groups expression data in KEGG pathway Version 1.20.0 Date 2017-10-30 Author Shilin

More information

8. Introduction to R Packages

8. Introduction to R Packages 8. Introduction to R Packages Ken Rice & David Reif University of Washington & North Carolina State University NYU Abu Dhabi, January 2019 In this session Base R includes pre-installed packages that allow

More information

Package TROM. August 29, 2016

Package TROM. August 29, 2016 Type Package Title Transcriptome Overlap Measure Version 1.2 Date 2016-08-29 Package TROM August 29, 2016 Author Jingyi Jessica Li, Wei Vivian Li Maintainer Jingyi Jessica

More information

ASAP - Allele-specific alignment pipeline

ASAP - Allele-specific alignment pipeline ASAP - Allele-specific alignment pipeline Jan 09, 2012 (1) ASAP - Quick Reference ASAP needs a working version of Perl and is run from the command line. Furthermore, Bowtie needs to be installed on your

More information

Advanced Graphics with R

Advanced Graphics with R Advanced Graphics with R Paul Murrell Universitat de Barcelona April 30 2009 Session overview: (i) Introduction Graphic formats: Overview and creating graphics in R Graphical parameters in R: par() Selected

More information

Package beanplot. R topics documented: February 19, Type Package

Package beanplot. R topics documented: February 19, Type Package Type Package Package beanplot February 19, 2015 Title Visualization via Beanplots (like Boxplot/Stripchart/Violin Plot) Version 1.2 Date 2014-09-15 Author Peter Kampstra Maintainer Peter Kampstra

More information

Chapter 5 An Introduction to Basic Plotting Tools

Chapter 5 An Introduction to Basic Plotting Tools Chapter 5 An Introduction to Basic Plotting Tools We have demonstrated the use of R tools for importing data, manipulating data, extracting subsets of data, and making simple calculations, such as mean,

More information

STATISTICAL LABORATORY, April 30th, 2010 BIVARIATE PROBABILITY DISTRIBUTIONS

STATISTICAL LABORATORY, April 30th, 2010 BIVARIATE PROBABILITY DISTRIBUTIONS STATISTICAL LABORATORY, April 3th, 21 BIVARIATE PROBABILITY DISTRIBUTIONS Mario Romanazzi 1 MULTINOMIAL DISTRIBUTION Ex1 Three players play 1 independent rounds of a game, and each player has probability

More information

Raman Spectra of Chondrocytes in Cartilage: hyperspec s chondro data set

Raman Spectra of Chondrocytes in Cartilage: hyperspec s chondro data set Raman Spectra of Chondrocytes in Cartilage: hyperspec s chondro data set Claudia Beleites DIA Raman Spectroscopy Group, University of Trieste/Italy (2 28) Spectroscopy Imaging,

More information

Contents. Introduction 2

Contents. Introduction 2 R code for The human immune system is robustly maintained in multiple stable equilibriums shaped by age and cohabitation Vasiliki Lagou, on behalf of co-authors 18 September 2015 Contents Introduction

More information

An Introduction to Some Graphics in Bioconductor

An Introduction to Some Graphics in Bioconductor n Introduction to ome raphics in ioconductor une 4, 2003 Introduction e first need to set up the basic data regarding the genome of interest. The chrom- ocation class describes the necessary components

More information

Disease prediction in the at-risk mental state for psychosis using neuroanatomical biomarkers: results from the FePsy-study. Supplementary material

Disease prediction in the at-risk mental state for psychosis using neuroanatomical biomarkers: results from the FePsy-study. Supplementary material Disease prediction in the at-risk mental state for psychosis using neuroanatomical biomarkers: results from the FePsy-study. Nikolaos Koutsouleris a,ca, MD; Stefan Borgwardt b, MD; Eva M. Meisenzahl, MD;

More information

Simulation studies of module preservation: Simulation study of weak module preservation

Simulation studies of module preservation: Simulation study of weak module preservation Simulation studies of module preservation: Simulation study of weak module preservation Peter Langfelder and Steve Horvath October 25, 2010 Contents 1 Overview 1 1.a Setting up the R session............................................

More information

Graph tool instructions and R code

Graph tool instructions and R code Graph tool instructions and R code 1) Prepare data: tab-delimited format Data need to be inputted in a tab-delimited format. This can be easily achieved by preparing the data in a spread sheet program

More information

Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice. 1. Data input and cleaning

Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice. 1. Data input and cleaning Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice 1. Data input and cleaning Peter Langfelder and Steve Horvath February 13, 2016 Contents

More information