MiChip. Jonathon Blake. October 30, Introduction 1. 5 Plotting Functions 3. 6 Normalization 3. 7 Writing Output Files 3

Similar documents
Microarray Data Analysis (V) Preprocessing (i): two-color spotted arrays

Organizing, cleaning, and normalizing (smoothing) cdna microarray data

PROCEDURE HELP PREPARED BY RYAN MURPHY

Course on Microarray Gene Expression Analysis

Package TilePlot. February 15, 2013

Package SCAN.UPC. October 9, Type Package. Title Single-channel array normalization (SCAN) and University Probability of expression Codes (UPC)

Package TilePlot. April 8, 2011

Introduction to the Bioconductor marray package : Input component

Exploring cdna Data. Achim Tresch, Andreas Buness, Tim Beißbarth, Florian Hahne, Wolfgang Huber. June 17, 2005

CARMAweb users guide version Johannes Rainer

Normalization: Bioconductor s marray package

Exploring cdna Data. Achim Tresch, Andreas Buness, Tim Beißbarth, Wolfgang Huber

Exploring cdna Data. Achim Tresch, Andreas Buness, Wolfgang Huber, Tim Beißbarth

Preprocessing -- examples in microarrays

Exploring cdna Data. Achim Tresch, Andreas Buness, Tim Beißbarth, Wolfgang Huber

Agilent CytoGenomics 2.0 Feature Extraction for CytoGenomics

TIGR ExpressConverter

Codelink Legacy: the old Codelink class

Bioconductor s stepnorm package

Methodology for spot quality evaluation

Introduction to the Codelink package

What does analyze.itraq( )?

Package agilp. R topics documented: January 22, 2018

Agilent Genomic Workbench Feature Extraction 10.10

GPR Analyzer version 1.23 User s Manual

Package frma. R topics documented: March 8, Version Date Title Frozen RMA and Barcode

All About PlexSet Technology Data Analysis in nsolver Software

Building R objects from ArrayExpress datasets

Agilent Feature Extraction Software (v10.5)

Analyse RT PCR data with ddct

Frozen Robust Multi-Array Analysis and the Gene Expression Barcode

Import GEO Experiment into Partek Genomics Suite

Automated Bioinformatics Analysis System on Chip ABASOC. version 1.1

Description of gcrma package

Package AffyExpress. October 3, 2013

The analysis of rtpcr data

Feature Extraction 11.5

ROTS: Reproducibility Optimized Test Statistic

sscore July 13, 2010 vector of data tuning constant (see details) fuzz value to avoid division by zero (see details)

Microarray Data Analysis (VI) Preprocessing (ii): High-density Oligonucleotide Arrays

Vector Xpression 3. Speed Tutorial: III. Creating a Script for Automating Normalization of Data

AB1700 Microarray Data Analysis

Package SCAN.UPC. July 17, 2018

Package PGSEA. R topics documented: May 4, Type Package Title Parametric Gene Set Enrichment Analysis Version 1.54.

Microarray Excel Hands-on Workshop Handout

MATH3880 Introduction to Statistics and DNA MATH5880 Statistics and DNA Practical Session Monday, 16 November pm BRAGG Cluster

Image Analysis begins with loading an image into GenePix Pro, and takes you through all the analysis steps required to extract data from the image.

How to store and visualize RNA-seq data

Applying Data-Driven Normalization Strategies for qpcr Data Using Bioconductor

Bioconductor exercises 1. Exploring cdna data. June Wolfgang Huber and Andreas Buness

Package sscore. R topics documented: June 27, Version Date

Package ChIPXpress. October 4, 2013

CLC Server. End User USER MANUAL

TECH NOTE Improving the Sensitivity of Ultra Low Input mrna Seq

Analysis of multi-channel cell-based screens

How do microarrays work

DREM. Dynamic Regulatory Events Miner (v1.0.9b) User Manual

Introduction to GE Microarray data analysis Practical Course MolBio 2012

RPPanalyzer (Version 1.0.3) Analyze reverse phase protein array data User s Guide

Package OLIN. September 30, 2018

Nature Publishing Group

The crosshybdetector Package

Release Notes. JMP Genomics. Version 4.0

From raw data to gene annotations

Introduction to Minitab 1

Preprocessing Affymetrix Data. Educational Materials 2005 R. Irizarry and R. Gentleman Modified 21 November, 2009, M. Morgan

Using FARMS for summarization Using I/NI-calls for gene filtering. Djork-Arné Clevert. Institute of Bioinformatics, Johannes Kepler University Linz

Package Risa. November 28, 2017

Package ffpe. October 1, 2018

Package INCATome. October 5, 2017

SHARPR (Systematic High-resolution Activation and Repression Profiling with Reporter-tiling) User Manual (v1.0.2)

Package affyplm. January 6, 2019

Exploring gene expression datasets

The Brigham and Women's Hospital, Inc. and President and Fellows of Harvard College

NacoStringQCPro. Dorothee Nickles, Thomas Sandmann, Robert Ziman, Richard Bourgon. Modified: April 10, Compiled: April 24, 2017.

Using metama for differential gene expression analysis from multiple studies

A manual for the use of mirvas

QAnalyzer 1.0c. User s Manual. Features

affyqcreport: A Package to Generate QC Reports for Affymetrix Array Data

STEM. Short Time-series Expression Miner (v1.1) User Manual

Using seqtools package

Analysis of (cdna) Microarray Data: Part I. Sources of Bias and Normalisation

Categorized software tools: (this page is being updated and links will be restored ASAP. Click on one of the menu links for more information)

Using oligonucleotide microarray reporter sequence information for preprocessing and quality assessment

Genome Environment Browser (GEB) user guide

Package NanoStringQCPro

Package RmiR. R topics documented: September 26, 2018

Introduction: microarray quality assessment with arrayqualitymetrics

Why use R? Getting started. Why not use R? Introduction to R: Log into tak. Start R R or. It s hard to use at first

End-to-end analysis of cell-based screens: from raw intensity readings to the annotated hit list (Short version)

End-to-end analysis of cell-based screens: from raw intensity readings to the annotated hit list (Short version)

Agilent Genomic Workbench 7.0

Using R for statistics and data analysis

MAGE-ML: MicroArray Gene Expression Markup Language

Package splicegear. R topics documented: December 4, Title splicegear Version Author Laurent Gautier

How to use the DEGseq Package

Textual Description of webbioc

Rsubread package: high-performance read alignment, quantification and mutation discovery

Expander 7.2 Online Documentation

Gene Survey: FAQ. Gene Survey: FAQ Tod Casasent DRAFT

Transcription:

MiChip Jonathon Blake October 30, 2018 Contents 1 Introduction 1 2 Reading the Hybridization Files 1 3 Removing Unwanted Rows and Correcting for Flags 2 4 Summarizing Intensities 3 5 Plotting Functions 3 6 Normalization 3 7 Writing Output Files 3 8 Combination of processes 5 1 Introduction MiChip is a microarray platform using locked oligonucleotides for the analysis of the expression of micrornas in a variety of species Castoldi et al. (2008). The MiChip library provides a set of functions for loading data from several MiChip hybridizations, flag correction, filtering and summarizing the data. The data is then packaged as a Bioconductor ExpressionSet object where it can easily be further analyzed with the Bioconductor toolset http://www.bioconductor.org/. First load the library. > library(michip) 2 Reading the Hybridization Files MiChip is scanned as a single colour cy3 hybridization and the output is gridded using Genepix software. To load the data from a set of MiChip hybridization Genepix files into bioconductor, use the parserawdata() method. > datadir <-system.file("extdata", package="michip") > defaultrawdata <- parserawdata(datadir) 2 files found Dealing with file Dealing with file B_676.gpr B_677.gpr 1

The defaults are current directory. And the gpr file extension. Loading data from a directory other than the current directory requires sending the directory to the method e.g. otherdirectorydata <-parserawdata(datadir= /mydemodir, pat = gpr ). All files in the directory with the matching extension will be parsed and combined into an ExpressionSet containing all features on the chip with the background subtracted intensity from the scanner and quality flags. All hybridizations in the directory should be of the same type otherwise an error will be thrown. 3 Removing Unwanted Rows and Correcting for Flags Due to the spotting configuration of MiChip and the probes supplied in the Exiqon probe library there are several data points which can be removed from the data set before analysis. Some of the spots on the chip are empty, others contain various controls and probes relating to micrornas from different species possibly not relevant to the analysis in at hand. To remove data from these points use the removeunwantedrows() method. This takes an array of strings and removes rows containing any of these strings in the gene name annotation of the data. Remove all empty spots from data set > noemptiesdataset <- removeunwantedrows(defaultrawdata, c("empty")) Raw Data 4608 Number of filters 1 Filtering special case empties Filtered Data 4504 Use the helper method to produce the standard set of data rows for human MiChip experiments > humandataset <- standardremoverows(defaultrawdata) Raw Data 4608 Number of filters 13 Filtering special case empties Filtered Data 1118 Flags for the MiChip hybridizations are 0 for passes and negative values for spots that are marked absent. Data points with flag values less than zero are set to NA using the correctforflags method. > flagcorrecteddataset <- correctforflags(humandataset) B_676 had 55 bad flags from 1118 data points B_677 had 54 bad flags from 1118 data points Positive but low intensities may lead to readings near background being taken as positive. Therefore an intensity cutoff can be sent to the correctforflags() to set all the intensities under a set value to NA. > flagcorrecteddataset <- correctforflags(humandataset, intensitycutoff = 50) B_676 had 55 bad flags and 77 values below cutoff from 1118 data points B_677 had 54 bad flags and 86 values below cutoff from 1118 data points 2

4 Summarizing Intensities The MiChip probes are spotted in either duplicate or quadruplicate on the array. The individual readings of the data can be combined to give a single intensity value. The combined intensity is taken as the median of the individual intensities, omitting NAs. A minimum length for the acceptable number of present values is supplied to prevent features with only a low number of positive calls being accepted. Summarized intensities where the median absolute deviation is greater than the median intensity can be set to NA on the grounds of being too variable. This is done by setting the madadjust argument to TRUE. > summeddata <- summarizeintensitiesasmedian(flagcorrecteddataset,minsumlength = 0, madadjust=fals 5 Plotting Functions MiChip contains two functions for plotting intensity data, both are wrappers for standard plotting functions. The data however produced is written to a file allowing intensity plots and box plots to be produced automatically. > plotintensitiesscatter(exprs(summeddata), NULL, "MiChipDemX", "SummarizedScatter") File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/michipdemx_summarizedscatter.jpg Figure 1 shows scatter plots of the intensites of the hybrdizations. > boxplotdata(exprs(summeddata), "MiChipDemX", "Summarized") File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/michipdemx_summarized.jpg used (Mb) gc trigger (Mb) max used (Mb) Ncells 652881 34.9 1263902 67.5 1263902 67.5 Vcells 1355193 10.4 8388608 64.0 4548125 34.7 Figure 2 shows boxplots of the summarized intensity data. 6 Normalization The major advantage of the MiChip library is to parse MiChip hybridization data sets into an ExpressionSet so that existing methods for normalization and hybridization within Bioconductor can be used. Median normalization per chip is implemented in the MiChip. > mednormeddata <- normalizeperchipmedian(summeddata) 7 Writing Output Files The outputannotateddatamatrix() method combines the annotation and expression data from an ExpressionSet. This produces a tab delimited file containing data annotation in the left hand columns and expression data in the right for distribution or analysis with other applications. > outputannotateddatamatrix(mednormeddata, "MiChipDemo", "mednormedintensity", "exprs") File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/michipdemo_mednormedintensity.tx 3

Figure 1: Scatterplots of pairwise intensies per hybridization 4

Figure 2: Boxplot of Summarized Intensity Data 8 Combination of processes The MiChip library has been developed to automate and simplify the analysis of MiChip hybridizations and provide a basis for incorporating the MiChip into analysis pipelines. A worked example of the analysis from file parsing to median normalization of the expression data is given in the workedexamplemedian- Normalization method. > datadir <-system.file("extdata", package="michip") > mynormedeset <- workedexamplemediannormalize("normeddemo", intensitycutoff = 50,datadir) File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_flagcorrectedintensit File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_summarizedintensity.t 2 files found Dealing with file B_676.gpr Dealing with file B_677.gpr Raw Data 4608 Number of filters 13 Filtering special case empties Filtered Data 1118 File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_filtereddata.txt B_676 had 55 bad flags and 77 values below cutoff from 1118 data points B_677 had 54 bad flags and 86 values below cutoff from 1118 data points File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_flagcorrected.jpg File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_summarized.jpg 5

File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_summarizedscatter.jpg File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_median Normalized.jpg File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_mednormedscatter.jpg File saved to /tmp/rtmpdmfguc/rbuild3b8b3beef1d1/michip/vignettes/normeddemo_mednormedintensity.tx References M. Castoldi, S. Schmidt, V. Benes, M. W. Hentze, and M. U. Muckenthaler. michip: an array-based method for microrna expression profiling using locked nucleic acid capture probes. Nature Protocols, 3 (2):321 329, 2008. 1 M. Castoldi, S. Schmidt, V. Benes, M Noerholm, A. E. Kulozik, M. W. Hentze, and M. U. Muckenthaler. A sensitive array for microrna expression profiling (michip) based on locked nucleic acids (lna). RNA, 12(5):913 920, 2006. 6