Raman Spectra of Chondrocytes in Cartilage: hyperspec s chondro data set

Similar documents
Raman Spectra of Chondrocytes in Cartilage: hyperspec s chondro data set

Calibration of Quinine Fluorescence Emission Vignette for the Data Set flu of the R package hyperspec

Fitting Baselines to Spectra

Chemometric Analysis of Bio-Spectroscopic Data in : hyperspec

hyperspec Introduction

Examples of implementation of pre-processing method described in paper with R code snippets - Electronic Supplementary Information (ESI)

hyperspec Plotting functions

Image Analysis with beadarray

Reproducible Homerange Analysis

Visualisation, transformations and arithmetic operations for grouped genomic intervals

survsnp: Power and Sample Size Calculations for SNP Association Studies with Censored Time to Event Outcomes

Count outlier detection using Cook s distance

Introduction to the TSSi package: Identification of Transcription Start Sites

Analysis of two-way cell-based assays

MIGSA: Getting pbcmc datasets

Shrinkage of logarithmic fold changes

CQN (Conditional Quantile Normalization)

segmentseq: methods for detecting methylation loci and differential methylation

Cardinal design and development

Using the qrqc package to gather information about sequence qualities

RMassBank for XCMS. Erik Müller. January 4, Introduction 2. 2 Input files LC/MS data Additional Workflow-Methods 2

segmentseq: methods for detecting methylation loci and differential methylation

The latest trend of hybrid instrumentation

NacoStringQCPro. Dorothee Nickles, Thomas Sandmann, Robert Ziman, Richard Bourgon. Modified: April 10, Compiled: April 24, 2017.

A complete analysis of peptide microarray binding data using the pepstat framework

riboseqr Introduction Getting Data Workflow Example Thomas J. Hardcastle, Betty Y.W. Chung October 30, 2017

An Overview of the S4Vectors package

Advanced analysis using bayseq; generic distribution definitions

SCnorm: robust normalization of single-cell RNA-seq data

How to Use pkgdeptools

Triform: peak finding in ChIP-Seq enrichment profiles for transcription factors

RNAseq Differential Expression Analysis Jana Schor and Jörg Hackermüller November, 2017

How to use CNTools. Overview. Algorithms. Jianhua Zhang. April 14, 2011

RAPIDR. Kitty Lo. November 20, Intended use of RAPIDR 1. 2 Create binned counts file from BAMs Masking... 1

Analysing Spatial Data in R: Vizualising Spatial Data

Introduction: microarray quality assessment with arrayqualitymetrics

ssviz: A small RNA-seq visualizer and analysis toolkit

Package superheat. February 4, 2017

Analysis of screens with enhancer and suppressor controls

Introduction to the Codelink package

Introduction: microarray quality assessment with arrayqualitymetrics

Getting started with flowstats

The analysis of rtpcr data

SimBindProfiles: Similar Binding Profiles, identifies common and unique regions in array genome tiling array data

MMDiff2: Statistical Testing for ChIP-Seq Data Sets

MALDIquant: Quantitative Analysis of Mass Spectrometry Data

Hyperspectral Chemical Imaging: principles and Chemometrics.

Simulation of molecular regulatory networks with graphical models

Using R to Make Sense of NMR Datasets

An Introduction to the methylumi package

Preliminary Figures for Renormalizing Illumina SNP Cell Line Data

Module 10. Data Visualization. Andrew Jaffe Instructor

Introduction to Raman spectroscopy measurement data processing using Igor Pro

Navigator. The Navigator

Raman Images. Jeremy M. Shaver 1, Eunah Lee 2, Andrew Whitley 2, R. Scott Koch. 1. Eigenvector Research, Inc. 2. HORIBA Jobin Yvon, Inc.

Analyse RT PCR data with ddct

An Introduction to ShortRead

DupChecker: a bioconductor package for checking high-throughput genomic data redundancy in meta-analysis

Comparing Least Squares Calculations

Many Patterns & Many Methods

Design Issues in Matrix package Development

How to use cghmcr. October 30, 2017

Raman Spectrometer Installation Manual

Renishaw invia Raman Microscope (April 2006)

Design Issues in Matrix package Development

SCnorm: robust normalization of single-cell RNA-seq data

Basic4Cseq: an R/Bioconductor package for the analysis of 4C-seq data

Package hyperspec. October 6, 2017

Using OPUS to Process Evolved Gas Data (8/12/15 edits highlighted)

How To Use GOstats and Category to do Hypergeometric testing with unsupported model organisms

PROPER: PROspective Power Evaluation for RNAseq

Using crlmm for copy number estimation and genotype calling with Illumina platforms

Multivariate Calibration Quick Guide

Chemometrics. Description of Pirouette Algorithms. Technical Note. Abstract

The bumphunter user s guide

Creating IGV HTML reports with tracktables

R-software multims-toolbox. (User Guide)

crlmm to downstream data analysis

Using metama for differential gene expression analysis from multiple studies

MLSeq package: Machine Learning Interface to RNA-Seq Data

SIBER User Manual. Pan Tong and Kevin R Coombes. May 27, Introduction 1

SELECTION OF A MULTIVARIATE CALIBRATION METHOD

Nature Methods: doi: /nmeth Supplementary Figure 1

GxE.scan. October 30, 2018

Introduction to R (BaRC Hot Topics)

Outlier and Target Detection in Aerial Hyperspectral Imagery: A Comparison of Traditional and Percentage Occupancy Hit or Miss Transform Techniques

bayseq: Empirical Bayesian analysis of patterns of differential expression in count data

Progenesis CoMet User Guide

Seminar III: R/Bioconductor

An Introduction to R 1.3 Some important practical matters when working with R

MRO CRISM TRR3 Hyperspectral Data Filtering

matter: Rapid prototyping with data on disk

Regression on SAT Scores of 374 High Schools and K-means on Clustering Schools

RDAVIDWebService: a versatile R interface to DAVID

Package eemr. May 8, 2017

MultipleAlignment Objects

ISA internals. October 18, Speeding up the ISA iteration 2

Package densityclust

Quick Start Guide: Welcome to OceanView

Working with aligned nucleotides (WORK- IN-PROGRESS!)

Transcription:

Raman Spectra of Chondrocytes in Cartilage: hyperspec s chondro data set Claudia Beleites <chemometrie@beleites.de> CENMAT and DI3, University of Trieste Spectroscopy Imaging, IPHT Jena e.v. February 13, 2015 Reproducing this vignette The data set and source file of this vignette are available as a zipped archive at hyperspec s homepage, http://hyperspec.r-forge.r-project.org/blob/chondro.zip (ca. 9 MB). The original data file (ca. 31 MB) is far too large to be included in the package. In order to reproduce the examples by typing in the commands, have a look at the definitions of color palettes used in this document are defined in vignettes.defs. Also, the package hyperspec needs to be loaded first via library (hyperspec). Contents 1 Introduction 2 2 The Data Set 2 3 Data Import 3 4 Preprocessing 3 4.1 Aligning the Spatial Grid..................................... 4 4.2 Spectral Smoothing........................................ 4 4.3 Baseline Correction........................................ 5 4.4 Normalization............................................ 5 4.5 Subtracting the Overall Composition.............................. 6 4.6 Outlier Removal by Principal Component Analysis (PCA)................. 6 5 Hierarchical Cluster Analysis (HCA) 7 6 Plotting a False-Colour Map of Certain Spectral Regions 8 7 Saving the data set 9 1

Figure 1 Microphotograph of the cartilage section. The frame indicates the measurement area (35 25 m). 1 Introduction This vignette describes the chondro data set. It shows a complete data analysis work flow on a Raman map demonstrating frequently needed preprocessing methods ❼ baseline correction ❼ normalization ❼ smoothing / interpolating spectra ❼ preprocessing the spatial grid and other basic work techniques ❼ plotting spectra ❼ plotting false color maps ❼ cutting the spectral range, ❼ selecting (extracting) or deleting spectra, and ❼ aggregating spectra (e.g. calculating cluster mean spectra). The chemometric methods used are ❼ Principal Component Analysis (PCA) and ❼ Hierarchical Cluster Analysis, showing how to use data analysis procedures provided by R and other packages. 2 The Data Set Raman spectra of a cartilage section were measured on each point of a grid, resulting in a so-called Raman map. Figure 1 shows a microscope picture of the measured area and its surroundings. The measurement parameters were: Excitation wavelength: 633 nm Exposure time: 10 s per spectrum Objective: 100, NA 0.85 Measurement grid: 35 25 m, 1 m step size Spectrometer: Renishaw InVia 2

4.5 Subtracting the Overall Composition The spectra are very homogeneous, but I m interested in the differences between the different regions of the sample. Subtracting the minimum spectrum cancels out the matrix composition that is common to all spectra. But the minimum spectrum also picks up a lot of noise. So instead, the 5 th percentile spectrum is subtracted: > chondro <- chondro - quantile (chondro, 0.05) > plot (chondro, "spcprctl5") The resulting data set is shown in figure 4b. Some interesting differences start to show up: there are distinct lipid bands in some but not all of the spectra. 4.6 Outlier Removal by Principal Component Analysis (PCA) PCA is a technique that decomposes the data into scores and loadings (virtual spectra). It is known to be quite sensitive to outliers. Thus, I use it for outlier detection. The resulting scores and loadings are put again into hyperspec objects by decomposition: > pca <- prcomp (chondro, center = TRUE) > scores <- decomposition (chondro, pca$x, label.wavelength = "PC", + label.spc = > loadings <- decomposition (chondro, t(pca$rotation), scores = FALSE, + lab Plotting the scores of each PC against all other gives a good idea where to look for outliers. > pairs (scores [[,,1:20]], pch = 19, cex = 0.5) Now the spectra can be found either by plotting two scores against each other (by plot) and identifying with identify, or they can be identified in the score map by map.identify. There is also a function to identify spectra in a spectra plot, spc.identify, which could be used to identify principal components that are heavily influenced e. g. by cosmic ray spikes. > ## omit the first 4 PCs > out <- map.identify (scores [,,5]) > out <- c (out, map.identify (scores [,,6])) > out <- c (out, map.identify (scores [,,7])) > out [1] 105 140 216 289 75 69 > outcols <- c ("red", "blue", "#800080", "orange", "magenta", "brown") > cols <- rep ("black", nrow(chondro)) > cols [out] <- outcols We can check our findings by comparing the spectra to the bulk of spectra (figure 5a): > plot(chondro[1], plot.args = list (ylim = c (1, length (out) +.7)), + lines.args = list( type = "n")) > for (i in seq (along = out)){ + plot(chondro, "spcprctl5", yoffset = i, add = TRUE, col = "gray") + plot (chondro [out[i]], yoffset = i, col = outcols[i], add = TRUE, + lines.args = list (lwd = 2)) + text (600, i +.33, out [i]) + } 6

Session Info R version 3.1.2 (2014-10-31) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8 [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] grid methods stats graphics grdevices utils datasets base other attached packages: [1] latticeextra_0.6-26 RColorBrewer_1.0-5 hyperspec_0.98-20150212 mvtnorm_0.9-9997 [5] ggplot2_0.9.3.1 lattice_0.20-29 loaded via a namespace (and not attached): [1] colorspace_1.2-2 dichromat_2.0-0 digest_0.6.4 gtable_0.1.2 labeling_0.1 [6] MASS_7.3-37 munsell_0.4.2 plyr_1.8 proto_0.3-10 reshape2_1.2.2 [11] scales_0.2.3 stringr_0.6.2 tools_3.1.2 10