SIMAT: GC-SIM-MS Analayis Tool

Similar documents
Tutorial 2: Analysis of DIA/SWATH data in Skyline

Retention Time Locking with the MSD Productivity ChemStation. Technical Overview. Introduction. When Should I Lock My Methods?

UPDATING PESTICIDE RETENTION TIME LIBRARIES FOR THE AGILENT INTUVO 9000 GC

New Features of Deconvolution Reporting Software Revision A.02 Technical Note

TraceFinder Analysis Quick Reference Guide

Thermo Xcalibur Getting Started (Quantitative Analysis)

To get started download the dataset, unzip files, start Maven, and follow steps below.

Package CorrectOverloadedPeaks

Quick Guide to the Star Bar

Agilent MassHunter Workstation Software 7200 Accurate-Mass Quadrupole Time of Flight GC/MS

MRMPROBS tutorial. Hiroshi Tsugawa RIKEN Center for Sustainable Resource Science MRMPROBS screenshot

Building Agilent GC/MSD Deconvolution Reporting Libraries for Any Application Technical Overview

Xcalibur Library Browser

MRMPROBS tutorial. Hiroshi Tsugawa RIKEN Center for Sustainable Resource Science

Package Metab. September 18, 2018

Statistical Process Control in Proteomics SProCoP

QuiC 1.0 (Owens) User Manual

Fatty Acid Methyl Ester (FAME) RTL Databases for GC and GC/MS

Progenesis CoMet User Guide

MS-FINDER tutorial. Last edited in Aug 10, 2018

LC-MS Data Pre-Processing. Xiuxia Du, Ph.D. Department of Bioinformatics and Genomics University of North Carolina at Charlotte

Package cosmiq. April 11, 2018

Agilent MassHunter Qualitative Data Analysis

MassHunter File Reader

Tutorial 7: Automated Peak Picking in Skyline

Agilent 6300 Ion Trap LC/MS Systems Quick Start Guide

The Agilent 5975 inert MSD: New Tools for the Forensic Analyst

Agilent 6400 Series Triple Quadrupole LC/MS System

TraceFinder Shortcut Menus Quick Reference Guide

Skyline MS1 Full Scan Filtering

Agilent RapidFire 300 High-throughput Mass Spectrometry System

REVIEW. MassFinder 3. Navigation of the Chromatogram

Package HiResTEC. August 7, 2018

CLARITY DEMO. Clarity Demo Software. Code/Rev.: M003/70E Date:

Tutorials. Saturn GC/MS Workstation Version 5.4. Varian Analytical Instruments 2700 Mitchell Drive Walnut Creek, CA /usa

TraceFinder Analysis Quick Reference Guide

MIMA V 1.0. MS IMS Mapper. Peak Identification in Ion mobility Spectrometry

OpenLynx User's Guide

Chromeleon / MSQ Plus Operator s Guide Document No Revision 03 October 2009

MS-FINDER tutorial. Last edited in Sep. 10, 2018

Agilent GC/MSD Instructions

Pathway Analysis of Untargeted Metabolomics Data using the MS Peaks to Pathways Module

Progenesis CoMet User Guide

Introduction to Time-of-Flight Mass

What s New in Empower 3

MC Prochloraz, [phenyl- 14 C]- Lot A JH

Progenesis QI for proteomics User Guide. Analysis workflow guidelines for DDA data

Analysis of GC-MS metabolomics data with metams

PRM Method Development and Data Analysis with Skyline. With

Annotation of LC-MS metabolomics datasets by the metams package

ChromQuest 5.0 Quick Reference Guide

Finnigan Xcalibur. Getting Productive: Qualitative Analysis. XCALI Revision B March 2006

High-throughput Processing and Analysis of LC-MS Spectra

Package envipick. June 6, 2016

Real-time Data Compression for Mass Spectrometry Jose de Corral 2015 GPU Technology Conference

MassBank for NORMAN. Achievements in 2011 and further proposed steps for identification of unknowns

This manual describes step-by-step instructions to perform basic operations for data analysis.

Skyline Targeted MS/MS

Getting Started. Environmental Analysis Software

Agilent 5977 Series MSD System

Protein Deconvolution Quick Start Guide

MassHunter Pesticides PCD or PCDL Quick Start Guide

Agilent Triple Quadrupole LC/MS Peptide Quantitation with Skyline

Using OPUS to Process Evolved Gas Data (8/12/15 edits highlighted)

How to Run the CASPiE GCMS

QuiC 2.1 (Senna) User Manual

Automated Submission for Analysis Processing (ASAP) Operator s Guide Dionex Corporation

Corra v2.0 User s Guide

Agilent G6854AA MassHunter Personal Pesticide Database Kit Quick Start Guide

Using SIMDIS Expert 9 (Separation Systems) with Clarity

LTQ Velos SP3 Release Notes

The TargetSearch Package

Reference Manual MarkerView Software Reference Manual Revision: February, 2010

Skyline Targeted Method Refinement

MC Lot A MW

Processing Data/Library Searching Data on PC s. Introduction

Agilent G1701DA GC/MSD ChemStation

Agilent G6855AA MassHunter Personal


Labelled quantitative proteomics with MSnbase

SpectroDive 8 - Coelacanth. User Manual

Agilent G1732AA MSD Security ChemStation

PROTEOMIC COMMAND LINE SOLUTION. Linux User Guide December, B i. Bioinformatics Solutions Inc.

Skyline irt Retention Time Prediction

Large and Sparse Mass Spectrometry Data Processing in the GPU Jose de Corral 2012 GPU Technology Conference

An introduction to Baitmet package

Xcalibur. Getting Productive: Quantitative Analysis. XCALI Revision C June For Research Use Only Not for use in Diagnostic Procedures

EZ IQ. Chromatography Software PN P1A

Agilent 6100 Series Quad LC/MS System

Default output format. Conversi on software. Conversion output formats

Preparing to analyze LC MS data. Synopsis

Skyline High Resolution Metabolomics (Draft)

GC-MS ChemStation Reference Guide. Last Updated July 2014

Information Note 2016_02 January 2016

Agilent OpenLAB CDS. Version 2.2. Release Notes. OpenLAB CDS Release Notes 1

Skyline Targeted Method Refinement

Chromeleon software orientation

TraceFinder Administrator Quick Reference Guide

Agilent G1701DA GC/MSD ChemStation

MS data processing. Filtering and correcting data. W4M Core Team. 22/09/2015 v 1.0.0

Transcription:

SIMAT: GC-SIM-MS Analayis Tool Mo R. Nezami Ranjbar September 10, 2018 1 Selected Ion Monitoring Gas chromatography coupled with mass spectrometry (GC-MS) is one of the promising technologies for qualitative and quantitative analysis of small biomolecules. Because of the existence of spectral libraries, GC-MS instruments can be set up efficiently for targeted analysis. Also, to increase sensitivity, samples can be analyzed in selected ion monitoring (SIM) mode. While many software have been provided for analysis of untargeted GC-MS data, no specific tool does exist for processing of GC-MS data acquired with SIM. 2 SIMAT package SIMAT is a tool for analysis of GC-MS data acquired in SIM mode. The tool provides several functions to import raw GC-SIM-MS data and standard format mass spectral libraries. It also provides guidance for fragment selection before running the targeted experiment in SIM mode by using optimization. This is done by considering overlapping peaks from a library provided by the user. Other functionalities include retention index calibration to improve target identification and plotting EICs of individual peaks in specific runs which can be used for visual assessment. In summary, the package has several capabilities, including: ˆ Processing gas chromatography coupled with mass spectrometry data acquired in selected ion monitoring (SIM) mode. ˆ Peak detection and identification. ˆ Similarity score calculation. ˆ Retention index (RI) calibration. ˆ Reading NIST mass spectral library (MSL) format. ˆ Importing netcdf raw files. ˆ EIC and TIC visualization 1

ˆ Providing guidance in choosing appropriate fragments for the targets of interest by using an optimization algorithm. 3 Examples Here, we provide some examples of the usage of different functions in the SIMAT package. After installation, we start by loading the package and example data sets included in the SIMAT library. > # load the package > library(simat) > # load the extracted data from a CDF file of a SIM run > data(run) > # load the target table information > data(target.table) > # load the background library to be used with fragment selection > data(library) > # load retention index table from RI standards > data(ritable) First let us examine the contents of the loaded data sets. Please note that you can find more details by checking the manual page for each data set. Starting with Run we can see that this object is a list, including four items, retention time, scans, scan information (in the pk filed), and the TIC data. We can also check some values for each field: > # check the names of different fields in Run > names(run) [1] "rt" "sc" "tic" "pk" > # show some values for the the first three fields > head(as.data.frame(run[c("rt", "sc", "tic")])) rt sc tic 1 359.233 1 25898079 2 359.491 2 24634398 3 359.749 3 24210034 4 360.008 4 23703436 5 360.266 5 23286408 6 360.524 6 22603404 > # see what is included in the scan information for the first scan > Run$pk[[1]] mz intensity [1,] 55 108216 2

[2,] 56 171072 [3,] 66 10519 [4,] 72 1361408 [5,] 73 5509632 [6,] 74 633600 [7,] 75 763776 [8,] 98 24528 [9,] 117 609280 [10,] 118 169408 [11,] 120 132160 [12,] 121 68560 [13,] 122 5632 [14,] 146 8388096 [15,] 147 2875904 [16,] 148 1116672 [17,] 156 4714 [18,] 170 2638 [19,] 171 27576 [20,] 190 69376 [21,] 191 3803648 [22,] 194 41664 We can also plot the total ion chromatogram (TIC) of the run: > # plot the TIC of the selected Run > plottic(run = Run) 3

2e+07 TIC 1e+07 0e+00 10 20 30 Time (min) For the target table information, which is a list object, we have three fileds, i.e. compound, ms, and numfrag: > # check the name of included fields > names(target.table) [1] "compound" "ms" "numfrag" > # check the first lines of the target.table > head(as.data.frame(target.table[c("compound", "numfrag")])) compound numfrag 1 Analyte 6 4 2 Analyte 33 4 3 Analyte 91 4 4 Analyte 104 4 5 Analyte 120 4 6 Analyte 135 4 4

> # check the contents of the ms field > target.table$ms[[1]] [1] 56 98 170 171 Similarly, for the library: > # check the name of included fields > names(library) [1] "compound" "rt" "ri" "ms" "sp" > # check the first lines of Library > head(as.data.frame(library[c("rt", "ri", "compound")])) rt ri compound 1 6.332 696.9 Analyte 4 2 6.361 699.0 Analyte 5 3 6.388 700.8 Analyte 6 4 6.420 703.1 Analyte 7 5 6.458 705.7 Analyte 8 6 7.504 778.6 Analyte 29 > # check the contents of the ms and sp fields related to the mass and intensity > # of the fragments, i.e. spectral information > Spectrum <- data.frame(ms = Library$ms[[1]], sp = Library$sp[[1]]) > head(spectrum) ms sp 1 52 7 2 53 32 3 54 32 4 55 89 5 56 294 6 59 993 > # plot the spectrum > plot(x = Spectrum$ms, y = Spectrum$sp, type = "h", lwd = 2, col = "blue", + xlab = "mass", ylab = "intensity", main = Library$compound[1]) At last, let us find out what is included in RItable: > # check the name of included fields > names(ritable) [1] "rt" "ri" > # check the first lines of RItable > head(ritable) 5

rt ri 1 7.82 800 2 9.26 900 3 10.65 1000 4 13.26 1200 5 15.61 1400 6 17.74 1600 Now we need to get the Targets from the provided target table and the library: > # get targets info using target table and provided library > Targets <- gettarget(method = "library", Library = Library, + target.table = target.table) > # check the fields of Targets > names(targets) [1] "rt" "ri" "compound" "ms" "sp" [6] "quantfrag" "sortedfrag" > # check the first lines of some fields > head(as.data.frame(targets[c("compound", "rt", "ri")])) compound rt ri 1 Analyte 6 6.388 700.8 2 Analyte 33 7.582 784.0 3 Analyte 91 9.151 893.1 4 Analyte 104 9.599 919.5 5 Analyte 120 9.966 947.7 6 Analyte 135 10.321 975.0 To find the corresponding peaks in the run, we can call getpeak function: > # get the peaks for this run corresponding to Targets > runpeaks <- getpeak(run = Run, Targets = Targets) > # check the length of runpeaks (number of targets) > length(runpeaks) [1] 1 > # check the fields for each peak > names(runpeaks[[1]][[1]]) [1] "rtapex" "intapex" "RI" "scoreapex" "scorearea" "area" [7] "EIC" "RT" "ms" "sp" "rt0" "ri0" [13] "compound" > # area of the EIC of the first target > runpeaks[[1]][[1]]$area 6

[1] 78504 129442 890934 172192 Following that, the extracted ion chromatogram (EIC) of the retrieved peaks can be visualized using ploteic function: > # plot the EIC of the first peak (target) on the list > ploteic(peakeic = runpeaks[[1]][[1]]) Analyte 6, Score_Apex = 0.92 Score_Area = 0.83 Fragment 170 3e+05 171 56 98 2e+05 Intensity 1e+05 0e+00 6.3 6.4 6.5 Time (min) However, the above is done without retention time calibration. To adjust for RI, first we call getri to create a function which can be used to calculate the RI given the retention time: > # create the RI calibration function > calibri <- getri(ritable) > # calculate the RI of an RT = 12.32min > calibri(12.32) [1] 1127.969 7

> # get the peaks for this run corresponding to Targets using RI calibration > runpeaksri <- getpeak(run = Run, Targets = Targets, calibri = calibri) It is informative to check the scores for the detected targets. This can be done by using a specific target in an individual run, or by finding the scores of all targets at once and looking at the histogram of the scores: > # find the similarity score of the found targets > Scores <- getpeakscore(runpeaks = runpeaks, plot = TRUE) > # check the value of scores > print(scores) [,1] [1,] 0.8916002 [2,] 0.7696184 [3,] 0.9144938 [4,] 0.9337329 [5,] 0.8696336 [6,] 0.9027617 [7,] 0.9229788 [8,] 0.5645287 [9,] 0.8589739 [10,] 0.9150041 [11,] 0.8148486 [12,] 0.8911238 [13,] 0.6301923 [14,] 0.8460670 [15,] 0.9582448 [16,] 0.8516795 [17,] 0.9453319 8

3 2 count 1 0 0.00 0.25 0.50 0.75 1.00 Scores To use the fragment selection function, i.e. optfrag, we can use the example background library bglib. This is recommended to be done before the experiment, as after running the experiment, it may not be possible to find an optimum choice among the set of monitored fragments. We can also check to see what is the difference between the default set of the fragments, and the ones selected by optfrag function: > # get the optimized version of the target list > opttargets <- optfrag(library = Library, target.table = target.table, + forceopt = TRUE) > # check the fragments of the first target > # the mass of fragments > Targets$ms[[1]] [1] 56 98 170 171 > # the intensity of fragments > Targets$sp[[1]] 9

[1] 152 148 1000 156 > # check them after optimization > # the mass of fragments > opttargets$ms[[1]] [1] 170 171 185 143 > # the intensity of fragments > opttargets$sp[[1]] [1] 1000 156 141 65 In the example above, the optfrag function is used directly. However, it is usually used within the gettarget function, where the user can set if optimization is desired. 4 Future Work Improved peak detection and more options for data visualization are the main aspects of the next version. A GUI, where users can import and export data, is also considered for in future versions. Finally, it is planned to add support for importing other types of raw data such as mzml and mzxml together with other mass spectral library formats rather than NIST MSL. 10