MS-FINDER tutorial. Last edited in Aug 10, 2018


Abstract

The purpose of metabolomics is the comprehensive analysis of the small biomolecules of living organisms. Gas chromatography coupled with electron ionization mass spectrometry (GC/MS) and liquid chromatography coupled with electrospray ionization (ESI) tandem mass spectrometry (LC/MS/MS) are the preferred tools for untargeted metabolomics. Currently, the main bottleneck of GC/MS- and LC/MS/MS-based untargeted analysis is compound identification, because the EI-MS and MS/MS records of authentic standards are limited. MS-FINDER was launched as a universal program for compound annotation that supports EI-MS (GC/MS) and MS/MS spectral mining. First, MS-FINDER aims to provide solutions for 1) formula prediction, 2) fragment annotation, and 3) structure elucidation from unknown spectra. In addition, the program can annotate your unknowns by searching public spectral databases such as MassBank, LipidBlast, and GNPS.

MS-FINDER has been developed as a collaboration between the team of Prof. Masanori Arita (RIKEN, Reifycs Inc.) and the team of Prof. Oliver Fiehn (UC Davis), supported by the JST/NSF SICORP "Metabolomics for the low carbon society" project.

Hiroshi Tsugawa
RIKEN Center for Sustainable Resource Science
hiroshi.tsugawa@riken.jp

Figure: MS-FINDER screenshot

Table of contents

Software environments
Required programs
Acceptable ASCII formats
  MSP format for MS/MS
  MSP format for EI-MS
  MAT format
  Adduct ion format: [M+Na]+, [M+2H]2+, [M-2H2O+H]+, [2M+FA-H]-, etc.
  User defined structure database format
  Specific field to fix the formula element count
Import queries
  A. From a folder which includes MSP or MAT format files
  B. From the graphical user interface of the MS-FINDER program
  C. From the MS-DIAL program
Parameter setting
  Method tab
  Mass spectrum tab
  Formula finder tab
  Structure finder tab
  Data source tab
Compound annotation by in silico fragmenter
Compound annotation by searching spectral databases
Compound annotation (batch analysis)
Peak assignment (single)
Peak assignment (batch job)
Mouse function
Export
Help

Software environments

OS: Windows 7 or later (.NET Framework 4.5 or later)
RAM: 8.0 GB or more

Required programs

MS-FINDER download link: http://prime.psc.riken.jp/metabolomics_software/ms-finder/index.html

MS-FINDER runs as a local program on a Windows PC. It imports ASCII files in MSP format (EI-MS and MS/MS) or in an improved MSP format that holds both MS and MS/MS spectra (the file extension must be MAT). In addition, users can create a query directly in the MS-FINDER graphical user interface. The program can also be called from the MS-DIAL program, which is downloadable at:
http://prime.psc.riken.jp/metabolomics_software/ms-DIAL/index.html

Acceptable ASCII formats

This program accepts two file extensions, MSP and MAT, formatted as explained in the following sections. Each unknown query must be stored in its own ASCII file: an MSP or MAT file CANNOT hold multiple compound records. The MSP format basically follows the NIST MS Search manual.
Link: http://www.nist.gov/srd/upload/nist1a11ver2-0man.pdf
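Because each query needs its own file, a library exported as one multi-record MSP file has to be split before import. The following is a minimal sketch of such a split, not part of MS-FINDER itself. It assumes that every record begins with a NAME: line; the function name split_msp_records and the sample values are made up for illustration.

```python
def split_msp_records(text):
    """Split multi-record MSP text into a list of single-record strings.

    Assumption: each record starts with a line beginning with "NAME:"
    (matched case-insensitively).
    """
    records = []
    current = []
    for line in text.splitlines():
        starts_record = line.strip().upper().startswith("NAME:")
        if starts_record and current:
            # A new record begins: flush the one collected so far.
            records.append("\n".join(current).strip() + "\n")
            current = []
        if line.strip() or current:
            # Skip blank lines that precede the first record.
            current.append(line)
    if current:
        records.append("\n".join(current).strip() + "\n")
    return records


# Made-up two-record input, for illustration only.
sample = (
    "NAME: A\nPRECURSORMZ: 100.0\nNum Peaks: 1\n50.0 10\n\n"
    "NAME: B\nPRECURSORMZ: 200.0\nNum Peaks: 1\n60.0 20\n"
)
for index, record in enumerate(split_msp_records(sample)):
    print(f"--- record {index} ---")
    print(record, end="")
```

Each returned string can then be written to its own .msp file in the folder that will be imported.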

MSP format for MS/MS

Required fields:
NAME:
PRECURSORMZ:
PRECURSORTYPE:
IONMODE: (Positive or Negative)
Num Peaks:
m/z and intensity pairs (a tab, comma, or space can be used as the delimiter)

The mass spectrum is imported as MS/MS. If you want to perform MS/MS peak annotation with a known structure, add two more fields, FORMULA and SMILES. Please note that the formula and SMILES must describe the neutralized structure.

Figure: MSP example
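The required fields can be checked before import with a small script. The sketch below is an illustration rather than the parser MS-FINDER uses: it reads one record, accepts a tab, comma, or space as the peak delimiter, and reports missing required fields. The function name parse_msms_msp and the example record (name, m/z values, intensities) are invented for this example.

```python
import re

REQUIRED_FIELDS = ("NAME", "PRECURSORMZ", "PRECURSORTYPE", "IONMODE")


def parse_msms_msp(text):
    """Parse one MS/MS MSP record into (fields, peaks).

    Raises ValueError if a required field is missing or the number of
    peak lines does not match "Num Peaks".
    """
    fields = {}
    peaks = []
    num_peaks = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if num_peaks is None:
            # Header part: "KEY: value" lines up to and including "Num Peaks:".
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"not a 'KEY: value' line: {line!r}")
            key = key.strip().upper()
            if key == "NUM PEAKS":
                num_peaks = int(value)
            else:
                fields[key] = value.strip()
        else:
            # Peak part: m/z and intensity separated by tab, comma, or space.
            mz, intensity = re.split(r"[\t, ]+", line)[:2]
            peaks.append((float(mz), float(intensity)))
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if num_peaks is None:
        missing.append("Num Peaks")
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    if len(peaks) != num_peaks:
        raise ValueError(f"expected {num_peaks} peaks, found {len(peaks)}")
    return fields, peaks


# Hypothetical record; all values are invented.
EXAMPLE = """NAME: Unknown 1
PRECURSORMZ: 181.0707
PRECURSORTYPE: [M+H]+
IONMODE: Positive
Num Peaks: 3
85.0284\t120
127.0390, 55
163.0601 1000
"""

fields, peaks = parse_msms_msp(EXAMPLE)
print(fields["PRECURSORTYPE"], len(peaks))
```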

MSP format for EI-MS

Required fields:
NAME:
IONMODE: (Positive or Negative)
Num Peaks:
m/z and intensity pairs (a tab, comma, or space can be used as the delimiter)

These fields are the minimum requirement for searching spectral databases. If you want to perform formula prediction and structure elucidation on EI-MS data, two more fields, PRECURSORMZ: and PRECURSORTYPE:, are required.

MAT format

The MAT format was defined in the MS-FINDER program as an improved version of MSP that stores both MS1 and MS/MS spectra in the same file. The survey scan (MS1) data are required to calculate the isotopic ion score for formula prediction. Importantly, for EI-MS spectra, put your spectrum into both the MS1 and MS2 fields; they are used to calculate the isotopic ratio and the fragment ion similarity, respectively.

Required fields:
NAME:
PRECURSORMZ:
PRECURSORTYPE:
IONMODE:
MSTYPE:
Num Peaks:
m/z and intensity pairs (a tab, comma, or space can be used as the delimiter)

The three items MSTYPE, Num Peaks, and the m/z and intensity pairs must be stored SERIALLY, in that order. If you write MSTYPE: MS1, the spectrum that follows is recognized as the survey scan MS (MS1). If you write MSTYPE: MS2, the spectrum that follows is recognized as the MS/MS spectrum.

EI-MS spectra must be stored in both MS1 and MS2; this is a requirement of MS-FINDER. For other data, it is not necessary to provide both blocks (MSTYPE: MS1 and MSTYPE: MS2): a file may contain only an MS1 spectrum or only an MS/MS spectrum. Users may even prepare MAT or MSP files without any spectrum record. In that case, formula prediction is performed from the mass accuracy and database criteria alone.

If you want to perform MS/MS peak annotation with a known structure, add two more fields, FORMULA and SMILES. The formula and SMILES must describe the neutralized structure.

Figure: MAT example
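As the original figure is not reproduced here, the serial layout of a MAT file can be sketched in code. The helper below is a hypothetical writer (format_mat_record is not an MS-FINDER function) that emits the header fields followed by one MSTYPE / Num Peaks / peak list block per spectrum. Using a tab as the delimiter is one of the three allowed choices, and all numeric values are invented.

```python
def format_mat_record(name, precursor_mz, precursor_type, ion_mode,
                      ms1=None, ms2=None):
    """Return MAT-formatted text for one query.

    ms1 and ms2 are lists of (m/z, intensity) pairs; either may be omitted.
    Each spectrum is written serially: MSTYPE, Num Peaks, then the peaks.
    """
    lines = [
        f"NAME: {name}",
        f"PRECURSORMZ: {precursor_mz}",
        f"PRECURSORTYPE: {precursor_type}",
        f"IONMODE: {ion_mode}",
    ]
    for ms_type, peaks in (("MS1", ms1), ("MS2", ms2)):
        if not peaks:
            continue  # a file may hold only MS1 or only MS2
        lines.append(f"MSTYPE: {ms_type}")
        lines.append(f"Num Peaks: {len(peaks)}")
        lines.extend(f"{mz}\t{intensity}" for mz, intensity in peaks)
    return "\n".join(lines) + "\n"


# Invented LC/MS/MS example with separate MS1 and MS2 spectra.
print(format_mat_record(
    "Unknown 1", 181.0707, "[M+H]+", "Positive",
    ms1=[(181.0707, 1000), (182.0741, 70)],
    ms2=[(85.0284, 120), (163.0601, 1000)],
))

# For EI-MS, the same spectrum goes into both blocks.
ei_spectrum = [(73.0, 1000), (147.0, 350)]
ei_text = format_mat_record("EI unknown", 147.0, "[M]+.", "Positive",
                            ms1=ei_spectrum, ms2=ei_spectrum)
```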

Adduct ion format: [M+Na]+, [M+2H]2+, [M-2H2O+H]+, [2M+FA-H]-, etc.

1. The square brackets [ and ] must be used to enclose the ion information.
2. The character + or - is required after ']', and the charge number must be written before the + or -.
3. When you want to define an organic formula such as C6H12O5, you have to write it without any repeated elements or parentheses. For example, descriptions like [M+C2H5COOH-H]- or [M+H+(CH3)3SiOH]+ are not accepted.
4. A leading number in front of an organic formula, such as the '2' in 2H2O, is read as a multiplier (H2O × 2). Again, never write 2(H2O) for that.
5. Sequential additions and losses are acceptable: [2M+H-C6H12O5+Na]2+
6. A radical ion is written with a '.' (dot) after the + or -, as in [M]+. and [M-CH3]+.
7. MS-FINDER accepts the following abbreviations and common organic formulas for adduct types:
   Acetonitrile: ACN, CH3CN
   Methanol: CH3OH
   Isopropanol: IsoProp, C3H7OH
   Dimethyl sulfoxide: DMSO
   Formic acid: FA, HCOOH
   Acetic acid: Hac, CH3COOH
   Trifluoroacetic acid: TFA, CF3COOH
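The rules above can be turned into a quick pre-check for adduct strings. The sketch below is an approximation written for this tutorial and is not the parser inside MS-FINDER: it tests the bracket, charge, and radical syntax, rejects parentheses, and rejects formulas with repeated elements unless they appear in a small whitelist (a subset of the abbreviations listed above). All names in the code are hypothetical.

```python
import re

# [nM(+/-)(count)(formula)...]charge(+/-)(optional radical dot)
ADDUCT_RE = re.compile(
    r"^\[(\d*)M((?:[+-]\d*[A-Za-z][A-Za-z0-9]*)*)\](\d*)([+-])(\.?)$"
)
TERM_RE = re.compile(r"([+-])(\d*)([A-Za-z][A-Za-z0-9]*)")
FORMULA_RE = re.compile(r"^(?:[A-Z][a-z]?\d*)+$")
ELEMENT_RE = re.compile(r"[A-Z][a-z]?")

# Subset of the accepted abbreviations and common formulas.
ABBREVIATIONS = {
    "ACN", "CH3CN", "CH3OH", "IsoProp", "C3H7OH",
    "DMSO", "FA", "HCOOH", "Hac", "CH3COOH", "TFA",
}


def parse_adduct(text):
    """Parse an adduct string into a dict, or raise ValueError."""
    match = ADDUCT_RE.match(text)
    if match is None:
        raise ValueError(f"malformed adduct: {text!r}")
    molecules, term_text, charge, sign, radical = match.groups()
    terms = []
    for term_sign, count, formula in TERM_RE.findall(term_text):
        if formula not in ABBREVIATIONS:
            elements = ELEMENT_RE.findall(formula)
            repeated = len(elements) != len(set(elements))
            if not FORMULA_RE.match(formula) or repeated:
                raise ValueError(f"formula not accepted: {formula!r}")
        terms.append((term_sign, int(count or 1), formula))
    return {
        "molecules": int(molecules or 1),
        "terms": terms,
        "charge": int(charge or 1) * (1 if sign == "+" else -1),
        "radical": radical == ".",
    }


for adduct in ("[M+Na]+", "[M+2H]2+", "[M-2H2O+H]+", "[2M+FA-H]-", "[M]+."):
    print(adduct, parse_adduct(adduct))
```

Strings such as [M+C2H5COOH-H]- and [M+H+(CH3)3SiOH]+ raise ValueError in this sketch, matching rule 3.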

User defined structure database format

MS-FINDER supports structure elucidation from candidates that users provide. Prepare the database as a tab-delimited text file in the following format. The identifier columns (InChIKey, short InChIKey, and database ID) are not required for the search, but they must still be filled with dummy values. The exact mass, formula, and SMILES fields must be provided.

Specific field to fix the formula element count

MS-FINDER now accepts filtering fields for molecular formula prediction. For example, if you can perform a fully labeled stable isotope experiment, the element counts for C, H, N, O, and S can be determined by checking the mass shift between the non-labeled and labeled sample data. The fields described in the figure below can be written in MSP and MAT files.

Import queries

There are three ways to import unknown queries.

A. From a folder which includes MSP or MAT format files
1. File -> import
2. Select a folder containing MSP or MAT files.

B. From the graphical user interface of the MS-FINDER program
1. File -> Create a query
2. Fill in the form to make a query.

Required fields:
Folder path
File name
Precursor m/z

C. From the MS-DIAL program
The MS-DIAL program, which has been reported as software for processing LC/MS/MS data, can call the MS-FINDER program directly. The first time you call MS-FINDER from MS-DIAL, select the file path of MS-FINDER via the GUI.

Parameter setting

Method tab

MS-FINDER provides two options for compound annotation: one searches spectral databases, and the other uses the formula finder and structure finder programs with the in silico fragmenter. You can check both options (spectral database search, and formula prediction and structure elucidation by in silico fragmenter) at the same time; in that case the result of the spectral database search has priority when ranking structures.

The internal experimental library is stored in EIMS-DBs-vs*.egm and MSMS-DBs-vs*.etm in the Resources folder, in NIST MSP format. If Formula finder > TMS-MeOX derivative compound is checked, the EIMS database is used; otherwise, the MSMS database is used.

The in silico library for lipids (LipidBlast) is stored in MSDIAL-LipidDBs-vs*.lbm in the Resources folder. Select the appropriate solvent condition for searching your unknowns.

A user-defined spectral database must be formatted as NIST MSP.

If Precursor oriented spectral search is checked, the candidates are filtered by the precursor m/z of the spectral records in combination with the MS1 tolerance value; otherwise, all spectral records are used. Uncheck this option if you want to search EI spectral databases.

Mass spectrum tab

Mass tolerance (MS1): the mass tolerance used to generate formula candidates.
Mass tolerance (MS/MS): the mass tolerance for matching experimental and reference fragments.
Relative abundance cut off: only product ions whose abundance relative to the base peak exceeds this value are used for product ion matching.

*For EI-MS spectra, set the same tolerance for MS1 and MS2.
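The effect of these parameters can be illustrated with a short sketch. It assumes that the cut off is expressed as a percentage of the base peak intensity and that the tolerance is an absolute value in Da; both units are assumptions made for this example, and the function names are hypothetical rather than taken from MS-FINDER.

```python
def filter_by_relative_abundance(peaks, cutoff_percent):
    """Keep peaks whose intensity exceeds cutoff_percent of the base peak.

    peaks is a list of (m/z, intensity) pairs.
    """
    if not peaks:
        return []
    base_intensity = max(intensity for _, intensity in peaks)
    if base_intensity <= 0:
        return []
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if intensity / base_intensity * 100.0 > cutoff_percent
    ]


def within_tolerance(observed_mz, reference_mz, tolerance_da):
    """Return True if two m/z values agree within an absolute tolerance (Da)."""
    return abs(observed_mz - reference_mz) <= tolerance_da


# Invented spectrum: the base peak is 1000, so 0.5% corresponds to 5 counts.
spectrum = [(85.0284, 10), (127.0390, 4), (163.0601, 1000)]
kept = filter_by_relative_abundance(spectrum, 0.5)
print(kept)
print(within_tolerance(163.0601, 163.0606, 0.01))
```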

Formula finder tab

Formula calculation setting: here you can set the parameters for formula calculation.

LEWIS and SENIOR check: generates only formula candidates that satisfy the valence rules of the formula elements. The valences of the hetero atoms N, O, S, and P are currently set to 3, 2, 6, and 5, respectively.

Isotopic ratio tolerance: used to calculate the isotopic score. The tolerance is used as the sigma value of the Gaussian scoring described in the MS-FINDER paper.

Element ratio check: generates only formula candidates that satisfy every element ratio (e.g. the H/C ratio must be between 0 and 3.33 for the Common range (99.7%) restriction), as described in the MS-FINDER paper.

Element probability check: generates only formula candidates that satisfy the heuristic rules described in the Seven Golden Rules paper. For example, if a formula candidate contains N, O, P, and S all with counts > 1, the element counts of N, O, P, and S should be less than 9, 19, 3, and 2, respectively.

Element selection: generates only formula candidates that contain the elements selected by the user. Check TMS-MEOX derivative compound if you want to annotate EI-MS spectra.

Result cut off: the formula candidates ranked by the MS-FINDER program are reported up to this number.
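Two of these checks lend themselves to a small worked example. The sketch below shows an H/C ratio filter with the Common range bounds quoted above, and a Gaussian score that uses the tolerance as sigma. The exact scoring function is defined in the MS-FINDER paper; the form exp(-0.5 * (difference / sigma)^2) used here is an assumption, as are the function names.

```python
import math


def hc_ratio_ok(carbon, hydrogen, low=0.0, high=3.33):
    """Return True if the H/C ratio lies within [low, high].

    Assumption: a formula without carbon is rejected by this simple check.
    """
    if carbon <= 0:
        return False
    return low <= hydrogen / carbon <= high


def gaussian_score(observed, theoretical, sigma):
    """Score in (0, 1]; 1.0 means observed equals theoretical.

    Assumed form: exp(-0.5 * ((observed - theoretical) / sigma) ** 2).
    """
    return math.exp(-0.5 * ((observed - theoretical) / sigma) ** 2)


print(hc_ratio_ok(6, 12))   # C6H12O6: H/C = 2.0
print(hc_ratio_ok(1, 4))    # CH4: H/C = 4.0, outside the range
# Invented numbers: observed M+1 ratio 7.0%, theoretical 6.5%, sigma 20%.
print(round(gaussian_score(7.0, 6.5, 20.0), 4))
```

With this form, a difference of exactly one sigma gives a score of exp(-0.5), about 0.61, so a larger tolerance makes the isotopic score more forgiving.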

Structure finder tab

This tab holds the parameter settings for the in silico fragmenter.

Tree depth: the limit on successive in silico cleavages. For example, if the user sets 2, the MS-FINDER program generates fragments down to the product ions of a product ion.

The current MS-FINDER program can use the fragment ion library for EI-MS spectral mining, which is stored in *.eif (recommended).

Result cut off: the structure candidates ranked by the MS-FINDER program are reported up to this number.

XLogP based RT prediction and cut off setting for structure elucidation
MS-FINDER now provides a simple RT prediction function using XLogP calculated by CDK. If you prepare a tab-delimited text file containing the metabolite name (first column), the retention time in minutes (second column), and the SMILES code (third column), the predicted retention time is calculated and used when searching structure candidates.

Retention time setting for spectral searching
This function is for spectral searching. If your MSP or MAT files contain retention time or retention index information, you can enable RT or RI filtering with this checkbox.
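The three-column file for RT prediction can be produced with a few lines of code. This sketch writes name, retention time in minutes, and SMILES separated by tabs. Whether MS-FINDER expects a header row is not stated in the text, so none is written here (an assumption), and the retention times below are invented values.

```python
import csv
import io


def write_rt_table(rows, handle):
    """Write (name, retention time in min, SMILES) rows as tab-delimited text.

    Assumption: no header row is written.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for name, rt_min, smiles in rows:
        writer.writerow([name, f"{rt_min:.2f}", smiles])


# Retention times are made up for illustration.
standards = [
    ("Caffeine", 3.1, "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"),
    ("Benzoic acid", 4.25, "OC(=O)c1ccccc1"),
]

buffer = io.StringIO()
write_rt_table(standards, buffer)
print(buffer.getvalue(), end="")
```

To create a real file, pass a handle opened with open(path, "w", newline="") instead of the in-memory buffer.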


Data source tab

Local databases: currently, a total of 14 metabolome databases are bundled with the MS-FINDER program and stored in *.esd. The local databases selected by the user are used to retrieve the structure data. Please see the user defined structure database format section for searching your own structure candidates.

MINEs (Metabolic In silico Network Expansions) and PubChem online settings:
If "Never use it" is selected, the structure candidates are picked up only from the local databases.
If "Only use when there is no query in the below DBs" is selected, the structure data of the MINE and PubChem compound databases are retrieved only when no structures can be found in the local databases.
If "Always use it" is selected, the MS-FINDER program always retrieves the structure data from the MINE and PubChem databases in addition to the local databases.
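The three online settings amount to a simple decision rule, sketched below. The policy names are shorthand invented for this example and do not correspond to identifiers in MS-FINDER.

```python
def should_query_online(policy, local_hit_count):
    """Decide whether MINE/PubChem should be queried for one formula.

    policy is one of "never", "only_if_no_local_hit", or "always"
    (hypothetical names for the three GUI options).
    """
    if policy == "never":
        return False
    if policy == "always":
        return True
    if policy == "only_if_no_local_hit":
        return local_hit_count == 0
    raise ValueError(f"unknown policy: {policy!r}")


for policy in ("never", "only_if_no_local_hit", "always"):
    print(policy, should_query_online(policy, 0), should_query_online(policy, 5))
```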

Compound annotation by in silico fragmenter

The general workflow of MS-FINDER is described here. (Batch analysis is described in a later section.)

1. The formula prediction is executed by double-clicking the title name in the File navigator. The details of the formula calculation are described in our paper.
*The key points of formula prediction are to select the correct precursor type (adduct ion) and to set parameters that pick up the correct formulas.
A) The formula candidates checked in the Select column are the ones examined by the structure finder program.
B) The isotopic ions are displayed with the I button in the upper mass spectrum window.
C) The result of formula assignment for the product ions is displayed with the P button in the bottom spectrum window.
D) The annotation result for neutral losses is displayed with the N button in the bottom window.

2. The structure finder is executed by right-clicking the formula result table and then clicking Search the structure.
A) The formula candidates checked in the Select column are the ones examined by the structure finder program.
B) The total score of the structure finder is the sum of the formula score and the structure score. Therefore, even if one formula has a higher formula score than the others, a structure from another formula candidate may become the top candidate in the structure finder program.
C) The MS-FINDER program merges structures that have the same molecular skeleton according to the first 14 characters of their InChIKey. The representative structure is determined by the number of synonyms in the PubChem repository.
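The merging step in C) can be sketched as follows. Candidates are grouped by the first 14 characters of their InChIKey, and the member with the most synonyms is taken as the representative. The data layout, the function name, the InChIKeys, and the synonym counts are all illustrative assumptions, not MS-FINDER internals or real PubChem figures.

```python
from collections import defaultdict


def merge_by_skeleton(candidates):
    """Group candidates by short InChIKey and pick one representative each.

    Each candidate is a dict with "name", "inchikey", and "synonyms"
    (the number of synonyms). Returns {short_key: representative}.
    """
    groups = defaultdict(list)
    for candidate in candidates:
        short_key = candidate["inchikey"][:14]
        groups[short_key].append(candidate)
    return {
        short_key: max(members, key=lambda c: c["synonyms"])
        for short_key, members in groups.items()
    }


# Illustrative entries: A and B share the same first 14 characters.
candidates = [
    {"name": "candidate A", "inchikey": "WQZGKKKJIJFFOK-GASJEMHNSA-N", "synonyms": 120},
    {"name": "candidate B", "inchikey": "WQZGKKKJIJFFOK-SVZMEOIVSA-N", "synonyms": 45},
    {"name": "candidate C", "inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N", "synonyms": 300},
]

for short_key, representative in merge_by_skeleton(candidates).items():
    print(short_key, representative["name"])
```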

Compound annotation by searching spectral databases

It's very simple: double-click the unknown record that you want to annotate.

Figure: examples of searching the EI-MS spectral database and the LipidBlast database

Compound annotation (batch analysis)

1. Analysis -> Compound annotation (batch job)
A) If you want to run the batch job for both formula prediction and structure elucidation, check Both processes. Also, enter a number in the Top N hits textbox; this is the number of formula candidates generated by the formula finder program that will be searched.
B) If you want to run the batch job for formula prediction only, check just molecular formula finder.
C) If you want to run the structure finder program only, check just structure finder. Here, the formula candidates checked by the user are examined. If the formula finder has not been executed before this analysis, the queries are skipped.

Peak assignment (single)

The MS-FINDER program can be used as a peak assignment tool that assigns substructures to the peaks of an MS/MS spectrum from a user-defined structure.

1. Analysis -> Peak assignment (single)
2. Select the query file that you want to analyze.
3. Enter both the formula and the SMILES in the textboxes, in the neutralized form.
4. The result is then generated and displayed.

Peak assignment (batch job)

Analysis -> Peak assignment (batch job)

To use this function, make sure that each record in the MSP or MAT files contains its FORMULA and SMILES fields. Records that lack these fields are skipped. The SMILES and FORMULA values can also be added in the File information textboxes of the MS-FINDER GUI.

Mouse function

A) Mouse right click (or hold) and move: zoom in and out
B) Mouse left click (or hold) and move: select and scroll
C) Mouse left double click: reset range and select files in the file navigator
D) Mouse wheel: zoom in and out
E) Right click: popup context menu

Export

The results of the formula finder and structure finder can be exported from this option. Currently, the top 10 candidates are exported automatically. You can also inspect the detailed results as ASCII files: the MS-FINDER program generates an FGT file containing the formula results in the same directory as the project folder, together with a corresponding folder containing the SFD files (one per formula candidate), which store the results of the structure finder program.

Help

You can check the version of MS-FINDER.