NMRProcFlow Macro-command Reference Guide

Similar documents
NMR TOOLS. Pre-processing. Manon Martin Marie Tremblay-Franco Cécile Canlet 31/05/2017 v 1.0.0

Package mqtl. May 25, 2018

Metabolomic Data Analysis with MetaboAnalyst

Improved Centroid Peak Detection and Mass Accuracy using a Novel, Fast Data Reconstruction Method

Package batman. June 19, 2016

Preprocessing: Smoothing and Derivatives

NMRProcFlow Installation Guide

SUPPLEMENTARY INFORMATION

Preprocessing, Management, and Analysis of Mass Spectrometry Proteomics Data

Package PepsNMRData. R topics documented: November 1, Type Package Title Datasets for the PepsNMR package Version 1.1.0

HIFI NMR : part1 automated backbone assignments using 3D->2D

Experiment 5: Exploring Resolution, Signal, and Noise using an FTIR CH3400: Instrumental Analysis, Plymouth State University, Fall 2013

UV WinLab Software The Lambda Series

Frequently Asked Questions (FAQ)

Goals of tutorial. Showcase NMRbox with NUS tools A dozen different NUS processing tools installed and configured more coming.

Data processing. Filters and normalisation. Mélanie Pétéra W4M Core Team 31/05/2017 v 1.0.0

Preprocessing Short Lecture Notes cse352. Professor Anita Wasilewska

Data Preprocessing. D.N. Rutledge, AgroParisTech

QuantWiz: A Parallel Software Package for LC-MS-based Label-free Protein Quantification

KJM Proton T1 Spectra on the AVI-600 and AVII-600. Version 1.0

Processing data with Bruker TopSpin

3-D Gradient Shimming

MALDIquant: Quantitative Analysis of Mass Spectrometry Data

Package batman. Installation and Testing

Goals of tutorial. Introduce NMRbox platform

Deliverable A comprehensive and standardised e-infrastructure for analysing medical metabolic phenotype data. 1st September 2015

A Bayesian approach for the alignment of high-resolution NMR spectra

Getting Your Peaks in Line: A Review of Alignment Methods for NMR Spectral Data

Analysis of Process and biological data using support vector machines

Filtering Images. Contents

NMR Spectroscopy with VnmrJ. University of Toronto, Department of Chemistry

Agilent MicroLab Quant Calibration Software: Measure Oil in Water using Method IP 426

Reference Manual MarkerView Software Reference Manual Revision: February, 2010

Proton dose calculation algorithms and configuration data

Mass Spec Data Post-Processing Software. ClinProTools. Wayne Xu, Ph.D. Supercomputing Institute Phone: Help:

CRL MASS SPECTROMETRY FACILITY USER MANUAL. A Guide to Oligonucleotide Analysis by MALDI TOF/TOF

WEINER FILTER AND SUB-BLOCK DECOMPOSITION BASED IMAGE RESTORATION FOR MEDICAL APPLICATIONS

Advance XCMS Data Processing. H. Paul Benton

Cluster Analysis. Ying Shen, SSE, Tongji University

NMRProcFlow Installation Guide

Statut actuel de NUS et APSY

CS 1675 Introduction to Machine Learning Lecture 18. Clustering. Clustering. Groups together similar instances in the data sample

CSImage Tutorial v August 2000

PROCESSING 2D SPECTRA USING VNMRJ JB Stothers NMR Facility Materials Science Addition 0216 Department of Chemistry Western University

Application of Clustering Techniques to Energy Data to Enhance Analysts Productivity

NMR Users Guide Organic Chemistry Laboratory

UNIT 2 Data Preprocessing

NIR Technologies Inc. FT-NIR Technical Note TN 005

Relationship between Fourier Space and Image Space. Academic Resource Center

QUANTAX EDS SYSTEM SOP

Rowland NMR Toolkit (RNMRTK) Overview

User Manual STATISTICS

NUS: non-uniform sampling

R-software multims-toolbox. (User Guide)

From FID to 2D: Processing HSQC Data Using NMRPipe Macro Methods

Analyzing Images Containing Multiple Sparse Patterns with Neural Networks

An Algorithm to Determine the Chromaticity Under Non-uniform Illuminant

MS data processing. Filtering and correcting data. W4M Core Team. 22/09/2015 v 1.0.0

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

c. Saturday and Sunday: 10 minute time blocks and unlimited time.

Fourier Transformation Methods in the Field of Gamma Spectrometry

TopSpin is the successor to a program called XWinNMR, which appeared in the mid- 90s (I believe) and was used to run Bruker spectrometers until it

EECS 556 Image Processing W 09. Image enhancement. Smoothing and noise removal Sharpening filters

Project Report on. De novo Peptide Sequencing. Course: Math 574 Gaurav Kulkarni Washington State University

Instruction manual for T3DS calculator software. Analyzer for terahertz spectra and imaging data. Release 2.4

Improvement of SURF Feature Image Registration Algorithm Based on Cluster Analysis

Agilent G6854 MassHunter Personal Pesticide Database

Package ASICS. January 23, 2018

An Accurate Method for Skew Determination in Document Images

OTO Photonics. Sidewinder TM Series Datasheet. Description

Diffusion Ordered Spectroscopy (DOSY) Overview BRUKER last edit 5/14/12

Machine Learning. Unsupervised Learning. Manfred Huber

Online algorithms for clustering problems

COMP 465: Data Mining Still More on Clustering

Coarse-to-fine image registration

The latest trend of hybrid instrumentation

Package ALS. August 3, 2015

Basic 1D Processing. Opening Saved Data. Start the VnmrJ software. Click File > Open.

Applications. Foreground / background segmentation Finding skin-colored regions. Finding the moving objects. Intelligent scissors

OtO Photonics. Sidewinder TM Series Datasheet. Description

WAVELET USE FOR IMAGE RESTORATION

Unsupervised learning in Vision

Package Metab. September 18, 2018

XRDUG Seminar III Edward Laitila 3/1/2009

Real converters, parsers & validators for NMR-ML. Standards Development. WP leader: Steffen Neumann IPB

2D Image Processing Feature Descriptors

MRMPROBS tutorial. Hiroshi Tsugawa RIKEN Center for Sustainable Resource Science

Multivariate Calibration Quick Guide

Annotation of LC-MS metabolomics datasets by the metams package

Cluster Analysis (b) Lijun Zhang

Computer Vision I. Announcements. Fourier Tansform. Efficient Implementation. Edge and Corner Detection. CSE252A Lecture 13.

Data Preprocessing. Slides by: Shree Jaswal

MestReC Cheat Sheet. by Monika Ivancic, July 1 st 2005

MassHunter Personal Compound Database and Library Manager for Forensic Toxicology

Norbert Schuff VA Medical Center and UCSF

DIGITAL IMAGE ANALYSIS. Image Classification: Object-based Classification

Automated Baseline Estimation for Analytical Signals

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

NMR INSTRUMENT INSTRUCTIONS: Safety and Sample Preparation

Methodology for spot quality evaluation

Transcription:

NMRProcFlow Macro-command Reference Guide This document is the reference guide of the macro-commands Daniel Jacob UMR 1332 BFP, Metabolomics Facility CGFB Bordeaux, MetaboHUB - 2018 1

NMRProcFlow - Macro-command Reference Guide Content Pre-processing... 3 airpls... 5 align... 6 bucket... 7 clupa... 9 gbaseline... 10 normalisation... 11 shift... 13 zero... 14 2

Pre-processing The spectral preprocessing for 1D NMR should be applied in case the input raw data are FID, either from a Bruker or Varian spectrometer. The term pre-processing designates here the transformation of the NMR spectrum from time domain to frequency domain, including the phase correction and the fast Fourier-transform (FFT). #%% Vendor=string; Type=string; LB=numeric; ZF=0 2 4; PHC1=TRUE FALSE; ZNEG=TRUE FALSE; TSP=TRUE FALSE; #%% This character sequence plays the role of a shebang at the beginning of the preprocessing command line. The different parameters must be written in the same line, and all of them based on the same model variable=value, and each of them separated by a semicolon. Vendor Type LB ZF PHC1 ZNEG TSP Vendor or nmrml - value must be taken in the set: bruker, varian, nmrml. Type of the raw data - value must be taken in the set: fid, 1r. In case of the type is 1r, the preprocessing is not applied, and the others parameters are ignored. Line Broadening Typical value is 0.3 (only if type=fid) Zero Filling factor (i.e the factor of multiplication of the FID) indicates if the first order phase correction id applied Zeroing Negative Values indicates whether a 'zero reference signal' is present or not (i.e. TSP, TMS, DSS,...) 3

Vendor=nmrml; Type=fid; LB=0.3; ZF=4; PHC1=TRUE; ZNEG=FALSE; TSP=TRUE; References James Keeler (2011) Understanding NMR Spectroscopy, 2nd Edition, Ed John Wiley & Sons Ltd, ISBN: 978-0-470-74608-0 4

airpls The airpls (adaptive iteratively reweighted penalized least squares) algorithm based on [Zhang et al., 2010] is a baseline correction algorithm that works completely on its own, and that only requires a detail parameter for the algorithm, called Lambda. Because this Lambda parameter can vary within a very large range (from 10 up to 1.e+06), we converted this parameter within a more convenient scale for the user, called 'level correction factor' chosen by the user from '1' (soft) up to '6' (high). The lower this level correction factor is set, the smoother baseline will be. Conversely, the higher this level correction factor is set, the more baseline will be corrected in details. airpls <ppm range> <lambda> <ppm range> <lambda> The ppm zone to apply the baseline correction. The detail parameter the true detail parameter used in the algorithm is rescaled according to the formula : 10^(6-lambda) airpls 0.674 1.601 3 References Zhang Z, Chen S, and Liang Y-Z (2010) Baseline correction using adaptive iteratively reweighted penalized least squares, Analyst, 2010,135, 1138-1146. doi:10.1039/b922045c 5

align A peak alignment algorithm based on Least-squares. Align spectra either based on a particular spectrum chosen within the spectra set, or based on the average spectrum. In this latter case, the re-alignment procedure is executed three times, the average spectrum being recalculated at each time. In order to limit the relative ppm shift between the spectra to be realigned and the reference spectrum, the user can set this limit by adjusting the parameter 'Relative shift max. that sets the maximum shift between spectra and the reference. The range goes from 0 (no ppm shift allowed) up to 1 (maximum ppm shift equal to 100% of the selected ppm range). align <ppm range> < relative shift max> <groupid> <ppm range> < relative shift max > <groupid> The ppm zone to apply the baseline correction. sets the maximum shift between spectra and the reference The group id of the spectra to be aligned. In batch mode, only the 0 value is considered (i.e all values) align 4.241 4.687 0.11 0 6

bucket An NMR spectrum may contain several thousands of points, and therefore of variables. In order to reduce the data dimensionality binning or bucketing is commonly used. In binning, the spectra are divided into bins (so called buckets) and the total area (sum of each resonance intensity) within each bin is calculated to represent the original spectrum. The more simple approach (Uniform bucketing) consists in dividing all spectra with uniform spectral width (typically 0.01 to 0.04 ppm). Due to the arbitrary division of spectrum, one bin may contain pieces from two or more resonances and one resonance may appear in two bins which may affect the data analysis. We have chosen to implement the Adaptive, Intelligent Binning method [De Meyer et al. 2008] that attempts to split the spectra so that each area common to all spectra contains the same single resonance, i.e. belonging to the same metabolite (Intelligent bucketing). In such methods, the width of each bucket is then determined by the maximum difference of chemical shifts among all spectra. Finally, the bucketing type called Variable-Size Bucketing allows to simply provide a list of static bucket zones without any parameter. Or bucket <binning type> <ppm noise> <resolution> <snr> <ppm range 1> <ppm rang 2> EOF bucket vsb <ppm range 1> <ppm rang 2> EOF <binning type> The binning method can be either unif for uniform bucketing, aibin for intelligent bucketing, or vsb for variable-size bucketing 7

<ppm noise> <resolution> <snr> <ppm range n> Only for 'aibin' & 'unif': The ppm range considered as noisy zone. Indeed the algorithm needs to estimate the noise level to be more efficient. Only for 'aibin' & 'unif': The resolution parameter: typically 0.01 to 0.04 ppm for the uniform bucketing, and from 0.1 to 0.6 for the intelligent bucketing. Only for 'aibin' & 'unif': The Signal-Noise Ratio defines the minimal intensity levels to be taken into account for the binning the ppm zones to apply the binning bucket aibin 10.2 10.5 0.5 5 0.6 4.72 4.865 5.5 6.3 9.5 EOL bucket vsb 5.431 5.406 5.249 5.225 4.669 4.631 EOL References De Meyer, T., Sinnaeve, D., Van Gasse, B., Tsiporkova, E., Rietzschel, E. R., De Buyzere, M. L., et al. (2008). NMR-based characterization of metabolic alterations in hypertension using an adaptive, intelligent binning algorithm. Analytical Chemistry, 80(10), 3783 3790.doi: 10.1021/ac7025964 8

clupa A peak alignment algorithm, called hierarchical Cluster-based Peak Alignment (CluPA), aligns a target spectrum to the reference spectrum in a top-down fashion by building a hierarchical cluster tree from peak lists of reference and target spectra and then dividing the spectra into smaller segments based on the most distant clusters of the tree. clupa <ppm noise> <ppm range> <resolution> <snr> <groupid> <ppm noise> <ppm range> <resolution> <snr> <groupid> The ppm range considered as noisy zone. Indeed the algorithm needs to estimate the noise level to be more efficient. The ppm zone to apply the baseline correction. The resolution parameter given a ppm interval. It defines the smallest ppm interval to be aligned. The Signal-Noise Ratio defines the minimal intensity levels to be taken into account during the peak finding step The group id of the spectra to be aligned. In batch mode, only the 0 value is considered (i.e all values) clupa 10.2 10.5 4.241 4.687 0.01 5 0 References Vu T.N., Valkenborg D., Smets K., Verwaest K.A., Dommisse R., Lemière F., Verschoren A., Goethals B., Laukens K.( 2011) An integrated workflow for robust alignment and simplified quantitative analysis of NMR spectrometry data. BMC Bioinformatics.;12:405 doi: 10.1186/1471-2105-12-405 9

gbaseline The global baseline correction was based on [Bao et al., 2012], but only two phases were implemented: i) Continuous Wavelet Transform (CWT), and ii) the sliding window algorithm. The user must choose the correction level, from 'soft' up to 'high'. gbaseline <ppm noise> <ppm range> <window size> <smoothing size> <ppm noise> <ppm range> <window size> <smoothing size> The ppm range considered as noisy zone. Indeed the algorithm needs to estimate the noise level to be more efficient. The ppm zone to apply the baseline correction. The width (number points) of the convolution window used for the baseline estimation. The width (number points) of the convolution window used for the smoothed spectrum calculation gbaseline 10.2 10.5-0.5 11 75 50 References Bao, Q., Feng, J., Chen, F., Mao, W., Liu, Z., Liu, K., et al. (2012).A new automatic baseline correction method based on iterative method. Journal of Magnetic Resonance, 218, 35 43. doi: 10.1016/j.jmr.2012.03.010. 10

normalisation In order to make all spectra comparable with each other, the variations of the overall concentrations of samples have to be taken into account. We propose two normalization methods. In NMR metabolomics, the total intensity normalization (called the Constant Sum Normalization) is often used so that all spectra correspond to the same overall concentration. It simply consists in normalizing the total intensity of each individual spectrum to a same value. The Probabilistic Quotient Normalization [Dieterle et al. 2006] assume that biologically interesting concentration changes influence only parts of the NMR spectrum, while dilution effects will affect all metabolite signals. Probabilistic Quotient Normalization (PQN) starts by the calculation of a reference spectrum based on the median spectrum. Next, for each variable of interest the quotient of a given test spectrum and reference spectrum is calculated and the median of all quotients is estimated. Finally, all variables of the test spectrum are divided by the median quotient. normalisation <method> <ppm range 1> <ppm rang 2> EOF <method> <ppm range n> The normalization method can be either CSN for Constant Sum Normalization or PQN for Probabilistic Quotient Normalization. the ppm zones to apply the normalization 11

normalisation PQN 5.089 9.547 0.651 4.68 EOL References Dieterle F., Ross A., Schlotterbeck G. and Senn H. (2006). Probabilistic Quotient Normalization as Robust Method to Account for Dilution of Complex Biological Mixtures. Application in 1H NMR Metabonomics. Analytical Chemistry, 78:4281-4290.doi:10.1021/ac051632c Kohl SM, Klein MS, Hochrein J, Oefner PJ, Spang R, Gronwald W. (2012) State-of-the art data normalization methods improve NMR-based metabolomic analysis, Metabolomics 146-160, doi: 10.1007/s11306-011- 0350-z 12

shift To effectively align certain delicate ppm zones, it is possible to shift beforehand these zones on one or more spectra (corresponding to the same factor level) in order to align afterwards these ppm zones including all spectra. shift <ppm range> < ppm shift value> <groupid> <ppm range> <ppm shift> <groupid> The ppm zone to apply the shift. The ppm shift value applied on the ppm zone The group id of the spectra to be aligned. In batch mode, only the 0 value is considered (i.e all values) shift 4.241 4.687 0.01 2 3 13

zero Put to zero all the ppm ranges given within a list zero <ppm range 1> <ppm rang 2> EOF <ppm range n> The ppm zones to apply the zeroing. zero 4.705 4.881 5.484 6.333 EOL 14