Metabolic Profiling

Metabolic profiling offers a powerful window into the dynamic interaction between an organism’s genetic makeup and environmental influences, by assaying the metabolic content of biofluids (Nicholson et al [1]). By measuring levels of the products of metabolism, metabolic profiling can capture a real-time picture of an organism’s metabolic state, including both genetic factors and non-genetic factors such as environmental, nutritional, and behavioural influences (Holmes et al [2]).

The two most common analytical technologies for metabolic profiling are Nuclear Magnetic Resonance (NMR) spectroscopy (Dona et al [3]), typically directed at the proton spectrum, and hyphenated mass spectrometry (MS), which combines a chromatographic separation with mass-spectrometric detection (Lewis et al [4]).

NMR spectroscopy provides a highly precise, non-destructive analytical technique, but is hampered by a comparatively low sensitivity. Applied to biofluids for the purpose of profiling, NMR typically yields data as a one-dimensional spectrum, which may be analysed in an untargeted profiling fashion, or further processed to extract lists of quantified compounds from the spectrum and thus treat the data in a targeted manner (Hao et al [5], Ravanbakhsh et al [6]).

Mass spectrometry offers a highly sensitive tool for measuring compounds in biofluids, but its ability to distinguish compounds by mass alone is limited, as many compounds commonly observed in biofluids share the same chemical formula and therefore the same molecular weight. This is typically addressed by coupling MS detection with chromatography (for example, liquid chromatography, LC-MS) in order to further separate molecules by their chromatographic affinity. Such hyphenated methods result in a two-dimensional dataset for each sample analysed (mass-to-charge ratio, m/z, vs. chromatographic affinity, typically expressed as retention time).

Owing to the complexity and volume of LC-MS data, a preliminary feature detection step is usually applied, which reduces the two-dimensional raw analytical data to a one-dimensional list of detected features, each characterised by its abundance, observed m/z, and retention time. This process may be conducted either in a targeted manner, where the peaks to be integrated are defined in advance, or in an untargeted profiling approach, in which all peak-like features detectable in the data are integrated. There is a wide range of peak-detection algorithms (Spicer et al [7]), but all are susceptible to spuriously detecting analytical noise as features, and thus require a de-noising stage to produce a high-quality final dataset.
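As a rough illustration of the reduction described above, the following sketch turns a small two-dimensional intensity grid (retention time by m/z) into a flat list of features. It is a toy example under stated assumptions, not the algorithm of any particular peak-picking tool: the names `Feature` and `detect_features`, the `noise_floor` threshold, and the data values are all hypothetical, and abundance is taken as the apex height rather than an integrated peak area.

```python
from typing import List, NamedTuple, Sequence


class Feature(NamedTuple):
    """One detected feature (hypothetical structure for illustration)."""
    mz: float
    retention_time: float
    abundance: float


def detect_features(
    intensity: Sequence[Sequence[float]],
    mz_axis: Sequence[float],
    rt_axis: Sequence[float],
    noise_floor: float,
) -> List[Feature]:
    """Return one Feature per strict local maximum above noise_floor.

    intensity[i][j] is the signal at rt_axis[i] and mz_axis[j].
    """
    n_rt, n_mz = len(rt_axis), len(mz_axis)
    features = []
    for i in range(n_rt):
        for j in range(n_mz):
            value = intensity[i][j]
            if value <= noise_floor:
                continue  # crude de-noising: ignore anything at or below the floor
            # Collect the (up to eight) surrounding grid points.
            neighbours = [
                intensity[a][b]
                for a in range(max(i - 1, 0), min(i + 2, n_rt))
                for b in range(max(j - 1, 0), min(j + 2, n_mz))
                if (a, b) != (i, j)
            ]
            if all(value > other for other in neighbours):
                features.append(Feature(mz_axis[j], rt_axis[i], value))
    return features


# Made-up data: 4 retention times x 3 m/z channels.
rt_axis = [10.0, 10.5, 11.0, 11.5]
mz_axis = [180.06, 180.07, 255.23]
intensity = [
    [5.0, 8.0, 2.0],
    [40.0, 900.0, 3.0],
    [10.0, 60.0, 4.0],
    [3.0, 6.0, 700.0],
]

features = detect_features(intensity, mz_axis, rt_axis, noise_floor=50.0)
for feature in features:
    print(feature)
```

The point at intensity 60.0 exceeds the noise floor but sits on the shoulder of the larger peak, so it is not reported as a separate feature. Real peak-detection algorithms are considerably more elaborate, typically modelling peak shape and integrating area.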

Both analytical platforms mentioned above are subject to analytical biases and variances in the precision and accuracy of measurements, and these must be accounted for by the inclusion of quality control (QC) measures, in the form of stand-alone QC samples and reference compounds that may be doped into the samples. A well-calibrated NMR instrument is expected to have excellent precision, and study-specific QC measures are typically limited to the doping of a chemical shift reference and the repeated analysis of a reference sample. Owing to the complex interactions between sample and instrument, however, LC-MS assays typically show lower measurement precision than NMR, and often exhibit longitudinal signal drifts over the course of an analysis, which must be corrected in order to obtain an accurate representation of the true levels in each sample.
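To make the role of repeated QC samples concrete, the sketch below estimates the precision of a single feature as the relative standard deviation (RSD) across QC injections, and then removes a run-order drift using a straight-line fit to those QC injections. This is a minimal sketch under assumptions: the linear drift model is a simplification chosen for clarity (practical pipelines often use more flexible smoothers), and the function names and data values are hypothetical rather than taken from any toolbox.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def rsd_percent(values: Sequence[float]) -> float:
    """Relative standard deviation, as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of y against x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return slope, y_bar - slope * x_bar


def correct_linear_drift(
    run_order: Sequence[float],
    intensity: Sequence[float],
    is_qc: Sequence[bool],
) -> List[float]:
    """Rescale every sample so the QC trend line becomes flat.

    The trend is fitted on QC injections only, then applied to all samples.
    """
    qc_x = [x for x, qc in zip(run_order, is_qc) if qc]
    qc_y = [y for y, qc in zip(intensity, is_qc) if qc]
    slope, intercept = fit_line(qc_x, qc_y)
    target = mean(qc_y)  # level to which the fitted trend is normalised
    return [
        y * target / (slope * x + intercept)
        for x, y in zip(run_order, intensity)
    ]


# Made-up data for one feature: QC injections at run positions 1, 5 and 9,
# with the signal falling steadily over the run.
run_order = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
is_qc = [True, False, False, False, True, False, False, False, True]
intensity = [1000.0, 640.0, 1210.0, 870.0, 900.0, 560.0, 1050.0, 760.0, 800.0]

corrected = correct_linear_drift(run_order, intensity, is_qc)

qc_before = [y for y, qc in zip(intensity, is_qc) if qc]
qc_after = [y for y, qc in zip(corrected, is_qc) if qc]
rsd_before = rsd_percent(qc_before)
rsd_after = rsd_percent(qc_after)
print(f"QC RSD before correction: {rsd_before:.1f}%")
print(f"QC RSD after correction:  {rsd_after:.1f}%")
```

Because the QC sample has the same composition at every injection, any trend in its measured intensity can be attributed to the instrument rather than to biology, which is what justifies applying the fitted correction to the study samples as well.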

As alluded to above, regardless of the analytical platform used to generate measurements, metabolic profiling assays can be broadly separated into two classes: targeted and untargeted profiling analysis. In targeted analyses, the list of compounds to be detected is usually defined up front, and the measurements are frequently given as absolute quantifications. The data pre-processing strategy is also targeted, as it focuses on the extraction and integration of an expected set of signals. Conversely, in an untargeted profiling analysis, the set of compounds measured is expanded to capture as many compounds as possible. Although this approach in theory provides a more complete window into metabolism, the chemical identity of the great majority of detected compounds in the assay will be unknown and, owing to challenges in feature detection, the measurement precision for some features may be lower than in a targeted assay.

Targeted and profiling assays each carry their own trade-offs in terms of measurement precision, metabolite annotation, and quality control strategy. Protocols for the conduct of targeted analyses are well established (see the section on targeted analysis). Quality control in profiling studies has typically been conducted on an ad hoc basis for individual studies, although in recent years there has been an increasing push towards the systematisation and automation of pre-processing toolkits (Giacomoni et al [8], Rijswijk et al [9]). The nPYc-Toolbox, presented here, is intended to provide a platform for quality control of metabolic profiling datasets, embodying the quality control practices championed by the MRC-NIHR National Phenome Centre, and focusing on the interpretability of the output for both the analysts who generate the data and the final users who will perform statistical analysis.

[1] Jeremy K Nicholson, John Connelly, John C Lindon and Elaine Holmes. Metabonomics: a platform for studying drug toxicity and gene function. Nature Reviews Drug Discovery, 1(2):153-61, 2002.
[2] Elaine Holmes, Ruey Leng Loo, Jeremiah Stamler, Magda Bictash, Ivan KS Yap, Queenie Chan, Timothy MD Ebbels, Maria De Iorio, Ian J Brown, Kirill A Veselkov, Martha L Daviglus, Hugo Kesteloot, Hirotsugu Ueshima, Liancheng Zhao, Jeremy K Nicholson and Paul Elliott. Human metabolic phenotype diversity and its association with diet and blood pressure. Nature, 453(7193):396-400, 2008.
[3] Anthony C Dona, Beatriz Jiménez, Hartmut Schäfer, Eberhard Humpfer, Manfred Spraul, Matthew R Lewis, Jake TM Pearce, Elaine Holmes, John C Lindon and Jeremy K Nicholson. Precision High-Throughput Proton NMR Spectroscopy of Human Urine, Serum, and Plasma for Large-Scale Metabolic Phenotyping. Analytical Chemistry, 86(19):9887-9894, 2014.
[4] Matthew R Lewis, Jake TM Pearce, Konstantina Spagou, Martin Green, Anthony C Dona, Ada HY Yuen, Mark David, David J Berry, Katie Chappell, Verena Horneffer-van der Sluis, Rachel Shaw, Simon Lovestone, Paul Elliott, John Shockcor, John C Lindon, Olivier Cloarec, Zoltan Takats, Elaine Holmes and Jeremy K Nicholson. Development and Application of Ultra-Performance Liquid Chromatography-TOF MS for Precision Large Scale Urinary Metabolic Phenotyping. Analytical Chemistry, 88(18):9004-9013, 2016.
[5] Jie Hao, Manuel Liebeke, William Astle, Maria De Iorio, Jacob G Bundy and Timothy MD Ebbels. Bayesian deconvolution and quantification of metabolites in complex 1D NMR spectra using BATMAN. Nature Protocols, 9(6):1416–1427, 2014.
[6] Siamak Ravanbakhsh, Philip Liu, Trent C Bjorndahl, Rupasri Mandal, Jason R Grant, Michael Wilson, Roman Eisner, Igor Sinelnikov, Xiaoyu Hu, Claudio Luchinat, Russell Greiner and David S Wishart. Accurate, Fully-Automated NMR Spectral Profiling for Metabolomics. PLOS ONE, 10(5):1-15, 2015.
[7] Rachel Spicer, Reza M Salek, Pablo Moreno, Daniel Cañueto and Christoph Steinbeck. Navigating freely-available software tools for metabolomics analysis. Metabolomics, 13(9):106, 2017.
[8] Franck Giacomoni, Gildas Le Corguillé, Misharl Monsoor, Marion Landi, Pierre Pericard, Mélanie Pétéra, Christophe Duperier, Marie Tremblay-Franco, Jean-François Martin, Daniel Jacob, Sophie Goulitquer, Etienne A Thévenot and Christophe Caron. Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics. Bioinformatics, 31(9):1493–1495, 2015.
[9] Merlijn van Rijswijk, Charlie Beirnaert, Christophe Caron, Marta Cascante, Victoria Dominguez, Warwick B Dunn, Timothy MD Ebbels, Franck Giacomoni, Alejandra Gonzalez-Beltran, Thomas Hankemeier, Kenneth Haug, Jose L Izquierdo-Garcia, Rafael C Jimenez, Fabien Jourdan, Namrata Kale, Maria I Klapa, Oliver Kohlbacher, Kairi Koort, Kim Kultima, Gildas Le Corguillé, Pablo Moreno, Nicholas K Moschonas, Steffen Neumann, Claire O’Donovan, Martin Reczko, Philippe Rocca-Serra, Antonio Rosato, Reza M Salek, Susanna-Assunta Sansone, Venkata Satagopam, Daniel Schober, Ruth Shimmo, Rachel A Spicer, Ola Spjuth, Etienne A Thévenot, Mark R Viant, Ralf JM Weber, Egon L Willighagen, Gianluigi Zanetti and Christoph Steinbeck. The future of metabolomics in ELIXIR [version 2; peer review: 3 approved]. F1000Research, 6(ELIXIR):1649, 2017.