Pointers for QC and analysis of metabolomics UPLC-MS/MS data

07 March 2016

I have a dataset generated via Ultrahigh Performance Liquid Chromatography-Tandem Mass Spectrometry (UPLC-MS/MS).

It is a table of integer readings for 700 samples by 1300 metabolites.

It looks like count data (when plotting histograms).

However, when I plot mean against variance for the 1300 metabolites (a 1300-point scatterplot), I find that the mean does not equal the variance, as it would for Poisson-distributed counts.

In fact, mean = standard deviation is a good fit, which suggests that each metabolite follows a gamma distribution.
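One way to check the mean/SD relationship described above is to regress log(SD) on log(mean) across metabolites. A slope near 1 means the SD is proportional to the mean (constant coefficient of variation), whereas Poisson-like counts would give a slope near 0.5. For a gamma distribution the CV equals 1/sqrt(shape), so mean = SD would correspond to a shape of about 1. The sketch below is only illustrative: the data are simulated, and the function name `loglog_slope` is a made-up helper, not part of any metabolomics package.

```python
import math
import random
import statistics

def loglog_slope(means, sds):
    """Least-squares slope of log(SD) on log(mean) across metabolites."""
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in sds]
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

random.seed(0)
n_samples = 200

# Simulated metabolites (assumption, not real data): gamma with shape 1,
# so each metabolite has SD equal to its mean.
target_means = [10 ** random.uniform(2, 6) for _ in range(50)]
columns = [
    [random.gammavariate(1.0, mu) for _ in range(n_samples)]
    for mu in target_means
]

means = [statistics.fmean(col) for col in columns]
sds = [statistics.stdev(col) for col in columns]

slope = loglog_slope(means, sds)
median_cv = statistics.median(s / m for s, m in zip(sds, means))
print(f"log-log slope = {slope:.2f}, median CV = {median_cv:.2f}")
```

On real data, replace `columns` with one list of readings per metabolite. A constant CV is also what a log-normal distribution would produce, so this check alone does not single out the gamma.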

That's as far as I've got.

Does anyone know of any good reviews explaining UPLC-MS/MS data? How to QC it? R libraries?

Ultimately, how can I use the metabolites as predictors of a phenotype?

Any help would be appreciated as I'm a total beginner with this type of data.

Biswapriya Biswavas Misra

Hi Desmond,

The majority of the freely available tools, software, and databases from 2014-2015 for your analyses were recently summarized/reviewed by us in "Updates in metabolomics tools and resources: 2014–2015" (http://onlinelibrary.wiley.com/doi/10.1002/elps.201500417/abstract); please check out Table 1 and find your tools! :)

Self-promotion apart :) some good UPLC-MS/MS data sets have been published by Professor Lloyd Sumner (https://scholar.google.com/citations?user=uaN9_BUAAAAJ&hl=en), and his group's publications can be a good starting point to see how QC was done.

I am surprised by the number of metabolites, 1300. Are they really identified, or are they just m/z values? If they are assumed to be 'identified', then a lot may be "junk", and I or anyone working with biological mass spectrometry/metabolomics would be able to 'weed them out'. Happy to collaborate. Also, counts need to be normalized, scaled, log transformed (or treated with other methods) and so on to see the real effects rather than raw counts!
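As a rough illustration of the "normalize, scale, log transform" step mentioned above, here is a minimal Python sketch of one common combination: total-signal normalization per sample, a log2 transform, then autoscaling per metabolite. The choice of methods, the pseudocount, and the function name `normalize_log_scale` are assumptions for illustration; other normalizations (internal standards, pooled QC, Pareto scaling) may suit a given data set better.

```python
import math
import statistics

def normalize_log_scale(table, pseudocount=1.0):
    """table: list of samples (rows), each a list of metabolite intensities.

    1. Divide each sample by its total signal, rescaled to the mean total.
    2. Log2-transform with a pseudocount to avoid log(0).
    3. Autoscale each metabolite to mean 0 and SD 1.
    Assumes no metabolite is constant across samples (SD would be 0).
    """
    totals = [sum(row) for row in table]
    mean_total = statistics.fmean(totals)
    normed = [[v / t * mean_total for v in row] for row, t in zip(table, totals)]
    logged = [[math.log2(v + pseudocount) for v in row] for row in normed]

    metabolite_columns = list(zip(*logged))
    col_mean = [statistics.fmean(c) for c in metabolite_columns]
    col_sd = [statistics.stdev(c) for c in metabolite_columns]
    return [
        [(v - m) / s for v, m, s in zip(row, col_mean, col_sd)]
        for row in logged
    ]

# Toy data (made up): 4 samples x 3 metabolites
raw = [
    [1200, 30500, 410],
    [2500, 61000, 700],
    [900, 28000, 520],
    [1800, 45000, 300],
]
scaled = normalize_log_scale(raw)
```

After this step every metabolite has mean 0 and SD 1 across samples, so metabolites with very different raw intensities contribute comparably to PCA or regression.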

QC-wise: (i) you can check for batch variation, for example using the 'limma' package in R, or normalyzeR, or LOESS-N normalization, and so on; (ii) normalization based on internal standards, pooled-QC runs, or even blank runs would also tell you about the distribution of the data/values; (iii) look at TIC values and their distribution, and so on. Both pre-processing and statistical tools will let you know the quality of the runs.
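To make point (ii) concrete, a simple check on pooled-QC injections is the relative standard deviation (RSD) of each metabolite across those injections: metabolites that vary a lot in what is nominally the same sample are measured unreliably. The sketch below assumes the pooled-QC runs are available as a separate table; the 30% cutoff and the function names are illustrative assumptions, not values taken from this thread.

```python
import statistics

def qc_rsd(qc_table):
    """RSD (%) of each metabolite across pooled-QC injections (rows)."""
    rsds = []
    for column in zip(*qc_table):
        mean = statistics.fmean(column)
        if mean > 0:
            rsds.append(100.0 * statistics.stdev(column) / mean)
        else:
            rsds.append(float("inf"))  # never detected in QC: treat as unstable
    return rsds

def keep_stable(qc_table, max_rsd=30.0):
    """Indices of metabolites whose QC RSD is at or below max_rsd (assumed cutoff)."""
    return [i for i, r in enumerate(qc_rsd(qc_table)) if r <= max_rsd]

# Toy pooled-QC data (made up): 5 injections x 3 metabolites
qc = [
    [1000, 500, 20],
    [1050, 480, 90],
    [980, 510, 5],
    [1020, 495, 60],
    [990, 505, 150],
]
kept = keep_stable(qc)
print("metabolites kept:", kept)
```

Here the third metabolite fluctuates far more than the other two across identical QC injections, so it is the one dropped before any downstream statistics.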

I would suggest running the data through a simple web server such as MetaboAnalyst (http://www.metaboanalyst.ca/), Workflow4Metabolomics (http://workflow4metabolomics.org/), or XCMS Online (https://xcmsonline.scripps.edu/landing_page.php?pgcontent=mainPage) to see the data quality, trends, basic stats, and so on before starting with R and playing with options one by one. Of course, Bioconductor/R, Python, or even MATLAB or Galaxy would lead you into a big and different world of metabolomics data analysis.

In our review you can go to the 'holistic tools' section to find many of these push-button software suites to get started and reduce the data: removing "obvious" outliers (say, those which do not fit in PCA at all!) with multivariate analysis tools, metabolites which do not show significant changes, and IDs which do not make sense (MB Role ID conversion, or say CTS:

Best wishes.

Hope it helps.

Thanks,

Biswa
