I have 16S data for comparing the microbial communities of two types of hot springs, and I've been working with it in R. Some of the data is from 2017-2019, but the bulk of it is from last year. There are 18 samples from 11 individual hot springs of type 1, and 31 samples from 6 individual hot springs of type 2. 24 of the 31 type 2 samples are triplicates from 8 sampling sites. No other sample has a replicate, so the data is very skewed. I tried a PCoA to see whether the two types of hot springs cluster distinctly from each other. They seemed to, but the percentages on the two axes (variance explained) were only about 5% and 8%. I tried a CAP plot and the result was similar. Is there a better way to visualize clusters? Should I be transforming the data (log2 transformation or relative abundance transformation) before making ordination plots? There are definitely differences between the microbial communities of the two types; I can see them in the bar plots of which organisms are present. The data was transformed after I did the ordination and before I made the bar plots.

My PI says that I need to account for the n=3 samples at the 8 sites that were sampled in triplicate, as they are skewing the data. I suggested using just the first sample from each site, randomly subsampling, or merging each triplicate into one sample, but all three suggestions were vetoed. What other method can I use to account for this?
