I have a microbiome dataset that I am using to compare different groups of mice (symptomatic animals, asymptomatic animals and a control group). I've done some preliminary analyses with this dataset, but detected a large degree of variability among individual animals, which I fear hinders the identification of group-specific profiles.
To overcome this problem, I performed some comparisons among the different groups using their respective core microbiomes and got much better results.
To accomplish that, I used the script compute_core_microbiome.py, from QIIME 1.9.1, to generate an individual core microbiome for each group (with an 80% prevalence threshold). Next, I merged these core microbiomes into a single OTU table, using the command merge_otu_tables.py, and performed a beta diversity analysis (NMDS and PERMANOVA, using MicrobiomeAnalyst) to compare the three groups. This approach resulted in perfect separation of the animals into three groups, according to their respective conditions (symptomatic animals, asymptomatic animals and control).
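To make the filtering step concrete, here is a minimal pandas sketch of the same idea. It is not the QIIME code itself: the function names are hypothetical, "present" is assumed to mean a count above zero, and merging is assumed to fill OTUs missing from a group's core table with zeros.

```python
import pandas as pd

def core_otus(table, samples, min_prevalence=0.8):
    """Return the OTU IDs with a count > 0 in at least
    `min_prevalence` of the given samples (rows = OTUs, columns = samples)."""
    sub = table[list(samples)]
    prevalence = (sub > 0).mean(axis=1)  # fraction of samples containing each OTU
    return set(prevalence[prevalence >= min_prevalence].index)

def merged_core_table(table, groups, min_prevalence=0.8):
    """Filter each group's samples to that group's own core OTUs, then
    combine the per-group tables; OTUs outside a group's core become 0
    for that group's samples (assumed merge behaviour)."""
    parts = []
    for name, samples in groups.items():
        keep = sorted(core_otus(table, samples, min_prevalence))
        parts.append(table.loc[keep, list(samples)])
    return pd.concat(parts, axis=1).fillna(0)
```

Under these assumptions, an OTU that is present in a sample but did not reach the threshold in that sample's group ends up as zero in the merged table, so the group labels directly shape the table that is later tested for group differences.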
However, I feared that this separation might be artefactual, since the core microbiomes would naturally emphasize similarities within each group. To test for that, I distributed my animals into randomly balanced groups (each containing a similar number of animals from each of the original groups) and reran the entire analysis, as described above. This resulted in no clear group separation, which gave me some confidence that my original group separation was not artefactual (especially after I obtained the same null results with three different sets of randomly balanced groups).
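The random assignment described above could be sketched as follows. This is only an illustration under stated assumptions: the function name, the `sample_to_group` mapping and the round-robin dealing are hypothetical choices, not the procedure actually used.

```python
import random
from collections import defaultdict

def balanced_random_groups(sample_to_group, n_groups=3, seed=0):
    """Split samples into `n_groups` random groups so that each new group
    receives a similar number of samples from every original group."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample, group in sample_to_group.items():
        by_group[group].append(sample)

    new_groups = [[] for _ in range(n_groups)]
    for group in sorted(by_group):
        members = sorted(by_group[group])
        rng.shuffle(members)               # random order within the original group
        offset = rng.randrange(n_groups)   # vary which new group gets any leftovers
        for i, sample in enumerate(members):
            new_groups[(i + offset) % n_groups].append(sample)
    return new_groups
```

Calling it with different `seed` values gives different sets of balanced groups, each of which can be passed through the same core-microbiome and beta diversity steps as the real groups.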
Although I am cautiously confident in my results, I have not found any papers in the literature that employed this exact approach, which keeps me wondering whether there are alternative ways to conduct my analyses.
For example, I wonder if I should generate a single core microbiome for all animals before comparing the groups, but I fear that such an approach would introduce another type of bias, namely emphasizing the similarities among the groups instead of highlighting their unique features.
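For comparison, a single core computed over all animals could look like the sketch below. The function name is hypothetical, and it again assumes rows are OTUs, columns are samples, and presence means a count above zero.

```python
import pandas as pd

def pooled_core_table(table, min_prevalence=0.8):
    """Keep OTUs with a count > 0 in at least `min_prevalence` of ALL
    samples, ignoring group labels; the remaining counts are left unchanged."""
    prevalence = (table > 0).mean(axis=1)
    return table.loc[prevalence >= min_prevalence]
```

Here the filter never sees the group labels, so it cannot create group structure by itself, but OTUs that are common in only one group would be dropped at a high threshold.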
In summary, my biggest concerns at this moment are: (i) what is the validity of comparing groups using their individual core microbiomes? (ii) Would the use of randomly balanced groups be enough to rule out possible biases introduced by this approach? (iii) Is there a better alternative?