The background: I do research on stomach contents and have a dataset with many stomachs as samples (rows) and the abundance of several prey categories in each stomach (columns). I can group my data by different factors (e.g. year, season, size class), for example to test for differences in diet composition between years. I am using the R 'vegan' package.

My question: When I run e.g. a PERMANOVA (in fact the adonis2 function from vegan) on the raw data, i.e. several thousand stomachs as individual samples, I get highly significant results but also low R2 values, as the large number of residuals 'spoils' the model. When I summarise the data and THEN perform the multivariate statistics, I get lower significance but higher R2 values, which is desirable (as they describe how much each factor contributes to the model). The problem here is that sometimes I have only 1 degree of freedom (e.g. when comparing only two years with each other), and then the statistic doesn't work at all.
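To make the two data layouts concrete, here is a minimal sketch of a one-way PERMANOVA on Bray-Curtis dissimilarities, run once on individual stomachs and once on group means. It is written from scratch in plain Python for illustration only; it is not vegan's adonis2 code and will not reproduce its output. The data are simulated, and the "haul" grouping used for summarising is a hypothetical sampling unit that the question does not specify.

```python
import random

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / total if total else 0.0

def permanova(samples, groups, n_perm=199, seed=1):
    """One-way PERMANOVA sketch; returns (pseudo_F, R2, p_value)."""
    n, k = len(samples), len(set(groups))
    if k < 2 or n - k < 1:
        # e.g. one summarised row per year: nothing left to estimate residuals
        raise ValueError("need at least 2 groups and replicate rows within groups")
    # squared dissimilarities between all pairs of samples
    d2 = [[bray_curtis(samples[i], samples[j]) ** 2 for j in range(n)]
          for i in range(n)]
    ss_total = sum(d2[i][j] for i in range(n) for j in range(i + 1, n)) / n

    def stats(labels):
        ss_within = 0.0
        for g in set(labels):
            idx = [i for i, lab in enumerate(labels) if lab == g]
            ss_within += sum(d2[i][j] for i in idx for j in idx if i < j) / len(idx)
        ss_among = ss_total - ss_within
        if ss_within > 0:
            f = (ss_among / (k - 1)) / (ss_within / (n - k))
        else:
            f = float("inf")
        return f, ss_among / ss_total

    f_obs, r2 = stats(groups)
    rng = random.Random(seed)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute group labels across samples
        if stats(labels)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)

# Simulated data (assumption): 2 years x 6 hypothetical hauls x 10 stomachs,
# 3 prey categories, with prey 1 slightly more abundant in the second year.
rng = random.Random(42)
stomachs, years, hauls = [], [], []
for year in ("2019", "2020"):
    shift = 2 if year == "2020" else 0
    for haul in range(6):
        for _ in range(10):
            stomachs.append([rng.randint(0, 6) + shift,
                             rng.randint(0, 6),
                             rng.randint(0, 6)])
            years.append(year)
            hauls.append((year, haul))

# Summarised layout: mean abundance per (year, haul) unit
units = sorted(set(hauls))
means = []
for u in units:
    rows = [s for s, h in zip(stomachs, hauls) if h == u]
    means.append([sum(col) / len(col) for col in zip(*rows)])
unit_years = [u[0] for u in units]

print("individual stomachs (F, R2, p):", permanova(stomachs, years))
print("haul means          (F, R2, p):", permanova(means, unit_years))
```

Comparing the two printed lines shows how R2 and the p-value shift between the layouts for the same underlying data. If the summarised table is reduced to a single row per group, the sketch raises a ValueError because no residual degrees of freedom remain.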

What would be the right approach when dealing with such data? Should I go for one or the other way of structuring the data? Or for something completely different, e.g. a Kruskal-Wallis test?

Many thanks for any suggestions.
