I'm making heatmaps for each species in my database to see which sites and years of data collection cluster together by abundance, % coverage, and presence/absence, one species at a time. Most species are rare, though, so their matrices (site x year caught) contain many zeros. Is one distance (or dissimilarity) measure favored over another for each type of data? Why or why not? I'd also like to see how my environmental variables (temperature, salinity, etc.) cluster on their own. Bray-Curtis is for community-based clustering, I think, while Euclidean doesn't work well with zeros, right?
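To make the zeros concern concrete, here is a minimal sketch in plain Python. The numbers are made up for illustration (not from my database), and the names `site_a`, `site_b`, `site_d` and the three helper functions are hypothetical. Each vector is one site's abundance for a single species across five years.

```python
import math

def euclidean(x, y):
    """Straight-line distance; shared zeros add nothing, so sparse rows look close."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def bray_curtis(x, y):
    """Sum of absolute differences divided by the total abundance of both rows."""
    total = sum(a + b for a, b in zip(x, y))
    if total == 0:
        return math.nan  # undefined when both rows are all zeros
    return sum(abs(a - b) for a, b in zip(x, y)) / total

def jaccard(x, y):
    """Presence/absence distance: 1 - shared presences / years present in either row."""
    present_x = {i for i, a in enumerate(x) if a > 0}
    present_y = {i for i, b in enumerate(y) if b > 0}
    union = present_x | present_y
    if not union:
        return math.nan  # undefined when the species is absent from both rows
    return 1 - len(present_x & present_y) / len(union)

# Toy data (invented): abundance of one species at three sites over five years.
site_a = [0, 0, 0, 5, 0]   # caught only in year 4, low abundance
site_b = [0, 0, 0, 0, 4]   # caught only in year 5, low abundance
site_d = [0, 0, 0, 50, 0]  # caught only in year 4, high abundance

for name, other in (("a vs b", site_b), ("a vs d", site_d)):
    print(name,
          "euclidean=%.3f" % euclidean(site_a, other),
          "bray_curtis=%.3f" % bray_curtis(site_a, other),
          "jaccard=%.3f" % jaccard(site_a, other))
```

With these toy values, Euclidean puts site a closer to site b (no years in common) than to site d (same year, larger catch), while Bray-Curtis and Jaccard rank them the other way around. Note that Bray-Curtis and Jaccard are undefined when both rows are all zeros, which seems likely to come up with rare species.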

Would seeing how multiple variables and/or species cluster together be worthwhile? Does this require a different cluster/distance method too?
