To assign species to functional groups based on a set of traits, I want to build a dendrogram by hierarchical clustering, using the unweighted pair-group average method (UPGMA).

I am using a two-way clustering algorithm, which clusters both species and traits. I would also like to get an idea of how robust this clustering is. However, the software I use builds the dendrogram from a summary data matrix. This means that each column (functional trait) and each row (species) becomes a branch of the dendrogram. Branches are joined according to the values in the matrix and the chosen distance or (dis)similarity measure.

A bootstrapping option is available that reports the percentage of times each branch is still recovered when a column is randomly removed. However, this is not appropriate, at least for my analysis: the matrix is a summary of a much larger matrix, so removing a column does not make sense, because each column summarizes several observations. I assume that each column is fine; what I want to know is how robust the dendrogram is when observations are randomly removed. The algorithm should randomly remove observations from the raw data matrix (1024 samples x 20 species), sum the data belonging to the same category (same trait-species combination) to rebuild the summary matrix and the clustering, and then report how robust the dendrogram is.
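To make the procedure I have in mind concrete, here is a minimal sketch in plain Python. It is not taken from any existing package, and it rests on several assumptions: each raw observation is a pair (trait category, list of values per species); observations are resampled with replacement (a bootstrap) rather than simply deleted; rows are compared with Euclidean distance; and the function names `summarize`, `upgma_clades` and `branch_support` are hypothetical. Robustness is reported as the percentage of replicates in which each species group of the original dendrogram is recovered.

```python
import math
import random
from itertools import combinations


def summarize(observations, n_species, traits):
    """Sum raw observations into a species x trait summary matrix."""
    col = {t: j for j, t in enumerate(traits)}
    matrix = [[0.0] * len(traits) for _ in range(n_species)]
    for trait, values in observations:  # assumed layout: (trait, per-species values)
        for i, value in enumerate(values):
            matrix[i][col[trait]] += value
    return matrix


def upgma_clades(matrix):
    """Return the non-trivial groups (frozensets of row indices) of a UPGMA tree."""
    n = len(matrix)
    # Euclidean distance between rows is an assumption; swap in another measure if needed.
    d = {frozenset(p): math.dist(matrix[p[0]], matrix[p[1]])
         for p in combinations(range(n), 2)}

    def average_link(a, b):
        # UPGMA: mean of all pairwise distances between members of the two clusters
        return sum(d[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

    clusters = [frozenset([i]) for i in range(n)]
    clades = set()
    while len(clusters) > 2:  # the final merge is the root, which is always present
        a, b = min(combinations(clusters, 2), key=lambda p: average_link(*p))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        clades.add(a | b)
    return clades


def branch_support(observations, n_species, traits, n_boot=200, seed=0):
    """Percentage of bootstrap replicates in which each original group reappears."""
    rng = random.Random(seed)
    reference = upgma_clades(summarize(observations, n_species, traits))
    hits = {clade: 0 for clade in reference}
    for _ in range(n_boot):
        # resample observations (not columns) with replacement, then re-aggregate
        sample = rng.choices(observations, k=len(observations))
        replicate = upgma_clades(summarize(sample, n_species, traits))
        for clade in reference & replicate:
            hits[clade] += 1
    return {clade: 100.0 * k / n_boot for clade, k in hits.items()}
```

The same functions could be applied to the transposed summary matrix to get support values for the trait dendrogram. Deleting a random fraction of observations without replacement, instead of bootstrapping, would only require changing the resampling line.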

Does such a function exist in any software? If it does not and I have to write it myself, I wonder which criterion is most appropriate for evaluating the final clustering: for example, the percentage of times each branch is retained (as in Pats), or perhaps an information criterion. Also, if my set of species is as large as, let’s say, 60 species, the ideal would be a backward-elimination function that removes species until the clustering is robust at an expected probability (e.g p
