I ran CAP on a data set of 36 samples (soil biodiversity data, Bray-Curtis similarity). The samples come from 4 different fertilization treatments, replicated in three different pots and sampled at three different time points (4 x 3 x 3), a classical repeated measures design. The major driver of differences in beta-diversity is the source soil (= pots), as shown for example in a principal coordinate analysis. To examine any underlying treatment effects, I used CAP to find axes through the multivariate data cloud that separate the samples according to the four treatments.

Choosing 10 of the 35 PCO axes, I get maximum discrimination with a 100% reclassification rate. I thought great, exactly what I wanted to see. Now here comes the confusion: we later learned that 12 of the 36 samples had been labeled with the wrong treatment. So why did I get a 100% reclassification rate with wrong labels?

I re-ran CAP using the correct sample coding. To my surprise, CAP again gave a reclassification rate of 100%. So no matter whether I use the wrong or the correct sample-treatment assignment, I get maximum discrimination. What is wrong? Is it possible that the repeated measures effect causes an artifact?
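
One way to probe the repeated-measures question is a small simulation. The sketch below is only an illustration under assumptions that are not taken from the data above: it assumes 12 hypothetical pots (3 per treatment), each sampled 3 times; it draws random coordinates on 10 axes with a strong pot effect and no treatment effect at all; and it uses a plain nearest-centroid rule as a simplified stand-in for the discriminant step, not CAP itself. It compares leaving out one sample at a time with leaving out a whole pot at a time.

```python
import random

def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def allocate(train, sample):
    """Return the treatment whose training centroid is nearest to the sample."""
    by_trt = {}
    for _, trt, vec in train:
        by_trt.setdefault(trt, []).append(vec)
    return min(by_trt, key=lambda t: sq_dist(centroid(by_trt[t]), sample))

def simulate(rng, n_axes=10, pot_sd=3.0):
    """36 samples: 12 hypothetical pots x 3 time points.

    Each pot gets its own random centre (pot effect); the treatment label
    is attached to the pot but has NO effect on the coordinates.
    """
    data = []
    for pot in range(12):
        trt = pot % 4  # assumed layout: 3 pots per treatment
        centre = [rng.gauss(0, pot_sd) for _ in range(n_axes)]
        for _ in range(3):  # three time points from the same pot
            data.append((pot, trt, [c + rng.gauss(0, 1) for c in centre]))
    return data

def success_rate(data, hold_out_whole_pot):
    """Fraction of held-out samples allocated to their labelled treatment."""
    hits = 0
    for i, (pot, trt, vec) in enumerate(data):
        if hold_out_whole_pot:
            train = [d for d in data if d[0] != pot]
        else:
            train = [d for j, d in enumerate(data) if j != i]
        hits += allocate(train, vec) == trt
    return hits / len(data)

rng = random.Random(1)
runs = [simulate(rng) for _ in range(20)]
loo_sample = sum(success_rate(d, False) for d in runs) / len(runs)
loo_pot = sum(success_rate(d, True) for d in runs) / len(runs)
print(f"leave-one-sample-out: {loo_sample:.2f}")
print(f"leave-one-pot-out:    {loo_pot:.2f}")
```

If leaving out single samples gives a much higher success rate than leaving out whole pots even though no treatment effect was simulated, that would suggest the other time points from the same pot are enough to "recognise" a sample, whichever treatment label the pot carries. Whether this applies to the real data depends on how treatments map onto pots and on how the mislabeled samples were distributed.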

Any help is greatly appreciated!

Martin
