I ran CAP (canonical analysis of principal coordinates) on a data set of 36 samples (soil biodiversity data, Bray-Curtis similarity). The samples come from 4 fertilization treatments, each replicated in 3 pots and sampled at 3 time points (4 x 3 x 3), a classical repeated-measures design. The major driver of differences in beta diversity is the source soil (= pots, as shown e.g. in a principal coordinate analysis). To examine any underlying treatment effects, I used CAP to find axes through the multivariate data cloud that separate the samples according to the four treatments.

Choosing 10 of the 35 PCO axes, I get maximum discrimination with a 100% reclassification rate. I thought: great, exactly what I wanted to see. Now here comes the confusion. We later learned that 12 of the 36 samples had been labeled with the wrong treatment. So why did I still get 100% reclassification?

I re-ran CAP with the correct sample coding. To my surprise, CAP again gave a reclassification rate of 100%. So whether I use the wrong or the correct sample-treatment assignment, I get maximum discrimination. What is wrong? Could the repeated-measures structure be causing an artifact?
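
One way to probe the repeated-measures suspicion is a small simulation. The sketch below is not CAP and does not use my data: it assumes simulated coordinates on 10 axes, a strong pot effect, no treatment effect at all, and a plain nearest-neighbour classifier as a stand-in for the CAP allocation step. It also assumes the mislabeling happened at the pot level (all three time points of a pot get the same label), which may not match what really happened. All names (`simulate`, `nn_accuracy`, `results`) are hypothetical. It compares leaving out one sample at a time with leaving out a whole pot at a time, for two arbitrary pot-to-treatment labelings.

```python
import numpy as np

def simulate(seed=0, n_pots=12, n_times=3, n_axes=10):
    """Simulated ordination scores: strong pot effect, NO treatment effect."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 5.0, size=(n_pots, n_axes))   # pot positions
    pot = np.repeat(np.arange(n_pots), n_times)              # pot id per sample
    X = centers[pot] + rng.normal(0.0, 0.2, size=(pot.size, n_axes))
    return X, pot

def nn_accuracy(X, labels, groups):
    """Nearest-neighbour allocation success.

    For each sample, every sample sharing its `groups` value (including
    itself) is held out before the nearest neighbour is looked up.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    hits = 0
    for i in range(len(X)):
        allowed = groups != groups[i]
        j = np.flatnonzero(allowed)[np.argmin(d[i, allowed])]
        hits += labels[j] == labels[i]
    return hits / len(X)

X, pot = simulate()
sample_id = np.arange(len(X))          # each sample is its own group
label_rng = np.random.default_rng(1)

results = []
for name in ("labeling A", "labeling B"):
    # Arbitrary assignment of the 12 pots to 4 treatments (3 pots each).
    pot_treatment = label_rng.permutation(np.repeat(np.arange(4), 3))
    labels = pot_treatment[pot]
    loo = nn_accuracy(X, labels, sample_id)   # leave one sample out
    lpo = nn_accuracy(X, labels, pot)         # leave one whole pot out
    results.append((name, loo, lpo))
    print(f"{name}: leave-one-sample-out = {loo:.2f}, leave-one-pot-out = {lpo:.2f}")
```

In this simulation, leave-one-sample-out allocation is perfect for either labeling, because the two remaining samples from the same pot always sit right next to the held-out one and carry the same label. Leave-one-pot-out has no such shortcut, so it should drop to roughly chance when there is no treatment effect. If the same pattern held in CAP, a 100% rate under both codings would reflect the pot structure rather than the treatments.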

Any help is greatly appreciated!

Martin
