The attached boxplot shows the proportion of presence-absence counts of species "GP" in different dune treatments (habitat types), with 3 replicate dunes per treatment and repeated measures over 2-4 years (unfortunately the design is unbalanced: AC = 4 years, CC = 3 years and DC = 2 years of repeated measures).
Each count is one dune-year; the boxplot aggregates these to the dune-treatment level across years and dunes.
From the boxplot it seems quite clear to me that the presence of GP should differ significantly between AC and CC and between AC and DC, if not between CC and DC as well, given that GP is present in 100%, 7% and 0% of the AC, CC and DC samples respectively.
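For reference, this is roughly how I got those percentages (assuming the 0/1 presence column is called p.a and the treatment factor dune.treat, as in the models below):

# mean presence of GP per dune treatment, as a percentage
gp <- dat[dat$species == "GP", ]
round(100 * tapply(gp$p.a, gp$dune.treat, mean), 1)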
I fitted a binomial GLM to the presence-absence (p/a) data in R, writing the 0/1 presence column as p.a:

glm(p.a ~ dune.treat,          # p.a = 0/1 presence-absence of GP
    family = "binomial",
    data   = dat[dat$species == "GP", ])
The output was:
Deviance Residuals:
     Min       1Q   Median       3Q      Max
   -0.78   -0.788  0.00005  0.00005    1.626

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    20.57    3964.63   0.005    0.996
dune.typeC    -21.58    3964.63  -0.005    0.996
dune.typeD    -41.13    8253.04  -0.005    0.996

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 55.637  on 40  degrees of freedom
Residual deviance: 17.397  on 38  degrees of freedom
AIC: 23.397
This output seems very strange to me. I also tried a mixed-effects model with year as a random effect, to account for the unbalanced repeated measures:
library(MASS)  # for glmmPQL

glmmPQL(p.a ~ dune.treat,
        random = ~ 1 | year,
        family = "binomial",
        data   = dat[dat$species == "GP", ])
Output:
Linear mixed-effects model fit by maximum likelihood
 AIC BIC logLik
  NA  NA     NA

Random effects:
 Formula: ~1 | year
        (Intercept)  Residual
StdDev:    1.460122 0.5028493

Variance function:
 Structure: fixed weights
 Formula: ~invwt

Fixed effects: value ~ dune.type
                Value Std.Error DF  t-value p-value
(Intercept)     30.87  274897.7 34  0.00011  0.9999
dune.typeC     -32.23  274897.7 34 -0.00012  0.9999
dune.typeD     -61.94  505281.0 34 -0.00012  0.9999
 Correlation:
           (Intr) dn.tyC
dune.typeC -1.000
dune.typeD -0.544  0.544

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
 -2.42e+00  -4.89e-01 -7.890e-13  6.215e-12   2.96e+00

Number of Observations: 41
Number of Groups: 5
I realise the standard errors are huge and the DF are large, which is probably what drives the p-values towards 1. But is this the 'true' result? I don't really understand why this is happening or what I can do about it.
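To double-check the raw data, I also cross-tabulated presence against treatment (GP is present in every AC sample and absent from every DC sample), along these lines:

# raw counts of absence/presence (0/1) per dune treatment
with(dat[dat$species == "GP", ], table(dune.treat, p.a))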
Is there an error in the models I am running?
What would be a more appropriate model for these data to test for differences between dune treatments?
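One thing I came across while searching is Firth's bias-reduced (penalised) logistic regression in the logistf package, which is supposed to cope with this kind of all-present/all-absent pattern. Would something along these lines be appropriate (same column names as above)?

library(logistf)

# Firth-penalised logistic regression - sketch only; note this has the same
# fixed effect as the GLM and ignores the repeated measures across years
logistf(p.a ~ dune.treat, data = dat[dat$species == "GP", ])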
Thanks
Tania