I am building resource selection functions to examine habitat selection using radio telemetry data and 1:1 paired logistic regression models. The habitat features at each telemetry (used) point are paired with a single measure of the availability of those features, and that measure is unique to the telemetry point. I first fit the models with R's glm function by taking the difference between the used and available values of each covariate and fitting a no-intercept glm:

glm(point ~ -1 + diffx1 + diffx2 + diffx3 + ... + diffxn, data = data, family = binomial)

I have also fit the same model using the survival package's clogit/coxph functions. In this case, each 1:1 pair of used/available points is grouped into a stratum:

clogit(point ~ x1 + x2 + x3 + ... + xn + strata(pair), data = data)
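
To make the two formulations concrete, here is a minimal sketch on simulated data. The two covariates, the sample size, and all object names are hypothetical and are not my real data; the point is only that the differenced no-intercept glm and the stratified clogit should return the same coefficient estimates.

```r
library(survival)  # provides clogit() and strata()

set.seed(1)
n_pairs <- 200

# Simulated (hypothetical) covariate values at the used and available points
used  <- data.frame(x1 = rnorm(n_pairs, 0.5), x2 = rnorm(n_pairs, -0.3))
avail <- data.frame(x1 = rnorm(n_pairs),      x2 = rnorm(n_pairs))

# Differenced form: one row per pair, and the response is always 1
diff_dat <- data.frame(point  = 1,
                       diffx1 = used$x1 - avail$x1,
                       diffx2 = used$x2 - avail$x2)
fit_glm <- glm(point ~ -1 + diffx1 + diffx2, data = diff_dat,
               family = binomial)

# Original form: two rows per pair (point = 1 for used, 0 for available),
# with the pair identifier used as the stratum
long_dat <- rbind(data.frame(point = 1, pair = seq_len(n_pairs), used),
                  data.frame(point = 0, pair = seq_len(n_pairs), avail))
fit_clogit <- clogit(point ~ x1 + x2 + strata(pair), data = long_dat)

# Side-by-side coefficient estimates from the two fits
cbind(glm = coef(fit_glm), clogit = coef(fit_clogit))
```

With one used and one available point per stratum, the conditional likelihood for a pair reduces to a logistic likelihood in the covariate differences, which is why the two fits should agree up to numerical tolerance.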

I have 16 covariates, and some of them are correlated. However, when I calculate pairwise correlation coefficients and variance inflation factors, I get different results depending on whether I use the differenced data used to fit the glm or the original data used to fit clogit/coxph. glm and clogit/coxph give me identical results, but the apparent degree of collinearity differs between the two forms of the data, and this strongly influences the resulting analyses.
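
For reference, this is roughly how the diagnostics can be computed on each form of the data. It is a sketch on simulated data with hypothetical names, not my real covariates. The shared pair-level term (`site`) is purely an assumption I added so that the two forms disagree in the example, since it cancels when each pair is differenced; I am not claiming this is what happens in my real data.

```r
set.seed(42)
n_pairs <- 500

# Hypothetical pair-level component shared by the used and available point
# of each pair; it cancels when the pair is differenced
site <- rnorm(n_pairs)

make_points <- function() {
  data.frame(x1 = site + rnorm(n_pairs),
             x2 = site + rnorm(n_pairs),
             x3 = rnorm(n_pairs))
}
used  <- make_points()
avail <- make_points()

orig_x <- rbind(used, avail)  # covariates in the form supplied to clogit
diff_x <- used - avail        # covariates in the form supplied to the glm

# VIFs from the inverse of the correlation matrix: the j-th diagonal element
# equals 1 / (1 - R^2_j), where R^2_j is from regressing x_j on the others.
# Note that cor() centers each column before computing correlations.
vif <- function(x) diag(solve(cor(x)))

round(cor(orig_x), 2)   # pairwise correlations, original data
round(cor(diff_x), 2)   # pairwise correlations, differenced data
rbind(original = vif(orig_x), differenced = vif(diff_x))
```

In this simulated case x1 and x2 are correlated in the original data but nearly uncorrelated after differencing, so the two sets of diagnostics would lead to different conclusions about which covariates to keep.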

Does anyone have a recommendation as to which form of the data (differenced or original) I should use for assessing collinearity?

Many thanks!
