I have a dataset in which the independent variables are proportions. I need to run a linear regression, but multicollinearity is a problem. I've read that a centered log-ratio transformation can fix this, but I have no idea how to implement it in R. Here's what I've done so far.
# My table
a = data.frame(score = c(12, 321, 411, 511), yapa = c(1, 2, 1, 1), ran = c(3, 4, 5, 6),
               aa = c(0.1, 0.4, 0.7, 0.8), bb = c(0.2, 0.2, 0.2, 0.1), cc = c(0.7, 0.4, 0.1, 0.1))
library(compositions)
dd = clr(a[, 4:6])  # centered log-ratio transform of aa, bb, cc
summary(lm(score ~ aa + bb + cc, data = a))  # raw proportions
summary(lm(score ~ dd, data = a))            # clr-transformed proportions
But I get essentially the same result, with the last variable omitted because of multicollinearity.
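A quick check that may explain this: the proportions aa, bb, cc sum to 1 in every row, and the clr coordinates sum to 0 in every row, so in both cases one column is an exact linear combination of the others. The sketch below recomputes the clr transform in base R (log of each part minus the row mean of the logs) rather than calling the compositions package; the names comp, clr_manual, row_sums and design_rank are made up for illustration.

```r
# Same compositional columns as in the table above
a <- data.frame(score = c(12, 321, 411, 511),
                aa = c(0.1, 0.4, 0.7, 0.8),
                bb = c(0.2, 0.2, 0.2, 0.1),
                cc = c(0.7, 0.4, 0.1, 0.1))

comp <- as.matrix(a[, c("aa", "bb", "cc")])

# clr by hand: log of each part minus the mean log of its row
clr_manual <- log(comp) - rowMeans(log(comp))

# Each row of clr coordinates sums to zero (up to floating-point error)
row_sums <- rowSums(clr_manual)
print(row_sums)

# Rank of the design matrix (intercept + 3 clr columns) is below its 4 columns,
# which is why lm() has to drop one term
design_rank <- qr(cbind(1, clr_manual))$rank
print(design_rank)
```

If the rank printed is smaller than the number of columns, the clr transform alone cannot remove the redundancy, whatever package computes it.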
There is an alternative that does work if I introduce jitter into the variables aa, bb, and cc. However, I need something that can be implemented directly in the lm function, because my real dataset includes other variables as well.
library(robCompositions)
lmCoDaX(a$score, a[,4:6], method="classical")
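One direction that might fit directly into lm() is to use a log-ratio transform that yields one fewer column than there are parts. The sketch below uses additive log-ratios (alr) with cc as the reference part, computed in base R. This is a different transform from clr, so it is a swapped-in technique rather than a fix to the clr code, and the column names alr_aa and alr_bb are hypothetical. The compositions package also provides alr() and ilr() for the same purpose, though they are not used here.

```r
# Same toy data as above; 'ran' stands in for the other covariates
a <- data.frame(score = c(12, 321, 411, 511),
                ran = c(3, 4, 5, 6),
                aa = c(0.1, 0.4, 0.7, 0.8),
                bb = c(0.2, 0.2, 0.2, 0.1),
                cc = c(0.7, 0.4, 0.1, 0.1))

# Additive log-ratios: each part relative to the reference part cc.
# Three parts give two coordinates, so the sum constraint is gone.
a$alr_aa <- log(a$aa / a$cc)
a$alr_bb <- log(a$bb / a$cc)

# Ordinary lm() call; with real data, further covariates such as 'ran'
# could be added to the formula (the toy table has too few rows for that)
fit <- lm(score ~ alr_aa + alr_bb, data = a)
print(coef(fit))
```

The coefficients then describe changes in the log-ratio of a part to cc, not changes in a raw proportion, and the choice of reference part affects that interpretation. Zero proportions would need handling before taking logs.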
Does anyone have experience with this type of data?