Eventually, I found the source of the problem - the way the function is written, it tries to find a initial value for the optimization from auxiliary GLM probit estimation (with the first elicitated response as dependent variable). However, in doing so, it takes into the account all the variables in the dataset you specified in the function call (i.e. the formula for this auxiliary estimation is is in fact "~ . - 1" ), not just those variables specified in the formula expression in the function call!