While collinearity among explanatory variables is a well-recognised issue in multiple linear regression, its role in nonlinear regression is still a question mark for me.

I am working with a multiplicative nonlinear model of the form Y ~ (X1^a1) * (X2^a2) * … * (Xn^an).
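For concreteness, here is a minimal sketch (in Python, with simulated placeholder data) of how such a power-product model can be fitted by nonlinear least squares; the two-predictor form, the names b0/a1/a2 and the data are assumptions for illustration only, not my actual data set.

```python
import numpy as np
from scipy.optimize import curve_fit

# Minimal sketch: fit a two-predictor version of the model, Y = b0 * X1^a1 * X2^a2.
# All data below are simulated placeholders, not the real data set.
rng = np.random.default_rng(0)
x1 = rng.lognormal(size=200)                              # right-skewed predictor
x2 = 0.8 * x1 + rng.lognormal(sigma=0.3, size=200)        # correlated with x1
y = 2.0 * x1**0.5 * x2**0.3 * np.exp(rng.normal(scale=0.1, size=200))

def power_model(X, b0, a1, a2):
    u, v = X
    return b0 * u**a1 * v**a2

params, cov = curve_fit(power_model, (x1, x2), y, p0=[1.0, 1.0, 1.0])
print("estimated b0, a1, a2:", params)
```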

Some of the explanatory variables are linearly related to each other, and some even nonlinearly. The interactions are complex, the data are not orthogonal, and the distributions of the explanatory variables are not uniform but rather right-skewed.
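To make the "non-orthogonal, correlated" part concrete, this is the kind of quick diagnostic I have in mind (again a Python sketch with simulated placeholder data): pairwise correlations plus the condition number of the standardised predictor matrix.

```python
import numpy as np

# Rough sketch of collinearity diagnostics on a predictor matrix
# (rows = observations, columns = explanatory variables); the data are
# simulated placeholders mimicking right-skewed, correlated predictors.
rng = np.random.default_rng(0)
x1 = rng.lognormal(size=200)
x2 = 0.8 * x1 + rng.lognormal(sigma=0.3, size=200)   # linearly related to x1
x3 = x1**2 + rng.lognormal(sigma=0.3, size=200)      # nonlinearly related to x1
X = np.column_stack([x1, x2, x3])

corr = np.corrcoef(X, rowvar=False)                  # pairwise linear correlations
Xs = (X - X.mean(axis=0)) / X.std(axis=0)            # standardise before conditioning
cond = np.linalg.cond(Xs)                            # large values signal ill-conditioning
print(np.round(corr, 2))
print("condition number:", round(cond, 1))
```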

I thought that fitting the most complex model first (using all the explanatory variables) and then dropping terms in the order dictated by their significance would be a good strategy for obtaining the most parsimonious model, as explained by Crawley (2007). The same author stresses that the order in which variables are dropped matters when dealing with non-orthogonal data. What is not clear to me is whether this matters for any kind of model, linear or nonlinear, or only for linear ones. Also, if I follow the significance levels strictly, do I risk dropping 'good' explanatory variables because of ill-conditioning (see below)? And given the ill-conditioning, if I obtain convergence for a complex model (with many correlated explanatory variables) but not necessarily significant estimates, does it make any sense to start the simplification process from that model?
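What I have in mind is roughly the backward-elimination loop below, written here on the log-linearised form log(Y) = log(b0) + a1*log(X1) + … + an*log(Xn). This is only a Python sketch with simulated data; the p-value threshold and the variable set are my own illustrative choices, not something taken from Crawley (2007).

```python
import numpy as np
import statsmodels.api as sm

# Illustrative backward elimination on the log-linearised model
# log(Y) = log(b0) + a1*log(X1) + ... + an*log(Xn).
# Data, threshold and variable count are placeholder assumptions.
rng = np.random.default_rng(1)
n = 300
X = np.exp(rng.normal(size=(n, 4)))            # right-skewed toy predictors
X[:, 1] += 0.7 * X[:, 0]                       # induce collinearity
y = 1.5 * X[:, 0]**0.4 * X[:, 2]**0.6 * np.exp(rng.normal(scale=0.1, size=n))

logX, logy = np.log(X), np.log(y)
cols = list(range(logX.shape[1]))

while True:
    design = sm.add_constant(logX[:, cols])
    fit = sm.OLS(logy, design).fit()
    pvals = fit.pvalues[1:]                    # skip the intercept
    worst = int(np.argmax(pvals))
    if pvals[worst] < 0.05:                    # all remaining terms significant
        break
    cols.pop(worst)                            # drop the least significant term

print("retained predictor indices:", cols)
print(fit.summary())
```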

Though I cannot provide references, it has come to my attention that I should not be afraid of collinearity in the case of MULTIPLICATIVE nonlinear models. On the other hand, Seber & Wild (2003) say that 'multicollinearity in linear models can lead to highly correlated estimates and ill-conditioned matrices […] Unfortunately, all these problems are inherited by nonlinear-regression models', with the added complication that the confidence contours are curved. They conclude: 'However, with nonlinear models ill-conditioning can be a feature of the model itself […] good experimental design can reduce the problem, but it may not be able to eliminate it'. This is rather scary.
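To illustrate what worries me in the Seber & Wild quote, here is a small simulated example (placeholder data only): when two predictors are nearly proportional, the estimated exponents a1 and a2 become strongly negatively correlated, so the individual exponents are barely identifiable even though the fit itself converges.

```python
import numpy as np
from scipy.optimize import curve_fit

# Simulated illustration of ill-conditioning: two nearly proportional
# predictors make the exponent estimates a1 and a2 highly correlated.
rng = np.random.default_rng(2)
x1 = rng.lognormal(size=200)
x2 = x1 * np.exp(rng.normal(scale=0.05, size=200))   # nearly proportional to x1
y = 3.0 * x1**0.5 * x2**0.5 * np.exp(rng.normal(scale=0.1, size=200))

def power_model(X, b0, a1, a2):
    u, v = X
    return b0 * u**a1 * v**a2

params, cov = curve_fit(power_model, (x1, x2), y, p0=[1.0, 1.0, 1.0])
se = np.sqrt(np.diag(cov))
param_corr = cov / np.outer(se, se)                  # correlation of the estimates
print("estimates:", params)
print("correlation(a1, a2):", param_corr[1, 2])      # close to -1 under collinearity
```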

If anyone has dug deeper into the issue of simplifying nonlinear models with non-orthogonal data and collinearity, any advice would be of great help.

References:

Crawley, M., 2007. The R Book. John Wiley & Sons

Seber, G.A.F., Wild, C.J., 2003. Nonlinear Regression. John Wiley & Sons
