I am using more than 20 predictor variables to predict a response variable. The predictors include both categorical and continuous variables. I prefer using CART for better data visualization. I am conducting research in forest ecology.
First, with environmental data you could use redundancy analysis (RDA). RDA is a direct extension of multiple regression: it models the effect of an explanatory matrix X (n x p) on a response matrix Y (n x m). This is done by performing an ordination of Y to obtain ordination axes that are linear combinations of the variables in X. In RDA, the ordination axes are calculated from a PCA of a matrix Yfit, computed by fitting the Y variables to X by multivariate linear regression. Note that the explanatory variables in X can be quantitative, qualitative or binary. Prior to RDA, the explanatory variables in X must be centered, standardized (if they are not dimensionally homogeneous, i.e. measured in different units), transformed (to limit skew) or normalized (to linearize relationships), following the same principles as in PCA. Collinearity between the X variables should also be reduced before RDA.
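The regression-then-PCA procedure described above can be sketched in a few lines. This is only a minimal illustration in Python with NumPy, not the implementation of any R package: the function name rda_sketch, the synthetic data and the choice to leave the scores unscaled are all assumptions made for the example.

```python
import numpy as np

def rda_sketch(Y, X):
    """Illustrative RDA: regress Y on X, then run a PCA on the fitted values."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)                               # center the responses
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)     # standardize the predictors
    # Multivariate linear regression of Yc on Xs (least squares)
    B, *_ = np.linalg.lstsq(Xs, Yc, rcond=None)
    Yfit = Xs @ B
    # PCA of the fitted values via SVD
    _, s, Vt = np.linalg.svd(Yfit, full_matrices=False)
    rank = int(np.sum(s > 1e-10 * s.max()))               # number of canonical axes
    eigenvalues = s[:rank] ** 2 / (n - 1)
    loadings = Vt[:rank].T                                # response-variable loadings
    axes = Yfit @ loadings                                # site scores, linear combinations of X
    total_variance = Yc.var(axis=0, ddof=1).sum()
    return {
        "eigenvalues": eigenvalues,
        "loadings": loadings,
        "axes": axes,
        "constrained_proportion": eigenvalues.sum() / total_variance,
    }

# Synthetic data (assumption): 30 sites, 3 predictors, 4 response variables
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
W = rng.normal(size=(3, 4))
Y = X @ W + rng.normal(scale=0.5, size=(30, 4))

res = rda_sketch(Y, X)
print(res["eigenvalues"])
print(res["constrained_proportion"])
```

The number of canonical axes is at most min(p, m), and constrained_proportion is the share of the total variance of Y explained by X (the unadjusted R2 of the RDA).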
To obtain the most parsimonious RDA model, explanatory variables can be selected by forward, backward or stepwise selection (as with a GLM), which removes non-significant explanatory variables.
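As a rough idea of how forward selection proceeds, here is a small Python/NumPy sketch. Note that it swaps in a simpler stopping rule: instead of a significance test, it adds the variable that most increases the adjusted R2 and stops when no candidate improves it. The function names and the synthetic data are assumptions for the illustration only.

```python
import numpy as np

def adjusted_r2(Y, X):
    """Adjusted R2 of the multivariate linear regression of Y on X."""
    n, k = X.shape
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    r2 = ((Xc @ B) ** 2).sum() / (Yc ** 2).sum()
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def forward_select(Y, X):
    """Add one predictor at a time while the adjusted R2 keeps increasing."""
    selected, remaining, best_score = [], list(range(X.shape[1])), 0.0
    while remaining:
        scores = {j: adjusted_r2(Y, X[:, selected + [j]]) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best_score:
            break                      # no candidate improves the model
        selected.append(j_best)
        remaining.remove(j_best)
        best_score = scores[j_best]
    return selected

# Synthetic data (assumption): only columns 0 and 1 of X drive Y
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))
W = np.array([[3.0, 2.5], [-2.0, 1.5], [0.0, 0.0], [0.0, 0.0]])
Y = X @ W + rng.normal(scale=0.1, size=(60, 2))

selected = forward_select(Y, X)
print(selected)
```

A pure noise column can still be picked up by chance with this rule, which is one reason to combine it with a significance test when selecting variables for a real analysis.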
Then you could use a multivariate regression tree (MRT), which is a constrained clustering technique. An MRT partitions a quantitative response matrix using a matrix of explanatory variables that constrains (guides) where the data of the response matrix are divided. RDA and MRT are both regression techniques: the former explains the global structure of relationships through a linear model, while the latter better highlights local structures and interactions among variables by producing a tree model.
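To make the idea of a constrained partition concrete, the sketch below searches for a single MRT-style split: the explanatory variable and threshold that minimize the summed within-group sum of squares of the multivariate response. It is a toy Python/NumPy illustration under my own assumptions (function names, tiny made-up data set); a full MRT applies such splits recursively and typically chooses the tree size by cross-validation.

```python
import numpy as np

def within_ss(Y):
    """Sum of squared deviations of the rows of Y from their multivariate mean."""
    return float(((Y - Y.mean(axis=0)) ** 2).sum())

def best_split(X, Y):
    """Return (variable index, threshold, summed within-group SS) of the best split."""
    best = (None, None, within_ss(Y))
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])                 # sorted distinct values
        for lo, hi in zip(values[:-1], values[1:]):
            t = (lo + hi) / 2.0                     # candidate threshold between neighbors
            left = X[:, j] <= t
            total = within_ss(Y[left]) + within_ss(Y[~left])
            if total < best[2]:
                best = (j, t, total)
    return best

# Made-up data: 6 sites, 2 explanatory variables, 2 response variables.
# The response composition changes between sites with low and high values of variable 0.
X = np.array([[1, 5], [2, 3], [3, 8], [7, 4], [8, 6], [9, 2]], dtype=float)
Y = np.array([[10, 0], [9, 1], [11, 0], [0, 10], [1, 9], [0, 11]], dtype=float)

j, t, ss = best_split(X, Y)
print(j, t, ss)
```

Here the split falls on variable 0 at threshold 5.0, separating the two groups of sites with contrasting responses.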
More detailed yet more manageable output can be generated with the MRT() wrapper function of the MVPARTwrap package in R. This function also allows identification of discriminant species.
It's preferable to use both methods. If you are planning to predict something, be careful with Gaussian GLMs: their predictions can go beyond what is realistic, for example outside the range of plausible values. In that case non-linear models such as CART are better. You can try nonlinear regression too.
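One way to read the warning about Gaussian linear models is shown below, as a Python/NumPy sketch on made-up data: a straight line fitted to a saturating proportion predicts a value above 1 when extrapolated, whereas a regression tree predicts the mean of a leaf and therefore stays within the observed range. The single-split tree, the function name and the data are assumptions for the example.

```python
import numpy as np

# Made-up saturating response bounded between 0 and 1 (e.g. a proportion)
x = np.arange(0, 11, dtype=float)
y = 1.0 - np.exp(-0.3 * x)

# Ordinary least-squares line, extrapolated to x = 20
slope, intercept = np.polyfit(x, y, 1)
linear_pred = slope * 20.0 + intercept

def stump_predict(x, y, x_new):
    """One-split regression tree: predict the mean response of the matching leaf."""
    best_t, best_ss = None, np.inf
    values = np.unique(x)
    for lo, hi in zip(values[:-1], values[1:]):
        t = (lo + hi) / 2.0
        left = x <= t
        ss = ((y[left] - y[left].mean()) ** 2).sum() + ((y[~left] - y[~left].mean()) ** 2).sum()
        if ss < best_ss:
            best_t, best_ss = t, ss
    return y[x <= best_t].mean() if x_new <= best_t else y[x > best_t].mean()

tree_pred = stump_predict(x, y, 20.0)
print(linear_pred, tree_pred)
```

The linear prediction exceeds 1, which is impossible for a proportion, while the tree prediction remains between the smallest and largest observed values.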