The rfe functions in the caret package make it possible to perform recursive feature elimination (backward selection) with cross-validation.

It is expected that the best features selected in each fold may differ, as the caret documentation also states (http://topepo.github.io/caret/recursive-feature-elimination.html):

"Another complication to using resampling is that multiple lists of the “best” predictors are generated at each iteration. At first this may seem like a disadvantage, but it does provide a more probabilistic assessment of predictor importance than a ranking based on a single fixed data set. At the end of the algorithm, a consensus ranking can be used to determine the best predictors to retain."

However, it is not clear to me how rfe chooses the final "best" set of predictors, given this expected heterogeneity among folds. I cannot find a description of the "consensus ranking" procedure mentioned above.
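
To make the question concrete, this is the kind of aggregation I imagine a "consensus ranking" could be: average each predictor's rank position over the folds and keep the top few. It is only a sketch of my guess, written in Python for brevity, and not caret's documented procedure; the function name `consensus_ranking`, the per-fold rankings, and the predictor names are all hypothetical.

```python
from collections import defaultdict

def consensus_ranking(fold_rankings):
    """Order predictors by their mean rank position across folds (1 = best).

    Assumes every fold ranks the same set of predictors.
    """
    positions = defaultdict(list)
    for ranking in fold_rankings:
        for position, predictor in enumerate(ranking, start=1):
            positions[predictor].append(position)
    mean_rank = {p: sum(r) / len(r) for p, r in positions.items()}
    # Sort by mean rank, breaking ties alphabetically for reproducibility.
    return sorted(mean_rank, key=lambda p: (mean_rank[p], p))

# Hypothetical per-fold rankings (best predictor first); not real caret output.
fold_rankings = [
    ["x1", "x2", "x3", "x4"],
    ["x2", "x1", "x4", "x3"],
    ["x1", "x3", "x2", "x4"],
]

ranking = consensus_ranking(fold_rankings)
top_two = ranking[:2]  # keep the two predictors with the best mean rank
print(top_two)
```

Averaging rank positions is just one possible choice; averaging the importance scores themselves across folds would be another, and the two need not give the same final set.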

Thank you for your help!