How do I select the best predictor from a cluster of variables that carry overlapping information, such as ESR, CRP, and other inflammatory indices, in a logistic regression model? Is there any guideline?
Take a look at what we did in a similar situation. Your overlapping predictors could be a problem, but the adaptive lasso should handle that. Take care, David Booth
The question of which predictor is "best" (from a given set of candidate IVs) can be addressed in a number of ways, and the resulting answers might not be comparable.
Best single predictor, in the "crude" sense? Run each candidate IV individually, and choose the one that yields the best result (highest log-likelihood, i.e. lowest -2 log-likelihood, or best classification accuracy).
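A minimal sketch of the "crude" comparison, assuming synthetic data: the variable names (ESR, CRP, WBC), the sample size, and the helper `max_loglik` are all invented for illustration, and the fit is a plain Newton-Raphson routine written with NumPy rather than any particular package's logistic regression.

```python
import numpy as np

def max_loglik(X, y, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson; return the maximized log-likelihood."""
    X = np.column_stack([np.ones(len(y)), X])   # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                          # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Synthetic data (assumption): three noisy markers of one latent inflammation level.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
ivs = {
    "ESR": latent + rng.normal(scale=0.5, size=n),
    "CRP": latent + rng.normal(scale=0.3, size=n),
    "WBC": latent + rng.normal(scale=1.0, size=n),
}
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-1.2 * latent))).astype(float)

# One single-predictor model per candidate IV; higher log-likelihood is better.
crude = {name: max_loglik(x, y) for name, x in ivs.items()}
best_crude = max(crude, key=crude.get)

for name, ll in crude.items():
    print(f"{name}: log-likelihood = {ll:.2f}")
print("Best crude predictor:", best_crude)
```

Because every one-predictor model here has the same number of parameters, ranking by log-likelihood gives the same order as ranking by AIC.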
Best single predictor, in the "adjusted" sense? (1) Run an analysis with all candidate IVs included. (2) Run k analyses, each omitting one of the k candidate IVs. (3) Base the decision on the largest drop in log-likelihood from run #1 to the respective run in step #2 (or the biggest loss in classification accuracy). Do note that, if there is a lot of overlap among the candidate IVs, you might see little or no change when any single IV is omitted, because the remaining IVs carry most of its information.
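The three steps above can be sketched as follows, again under assumptions: the data are synthetic, the names (ESR, CRP, WBC) and the helper `max_loglik` are hypothetical, and the drop in fit is reported as the likelihood-ratio statistic 2 × (full log-likelihood − reduced log-likelihood).

```python
import numpy as np

def max_loglik(X, y, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson; return the maximized log-likelihood."""
    X = np.column_stack([np.ones(len(y)), X])   # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Synthetic, overlapping predictors (assumption): noisy copies of one latent signal.
rng = np.random.default_rng(1)
n = 500
latent = rng.normal(size=n)
names = ["ESR", "CRP", "WBC"]
X = np.column_stack([
    latent + rng.normal(scale=0.5, size=n),
    latent + rng.normal(scale=0.3, size=n),
    latent + rng.normal(scale=1.0, size=n),
])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-1.2 * latent))).astype(float)

# Step 1: full model with all candidate IVs.
ll_full = max_loglik(X, y)

# Step 2: k reduced models, each omitting one IV.
# Step 3: likelihood-ratio statistic for each omitted IV.
lr_stat = {}
for j, name in enumerate(names):
    ll_reduced = max_loglik(np.delete(X, j, axis=1), y)
    lr_stat[name] = 2.0 * (ll_full - ll_reduced)

best_adjusted = max(lr_stat, key=lr_stat.get)
for name, stat in lr_stat.items():
    print(f"omit {name}: LR statistic = {stat:.2f}")
print("Largest drop when omitted:", best_adjusted)
```

With strongly overlapping predictors, all of these statistics can be small even when the full model fits well, which is the caveat noted above.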
Perhaps you had in mind a different definition for "best." If so, you'd likely get a more constructive answer if you could explain what exactly that was.