Hi there,
I'm working with feature selection and I'm curious about possible ways to determine the number of features (n) to be selected.
In my experiments, the optimal value of n depends heavily on the data set. Several papers use unexplained rules of thumb such as sqrt(n) or log_2(n), but I couldn't find any reasonable justification for these choices.
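To make the two heuristics concrete, here is a minimal sketch of what they would suggest for a few feature counts. It assumes that the n inside the formulas means the total number of available features (my reading; the papers are not always explicit) and that the result is rounded up to an integer. The function name heuristic_counts and the example totals are made up for illustration.

```python
import math


def heuristic_counts(total_features):
    """Feature counts suggested by the two rules of thumb.

    Assumption: the heuristics are applied to the total number of
    available features, and the result is rounded up.
    """
    return {
        "sqrt": math.ceil(math.sqrt(total_features)),
        "log2": math.ceil(math.log2(total_features)),
    }


# Hypothetical totals, chosen only to show how the two rules diverge.
for total in (16, 100, 1000, 10000):
    print(total, heuristic_counts(total))
```

Both rules depend only on the dimensionality and never look at the data itself, which is hard to reconcile with the data-set dependence I see in practice.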
Any insights on this?
Kind Regards.