We have 140 cases and 30,000 controls, with 60,000 features. We want to reduce the number of features to fewer than 100 and then build a model on the selected features. Any suggestions would be much appreciated.
Would it be possible to perform some form of feature clustering (dimensionality reduction) instead of picking a subset of features? You might be able to retain more of the information that way. There will of course be some loss from the lower-dimensional representation, and the latent dimensions are harder to interpret than the original features, but it is worth a try. As for feature selection methods, have you tried the typical approaches such as pointwise mutual information (PMI) or the chi-squared statistic? Are there any issues with these standard approaches in your setting (imbalanced classes and a small number of cases)?
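To make the chi-squared suggestion concrete, here is a minimal sketch that scores each feature against the case/control label and keeps the top k. It assumes binary (present/absent) features, which the question does not state, and it runs on synthetic toy data. The function names (`chi2_score`, `select_top_k`, `make_toy_data`), the toy data sizes, and the choice of informative columns are all made up for illustration. It uses only the standard library; with real data a library routine such as scikit-learn's `SelectKBest` with `chi2` would likely be the more practical choice.

```python
import random


def chi2_score(column, labels):
    """Chi-squared statistic of the 2x2 table for one binary feature vs. a binary label."""
    a = b = c = d = 0
    for x, y in zip(column, labels):
        if x and y:
            a += 1          # feature present, case
        elif x:
            b += 1          # feature present, control
        elif y:
            c += 1          # feature absent, case
        else:
            d += 1          # feature absent, control
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0          # constant feature or single-class labels: no signal
    return n * (a * d - b * c) ** 2 / denom


def select_top_k(columns, labels, k):
    """Return the indices of the k highest-scoring features, plus all scores."""
    scores = [chi2_score(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


def make_toy_data(n_cases=140, n_controls=3000, n_features=200,
                  informative=(3, 17), seed=0):
    """Hypothetical imbalanced data: only the `informative` columns depend on the label."""
    rng = random.Random(seed)
    labels = [1] * n_cases + [0] * n_controls
    columns = []
    for j in range(n_features):
        if j in informative:
            # Present in about 70% of cases but only about 10% of controls.
            col = [1 if rng.random() < (0.7 if y else 0.1) else 0 for y in labels]
        else:
            # Pure noise, unrelated to the label.
            col = [1 if rng.random() < 0.1 else 0 for _ in labels]
        columns.append(col)
    return columns, labels


columns, labels = make_toy_data()
selected, scores = select_top_k(columns, labels, k=5)
print("selected feature indices:", selected)
```

With only 140 cases, scores computed on the full data set will look better than they are if the same data is then used to evaluate the model, so the selection step would normally be repeated inside each cross-validation fold rather than done once up front.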