What are the different types of feature selection techniques, and where does each apply? In particular, can these feature selection techniques be used for both discrete and continuous data?
Feature selection methods can be classified as filters and wrappers. Filters rank features independently of any classifier; one can use Weka to obtain such rankings with the InfoGain, Chi-square, and CFS methods. Wrappers, on the other hand, use a learning algorithm such as SVM or Random Forests to search for and report an optimal feature subset. For example, one may use a genetic algorithm (GA) to build a population of random solutions (feature subsets). Each subset yields a reduced dataset, which is fed to an SVM that returns a 10-fold cross-validation classification accuracy (CVA); that CVA serves as the subset's fitness. Once the population is built, the GA takes over and keeps improving the fitness landscape. After a number of iterations, one may hope to obtain an optimal feature subset.
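A minimal sketch of such a GA wrapper, assuming a scikit-learn SVC and the built-in breast-cancer dataset; the population size, generation count, and mutation rate are arbitrary illustrative choices, not tuned values:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_features = X.shape[1]

def fitness(mask):
    """10-fold CV accuracy of an SVM trained on the selected columns."""
    if not mask.any():               # empty subsets get the worst score
        return 0.0
    return cross_val_score(SVC(), X[:, mask], y, cv=10).mean()

# Initial population: random feature subsets encoded as boolean masks.
pop = rng.random((20, n_features)) < 0.5
scores = np.array([fitness(m) for m in pop])

for generation in range(15):
    # Tournament selection: keep the fitter of two random individuals.
    i, j = rng.integers(len(pop), size=2), rng.integers(len(pop), size=2)
    parents = np.where((scores[i] > scores[j])[:, None], pop[i], pop[j])
    # Uniform crossover between the two tournament winners.
    cross = rng.random(n_features) < 0.5
    child = np.where(cross, parents[0], parents[1])
    # Bit-flip mutation with a small per-gene probability.
    child ^= rng.random(n_features) < 0.05
    # Replace the worst individual if the child improves on it.
    child_score = fitness(child)
    worst = scores.argmin()
    if child_score > scores[worst]:
        pop[worst], scores[worst] = child, child_score

best = pop[scores.argmax()]
print(f"best CV accuracy: {scores.max():.3f} with {best.sum()} features")
```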
First, consider whether a 'wrapper' or a 'filter' method is better suited to your situation.
- Filter methods evaluate features outside the context of your model, checking them against some fixed threshold or criterion for inclusion.
- Wrappers are essentially search algorithms that add/remove features and try to optimize a feature set. I commonly use 'recursive feature elimination'; both styles are sketched right after this list.
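A rough illustration of both styles in scikit-learn; the dataset and the choice of k=10 are placeholders:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Filter: rank features by mutual information (an information-gain
# analogue) and keep the top 10, with no model in the loop.
X_filtered = SelectKBest(mutual_info_classif, k=10).fit_transform(X, y)

# Wrapper: recursive feature elimination with a linear SVM, dropping the
# weakest feature each round until 10 remain.
rfe = RFE(SVC(kernel="linear"), n_features_to_select=10).fit(X, y)
X_wrapped = X[:, rfe.support_]
```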
Filter methods are simpler and faster, but wrapper methods can give you a set of features optimized for your model. In both cases, be careful to use only the training data when doing feature selection, since it really is part of building the model.
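One common way to enforce this in scikit-learn is to put the selector inside a Pipeline, so each cross-validation fold selects features from its own training split only. A sketch, with arbitrary selector and classifier choices:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
pipe = Pipeline([
    ("select", SelectKBest(mutual_info_classif, k=10)),
    ("clf", SVC()),
])
# cross_val_score refits the whole pipeline per fold, so the selector
# never sees that fold's test data.
print(cross_val_score(pipe, X, y, cv=10).mean())
```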
Finally:
- There are several models that have feature selection "built in", for example random forest. Training a random forest model will typically also output "importance" values for each feature (see the sketch after this list).
- You should consider creating dummy variables from your categorical variables; the sketch below shows this step too.
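A small sketch covering both points, with made-up column names: dummy-encode a categorical column with pandas, then read the importance values a random forest reports.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.DataFrame({
    "age": [25, 40, 31, 58, 46, 33],
    "city": ["NY", "LA", "NY", "SF", "LA", "SF"],  # categorical
    "bought": [0, 1, 0, 1, 1, 0],
})

# One 0/1 dummy column per category level.
X = pd.get_dummies(df[["age", "city"]], columns=["city"])
y = df["bought"]

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
for name, imp in zip(X.columns, forest.feature_importances_):
    print(f"{name}: {imp:.3f}")
```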