Feature selection methods are popular and are definitely useful for removing correlated data that may confuse a classifier, or simply for reducing the size of the feature vector. Is there a threshold for the feature vector dimension, or at least some guidelines?
Redundant features add no relevant information beyond your other features, because they are correlated with them or can be obtained by a [linear] combination of them. Having them in your set will not add anything, but it won't hurt either, information-wise.
It will, however, hurt your training and classification times. Any limits or guidelines? You set them! Can you put up with the little extra time it takes to train your classifier with that additional feature? Then keep it. Does it take a week instead of a day? Remove it!
Bear in mind, however, that some algorithms scale differently with the number of features than with the number of samples. Consider which of the two hurts you most, and then choose your strategy: reducing dimensions and subsampling the data are both pertinent options.
To sum up: estimate in terms of time, and balance "preprocessing effort" against "training/classification time".
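If you do decide a feature costs more time than it is worth, here is a minimal sketch of one way to drop near-duplicate features; the 0.95 correlation threshold and the toy columns are my own assumptions for illustration, not a rule:

```python
import numpy as np
import pandas as pd

def drop_correlated_features(X: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Drop one feature from every pair whose absolute correlation exceeds threshold."""
    corr = X.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)

# Toy example: "b" is a near-linear copy of "a" and gets removed.
rng = np.random.default_rng(0)
X = pd.DataFrame({"a": rng.normal(size=100)})
X["b"] = 2 * X["a"] + 0.01 * rng.normal(size=100)
X["c"] = rng.normal(size=100)
print(drop_correlated_features(X).columns.tolist())  # ['a', 'c']
```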
I am running my final test feature vector through various classifiers (KNN, SVM, LDA, decision tree, Naive Bayes) using grid-search cross-validation. I am using Python, mostly the scikit-learn module.
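For readers with a similar setup, here is a minimal sketch of that kind of multi-classifier grid search in scikit-learn; the parameter grids and the synthetic data are illustrative assumptions, not the actual configuration described above:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# One (estimator, parameter grid) pair per classifier; the grids are placeholders.
candidates = {
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 11]}),
    "SVM": (SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}),
    "LDA": (LinearDiscriminantAnalysis(), {}),
    "Tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 5, None]}),
    "NB": (GaussianNB(), {}),
}

for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=5)
    search.fit(X, y)
    print(f"{name}: best CV accuracy = {search.best_score_:.3f}")
```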
I have found that dimensionality reduction techniques, especially simple linear ones like PCA, can hurt classification performance. However, this may be specific to my data and to my particular feature-extraction and classification methods. The scientific approach is to test it on your data by doing it both ways!
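As a minimal sketch of that both-ways test, assuming synthetic data and an SVM for concreteness (not the poster's actual setup):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=30, n_informative=5, random_state=0)

# Same classifier, with and without PCA in front; cross-validation decides.
with_pca = make_pipeline(StandardScaler(), PCA(n_components=10), SVC())
without_pca = make_pipeline(StandardScaler(), SVC())

print("with PCA:   ", cross_val_score(with_pca, X, y, cv=5).mean())
print("without PCA:", cross_val_score(without_pca, X, y, cv=5).mean())
```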
It's not your data! I have written this several times in different RG blogs and forums. Using PCA for feature reduction is not a good method, because PCA finds the "best" directions in the sense of variance: it keeps the eigenvectors with the largest eigenvalues, i.e., the most dynamic components, but these are not necessarily the most discriminative features! A small example: if your features are overlaid with noise, which is normal in real-world applications, then PCA will give you components that respond "well" to the noise, whereas what you actually want are stable features. For practical purposes, PCA is therefore not a good reduction method.
Reduction models based on LDA or the like are much easier to handle and give better results than PCA.
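Here is a minimal sketch of how one might compare the two as reduction steps in scikit-learn; the synthetic data, the KNN classifier, and the choice of two components are assumptions for illustration only:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Three classes with few informative features among many nuisance dimensions.
X, y = make_classification(n_samples=600, n_features=20, n_informative=3,
                           n_redundant=0, n_classes=3, n_clusters_per_class=1,
                           random_state=0)

knn = KNeighborsClassifier()
# LDA keeps at most (n_classes - 1) components, here 2; match PCA for fairness.
pca_pipe = make_pipeline(PCA(n_components=2), knn)
lda_pipe = make_pipeline(LinearDiscriminantAnalysis(n_components=2), knn)

print("PCA -> KNN:", cross_val_score(pca_pipe, X, y, cv=5).mean())
print("LDA -> KNN:", cross_val_score(lda_pipe, X, y, cv=5).mean())
```

PCA here picks the highest-variance directions regardless of the labels, while LDA picks directions that separate the classes, which is the distinction the answer above is making.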