It really depends on the actual task and data (e.g., image data vs. tabular structured data). With deep-learning methods, the artificial neural network (e.g., a CNN) usually takes over the job of feature extractor and generator. For other purposes and applications, the following papers might help:
Article Performance Comparison of Feature Selection and Extraction M...
Conference Paper Comparison on Feature Selection Methods for Text Classification
Conference Paper Comparison of feature selection methods for machine learning...
Article Feature selection and dimensionality reduction: An extensive...
Allow me to depart from the norm on this topic. While ML algorithms can help eliminate features as non-useful because they do not correlate with the object classification (non-correlated implies non-causal/non-informative), that ought not be the only thing employed in selecting useful features, because correlation does not imply causation (i.e., being informative).
I personally use ML algorithms to narrow down all the possible features to those that are potentially causal and informative (this saves tons of time). Then, from a scientific point of view, I determine whether the features fit a causal model: do certain objects exhibit certain behaviors, captured in the feature space, BECAUSE the object is what it is? Lastly, I use only the informative features that have causal explanations, and I set the rest aside until they too have causal explanations and are informative.
If you have no causal explanation, correlation alone cannot tell you whether the value you seem to be getting out of a feature is causal or just data bias (which makes the feature appear valuable when it really is not). In low-risk applications the eventual poor behavior of a non-causal feature may not be devastating, but in high-risk applications it can be. In certain low-risk applications it can be better to use features without causal explanations than to use no features at all, but if you can find causal features, my advice is to do so and use them.
So to answer your question: you are the best feature selector; leverage ML algorithms to narrow your focus to potentially causal and informative features.
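As a minimal illustration of the first (narrowing) step, here is a sketch of a simple correlation screen standing in for any ML-based ranking; the function name and the choice of Pearson correlation are my own assumptions, not a prescribed method, and the causal vetting that follows the shortlist is a human judgment that code cannot do:

```python
import numpy as np

def shortlist_features(X, y, k=3):
    """Rank features by absolute correlation with the target and
    return the indices of the top-k candidates.

    This only finds *potentially* informative features; whether each
    one is causal still has to be argued from domain knowledge.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pearson correlation of each column of X with y
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc**2).sum(axis=0) * (yc**2).sum())
    corr = (Xc * yc[:, None]).sum(axis=0) / denom
    # Sort feature indices by descending |correlation|
    order = np.argsort(-np.abs(corr))
    return order[:k], corr
```

The shortlist is only where the work starts: each surviving feature should then be checked against a causal model of the system before it is trusted.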
Check out this source for more details on the causal approach to ML: https://ieeexplore.ieee.org/document/9438325
Maybe you can consider the recursive least squares (RLS) algorithm. RLS is the recursive application of the well-known least squares (LS) regression algorithm, so that each new data point is taken into account to modify (correct) a previous estimate of the parameters of some linear (or linearized) correlation thought to model the observed system. The method allows for the dynamic application of LS to time series acquired in real time. As with LS, there may be several correlation equations with corresponding sets of dependent (observed) variables. For RLS with a forgetting factor (RLS-FF), acquired data is weighted according to its age, with increased weight given to the most recent data.
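The update described above can be sketched in a few lines; this is a generic textbook form of RLS-FF, not the specific implementation from the work below, and the function name, the forgetting factor `lam`, and the initial covariance scale `delta` are illustrative choices:

```python
import numpy as np

def rls_ff(phi_stream, y_stream, n_params, lam=0.98, delta=1e3):
    """Recursive least squares with forgetting factor `lam` (0 < lam <= 1).

    phi_stream : iterable of regressor vectors (length n_params each)
    y_stream   : iterable of scalar observations
    Returns the history of parameter estimates, one row per sample.
    """
    theta = np.zeros(n_params)        # current parameter estimate
    P = delta * np.eye(n_params)      # large initial covariance = weak prior
    history = []
    for phi, y in zip(phi_stream, y_stream):
        phi = np.asarray(phi, dtype=float)
        # Gain vector: how strongly this sample corrects the estimate
        Pphi = P @ phi
        k = Pphi / (lam + phi @ Pphi)
        # Prediction error on the new sample
        e = y - phi @ theta
        # Correct the estimate, then discount old information by 1/lam
        theta = theta + k * e
        P = (P - np.outer(k, Pphi)) / lam
        history.append(theta.copy())
    return np.array(history)
```

With `lam = 1` this reduces to ordinary recursive LS; values below 1 make the estimator track slowly drifting parameters, at the cost of noisier estimates, since the effective memory is roughly `1 / (1 - lam)` samples.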
Years ago, while investigating adaptive control and energetic optimization of aerobic fermenters, I applied the RLS-FF algorithm to estimate the parameters of the KLa correlation used to predict O2 gas-liquid mass transfer, hence giving increased weight to the most recent data. Estimates were improved by imposing a sinusoidal disturbance on air flow and agitation speed (the manipulated variables). The power dissipated by agitation was measured with a torque meter (pilot plant). The proposed (adaptive) control algorithm compared favourably with PID. Simulations assessed the effect of numerically generated white Gaussian noise (2-sigma truncated) and of first-order delay. This investigation was reported in my MSc thesis:
Thesis Controlo do Oxigénio Dissolvido em Fermentadores para Minimi...