AFAIK, Welch's test can be used in the feature selection process: a large absolute t-statistic (together with a small p-value) provides evidence that the mean of the variable differs between the examined classes, so the variable may have enough discriminative power to be included in the classification model.
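As a minimal sketch of the test described above, assuming SciPy is available: `scipy.stats.ttest_ind` with `equal_var=False` performs Welch's test, and `axis=0` applies it to every feature column at once. The arrays `class_a` and `class_b` are synthetic, hypothetical data made up for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: 30 samples per class, 5 features.
# The two classes have unequal variances, and only feature 0 differs in mean.
class_a = rng.normal(0.0, 1.0, size=(30, 5))
class_b = rng.normal(0.0, 3.0, size=(30, 5))
class_b[:, 0] += 4.0

# equal_var=False selects Welch's test instead of the pooled-variance t-test.
t_stat, p_val = stats.ttest_ind(class_a, class_b, axis=0, equal_var=False)

# The same statistic written out by hand:
# difference in means divided by sqrt(s_a^2 / n_a + s_b^2 / n_b).
n_a, n_b = class_a.shape[0], class_b.shape[0]
se = np.sqrt(class_a.var(axis=0, ddof=1) / n_a + class_b.var(axis=0, ddof=1) / n_b)
t_manual = (class_a.mean(axis=0) - class_b.mean(axis=0)) / se

for j in range(class_a.shape[1]):
    print(f"feature {j}: t = {t_stat[j]:.2f}, p = {p_val[j]:.4f}")
```

The sign of t only reflects which class has the larger mean, so a strongly negative value is as informative as a strongly positive one.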
Welch's test has its place in feature selection, where it falls under the filter methods of feature (variable) selection. The mean of a feature in one class is tested against its mean in the other class. Because the assumption of homogeneity of variance is expected to be violated, a robust method such as Welch's test is needed. A feature is considered relevant for predicting the response class if the p-value returned by the Welch test is below a threshold. This is where the multiple testing problem arises: a per-test threshold of, say, 0.05 can no longer be maintained, because across p features the family-wise error rate can grow to as much as 0.05×p (capped at 1), where p is the number of features. The remedy is a family-wise error rate (FWER) correction or a false discovery rate (FDR) correction. The usual choice is to control the FWER at 0.05 or the FDR at 0.1. The significantly relevant features are then those whose p-values fall below the cutoff implied by the chosen FWER or FDR level.
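The two corrections mentioned above can be sketched in plain NumPy. The answer does not name specific procedures, so this assumes Bonferroni for the FWER and Benjamini-Hochberg for the FDR, which are common choices; the function names and the p-values are hypothetical and made up for illustration.

```python
import numpy as np

def bonferroni_reject(p_values, alpha=0.05):
    """Control the FWER at alpha: compare each p-value with alpha / m."""
    p = np.asarray(p_values, dtype=float)
    return p <= alpha / p.size

def benjamini_hochberg_reject(p_values, q=0.1):
    """Control the FDR at q with the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices of p-values, ascending
    thresholds = q * np.arange(1, m + 1) / m    # q * i / m for rank i
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])        # largest rank meeting its threshold
        reject[order[:k + 1]] = True            # reject that one and all smaller
    return reject

# Made-up Welch p-values for six features.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]

fwer_selected = bonferroni_reject(p_values, alpha=0.05)
fdr_selected = benjamini_hochberg_reject(p_values, q=0.1)
print("FWER 0.05:", fwer_selected)
print("FDR 0.10: ", fdr_selected)
```

On these six values the Bonferroni cutoff is 0.05/6, so only the first two features are kept, while the FDR rule at 0.1 keeps the first four. With thousands of features the gap between the two is usually much larger.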
I did not understand the background of the question at first, and I grasped its meaning only from the two earlier answers, so I withdraw my previous answer. First, variable selection in discriminant analysis is very easy to do with regression analysis: we set the target values to 1/-1. I think you do not know this fact. In this case the F test is used, and there is no advantage in using the t-test or the Welch test. Since 1970, many medical and statistical researchers have tried to discriminate between cancer and normal patients. Suppose we have 100 patients and 10,000 genes. Statistical discriminant functions have not been helpful to medical researchers at all; perhaps they concluded that statistical methods were totally useless. There are also foolish studies that treat a gene with a large t-test value as a cancer gene. On the other hand, because statistical researchers have access to high-quality data, there are many studies under the name of Big Data, but their results are not clear. In discriminant analysis, the number of misclassifications (NM) should be examined first, but NM has many drawbacks. So I developed a linear discriminant function (LDF) based on the minimum NM (MNM) criterion. In 2015 I analyzed the microarrays used in papers published in Science and elsewhere from 1999 to 2004 and solved the problem easily in only 54 days. First of all, these microarrays are all linearly separable data (LSD); that is, the two groups are completely separated in the high-dimensional gene space. No one had pointed out this important fact. Next, LSD has a Matryoshka structure: it contains smaller linearly separable subspaces nested within it. Among these, those with a small number of genes are called SMs.
Microarrays can then easily be decomposed into approximately 100 pairs of SMs and noise subspaces. In other words, a "Big Data analysis" can be broken down into small-sample problems for about 100 such pairs. It seems that the authors of the previous two answers knew of research that uses the Welch test for variable selection. I think such research is totally meaningless. My results are detailed in my Springer book "New Theory of Discriminant Theory after R.Fisher (2016)" and, on Amazon, "From cancer gene analysis to cancer gene diagnosis" (2017). Do not waste your valuable research time on a nonsense research theme.
In statistical learning, every task depends on a goal set beforehand. If the goal is to reduce the dimension of the dataset in a binary classification task, the Welch test is one of many alternatives. In fact, the popular NCBI gene expression dataset repository uses the Welch test for two-class datasets and ANOVA for multiclass ones. Dhanunjanya can easily confirm this at ncbi.nlm.nih.gov/geo. The top 250 genes command uses the Welch t-test for gene ranking.
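A gene ranking of this kind can be sketched as follows. This is only an illustration of ranking by the absolute Welch t-statistic, not the repository's actual implementation; the helper names `welch_t` and `top_k_features` and the synthetic expression matrices are hypothetical.

```python
import numpy as np

def welch_t(x_a, x_b):
    """Welch t-statistic per column for two (samples x features) arrays."""
    mean_diff = x_a.mean(axis=0) - x_b.mean(axis=0)
    se = np.sqrt(x_a.var(axis=0, ddof=1) / x_a.shape[0]
                 + x_b.var(axis=0, ddof=1) / x_b.shape[0])
    return mean_diff / se

def top_k_features(x_a, x_b, k):
    """Return the indices and t-values of the k features with the largest |t|."""
    t = welch_t(x_a, x_b)
    order = np.argsort(-np.abs(t))[:k]   # descending by absolute t
    return order, t[order]

rng = np.random.default_rng(1)

# Synthetic expression data: 40 samples per group, 200 genes.
normal = rng.normal(0.0, 1.0, size=(40, 200))
tumor = rng.normal(0.0, 2.0, size=(40, 200))
tumor[:, 3] += 5.0   # gene 3 is higher in the tumor group
tumor[:, 7] -= 5.0   # gene 7 is lower in the tumor group

idx, t_top = top_k_features(tumor, normal, k=2)
for gene, t in zip(idx, t_top):
    print(f"gene {gene}: t = {t:.2f}")
```

Ranking on the absolute value keeps genes that are lower in one group as well as genes that are higher. In practice k would be a cutoff such as the 250 mentioned above, or the count of features surviving a multiple testing correction.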
Your answer is new to me. However, I have found several pairs of SMs that can completely separate cancer and normal patients, or different types of cancer, with nearly 40 genes. If the number of genes is 49 or fewer, a Japanese diagnostic center can run the diagnosis from a blood sample for 100,000 yen or less. I think that testing 250 genes may be expensive and useless. Please tell me your thoughts.
I forgot an important point. I performed the t-test on the genes included in all the SMs. The t-values are distributed, for example, from -10 to almost 0, and around 10. With the Welch test, I think it is wrong to assign a high probability of cancer only to genes with positive values. Please tell me what you think about the genes with negative and near-zero t-values.