Hi all,
I am running a Random Forest Mean Decrease in Accuracy (RF-MDA) algorithm for feature selection on my microarray data, so that I can use the selected genes as a classifier to discriminate between 2 classes of cell lines. I am having trouble interpreting the output of the algorithm. It gives me a small list of selected genes, and for each gene there is a Pearson correlation value, a fold change value and a q-value (False Discovery Rate).
The variable “class” is discrete (normal vs disease), so what does the Pearson correlation mean in this case?
Should I treat the reported q-value as a multiple-testing correction and give less weight to, or exclude, the genes with a q-value (FDR) higher than 0.2 (or any other pre-determined cut-off for significance)?
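To make the cut-off question concrete, here is a minimal sketch of the filtering I have in mind. The gene names, the numbers and the tuple layout are all made up for illustration; they are not my real output, and the function name filter_by_q is just a hypothetical helper, not part of any RF package.

```python
# Hypothetical output rows: (gene, pearson_r, fold_change, q_value).
# All names and numbers are invented for illustration only.
results = [
    ("GENE_A", 0.81, 2.4, 0.03),
    ("GENE_B", -0.64, 0.5, 0.18),
    ("GENE_C", 0.42, 1.3, 0.35),
]

Q_CUTOFF = 0.2  # the pre-determined FDR cut-off from my question


def filter_by_q(rows, cutoff):
    """Keep only genes whose q-value is at or below the cutoff."""
    return [row for row in rows if row[3] <= cutoff]


kept = filter_by_q(results, Q_CUTOFF)
kept_genes = [row[0] for row in kept]
print(kept_genes)
```

With these invented values, GENE_C would be dropped because its q-value is above 0.2. Whether dropping it is the right thing to do is exactly what I am unsure about.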
I would appreciate any suggestions on how to interpret the results of an RF-MDA feature selection algorithm.