Hi there. I have some data from RNA-seq experiments. In PCA, we observe that the dominant patterns in our data are explained by other variables, rather than by the variable we used to classify our samples. Hence, our predefined groups are not visible in the PCA plot (they are mixed together). However, in the clustered heatmap, samples cluster according to our variable, and it is clear that the expression vectors (the columns of the heatmap) of samples within the same cluster are much more similar than those of samples from different clusters.

Given that ALL the heatmaps cluster samples correctly according to our variable, is it possible that PCA is filtering out information that is meaningful for our samples?

How should I interpret these findings?
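
To make the question concrete, here is a minimal sketch in Python on synthetic data (not our real data). Everything in it is an assumption made for illustration: the sample and gene counts, the two strong "nuisance" factors, the size of the group effect, and the helper name `r_squared_with_group`. It checks how strongly each principal component is associated with the grouping variable, since a group effect that is small relative to other sources of variation may only appear in later components and not in the usual PC1 vs PC2 plot.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_genes = 60, 500  # assumed sizes, for illustration only

# Hypothetical grouping variable: two balanced groups.
group = np.repeat([0.0, 1.0], n_samples // 2)

# Two strong nuisance factors (e.g. batch-like effects) touching all genes.
nuisance = rng.normal(size=(n_samples, 2))
nuisance_loadings = rng.normal(scale=1.5, size=(2, n_genes))

# The group shifts only a small subset of genes.
group_loadings = np.zeros(n_genes)
group_loadings[:50] = 2.0

# Samples in rows, genes in columns: nuisance + group effect + noise.
X = (nuisance @ nuisance_loadings
     + np.outer(group, group_loadings)
     + rng.normal(size=(n_samples, n_genes)))

# PCA via SVD of the column-centered matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * S                       # sample coordinates on each PC
explained = S**2 / np.sum(S**2)      # fraction of variance per PC


def r_squared_with_group(pc_scores, labels):
    """Squared Pearson correlation between one PC's scores and the labels."""
    r = np.corrcoef(pc_scores, labels)[0, 1]
    return r**2


# Association of the first 10 PCs with the grouping variable.
r2 = np.array([r_squared_with_group(scores[:, k], group) for k in range(10)])
best_pc = int(np.argmax(r2))

for k in range(5):
    print(f"PC{k + 1}: variance explained = {explained[k]:.3f}, "
          f"R^2 with group = {r2[k]:.3f}")
print(f"PC most associated with group: PC{best_pc + 1}")
```

In this toy setting the group signal is still in the data, but it sits in a component that explains little variance, so a plot of the first two components would not show it. Whether the same holds for our data is exactly what I am unsure about.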

Thank you
