I want to identify highly expressed genes in normal human corneal cell types. To do this, I retrieved transcriptome profiles from several studies in GEO and processed them up to CPM normalization. The PCA plot shows variability between samples from different studies. How can I reduce or ignore this inter-study effect?
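
For context, the processing described above amounts to roughly the following. This is only a minimal sketch, assuming a raw count matrix with genes in rows and samples in columns; the function names `cpm` and `pca_scores`, the study labels, and the simulated counts are all hypothetical stand-ins, not the actual GEO data or pipeline.

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each sample (column) to a library size of 1e6."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)          # total counts per sample
    return counts / lib_sizes * 1e6

def pca_scores(log_expr, n_components=2):
    """Sample scores on the top principal components (SVD of the centered matrix)."""
    x = log_expr.T                          # samples x genes
    x = x - x.mean(axis=0)                  # center each gene across samples
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Simulated data (assumption): 200 genes, 6 samples from two hypothetical studies.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=50, size=(200, 6))
study = np.array(["A", "A", "A", "B", "B", "B"])

# Mimic a study-specific shift: the first 100 genes read 4x higher in study B.
counts[:100, study == "B"] *= 4

cpm_mat = cpm(counts)
log_cpm = np.log2(cpm_mat + 1)              # log transform before PCA
scores = pca_scores(log_cpm)

# Print PC1/PC2 per sample; in the real data these would be plotted and colored by study.
for label, (pc1, pc2) in zip(study, scores):
    print(f"study {label}: PC1={pc1:.2f}, PC2={pc2:.2f}")
```

In this toy case the two simulated studies fall on opposite sides of PC1, which is the kind of pattern seen in the real PCA plot.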