Dear all,
I am trying to learn the GSEA software tool and am currently working on the GEO dataset GSE41586. When I downloaded and extracted the data, I found that many genes have duplicate rows and that many gene scores/values are zero.
I would like to know:
1. What is the best way to remove duplicate rows for the same gene name? Also, which of the duplicates should be removed, e.g. the one whose values are lower than the others?
2. Secondly, some genes have zero values in all conditions (3 biological replicates for the control, 5-aza and 10-aza conditions), while others have zero values in only 1 or 2 biological replicates. What is the best way to deal with these genes?
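To make the question concrete, here is a minimal sketch in plain Python of the kind of filtering I have in mind. The toy rows, the gene names, the helper function names, the 9-column layout (assumed to be 3 columns per condition) and the rule of keeping the duplicate with the highest mean are all my own assumptions for illustration, not something taken from GSEA or from the GSE41586 files. I am not sure this is the right approach, which is why I am asking.

```python
def collapse_duplicates(rows):
    """For each gene name, keep the row with the highest mean value.

    `rows` is a list of (gene_name, values) pairs; every `values` list is
    assumed to be non-empty. Keeping the highest-mean row is only one
    possible rule (an assumption), not a recommendation from GSEA.
    """
    best = {}  # gene name -> (mean, values); dicts keep insertion order
    for gene, values in rows:
        mean = sum(values) / len(values)
        if gene not in best or mean > best[gene][0]:
            best[gene] = (mean, values)
    return [(gene, values) for gene, (_, values) in best.items()]


def drop_all_zero(rows):
    """Remove genes whose values are zero in every sample."""
    return [(gene, values) for gene, values in rows
            if any(v != 0 for v in values)]


# Hypothetical toy data: 9 columns, assumed to be
# 3 control, 3 5-aza and 3 10-aza replicates.
rows = [
    ("GENE_A", [5, 6, 7, 8, 9, 10, 11, 12, 13]),
    ("GENE_A", [1, 1, 1, 2, 2, 2, 3, 3, 3]),   # lower-valued duplicate
    ("GENE_B", [0, 0, 0, 0, 0, 0, 0, 0, 0]),   # zero in all conditions
    ("GENE_C", [0, 0, 3, 4, 5, 0, 6, 7, 8]),   # zero in some replicates
]

filtered = drop_all_zero(collapse_duplicates(rows))
for gene, values in filtered:
    print(gene, values)
```

In this sketch, genes with zeros in only some replicates (like the hypothetical GENE_C) are kept unchanged, because I do not know whether they should be filtered or not.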
Regards,
Pratik