11 November 2013

I am currently working on a project in which we grow various cancer cell lines and either treat them with a growth-inhibiting drug or leave them untreated. After dosing for the appropriate time, we lyse the cells, extract their RNA, check its quality and quantity, and then run it on Affymetrix microarrays to obtain gene expression profiles for the treated vs. untreated cells. This gives expression levels for over 20,000 genes, and I would like to know how to choose the genes that "really matter", i.e. the ones most likely to be involved in the mechanism by which the drug causes growth inhibition.

I have a few preliminary ideas, but as I am new to this kind of work I'm sure there are better ways to go about it! They are:

* Cutting down the list of genes using a p-value or fold-change cutoff (this was recommended to me by my supervisor, but I do not understand where the p-value would come from).

* Bayes factor (again a recommendation; I was told to obtain it from GATHER at http://gather.genome.duke.edu/, but I am not sure what this factor means in the context of selecting important genes).

* Even with a final list of perhaps up to 100 genes, I would still need to cut it down further: we intend to validate the Affymetrix results, but we are limited in the number of genes we can validate. From what I know so far, the final selection would mean manually reading up on every single gene and trying to find relationships between them to decide which ones to choose. This seems very inefficient. Is there software developed by the bioinformatics community that can help with this?

* Are there any other things that could be of use apart from those I have mentioned?
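To make the p-value and fold-change idea concrete, here is a minimal sketch of how I understand it might work, assuming each gene has several replicate arrays per condition and the intensities are already on a log2 scale. The gene names, the numbers, and the cutoffs are all made up for illustration. The p-value here comes from an exact permutation test on the difference in means, which I chose only because it needs nothing beyond the standard library; dedicated microarray packages (for example limma in R/Bioconductor) use moderated t-tests instead. Because thousands of genes are tested at once, the p-values are adjusted with the Benjamini-Hochberg procedure before filtering.

```python
from itertools import combinations
from statistics import mean


def log2_fold_change(treated, control):
    """Difference of mean log2 intensities (treated minus control)."""
    return mean(treated) - mean(control)


def permutation_p_value(treated, control):
    """Two-sided exact permutation p-value for the difference in means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(control))
    hits = 0
    total = 0
    # Enumerate every way of relabelling the arrays as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group = [pooled[i] for i in chosen]
        rest = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group) - mean(rest)) >= observed - 1e-12:
            hits += 1
    return hits / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical log2 intensities: gene -> (treated replicates, control replicates)
expression = {
    "GENE_A": ([9.1, 9.3, 9.0, 9.2], [7.0, 7.2, 6.9, 7.1]),
    "GENE_B": ([8.0, 8.1, 7.9, 8.2], [8.1, 7.9, 8.0, 8.2]),
    "GENE_C": ([6.0, 6.5, 5.8, 6.9], [6.1, 6.4, 6.0, 6.6]),
}

genes = list(expression)
fold_changes = [log2_fold_change(*expression[g]) for g in genes]
p_values = [permutation_p_value(*expression[g]) for g in genes]
adjusted = benjamini_hochberg(p_values)

# Illustrative cutoffs only: at least a 2-fold change and adjusted p < 0.1.
selected = [
    g for g, lfc, q in zip(genes, fold_changes, adjusted)
    if abs(lfc) >= 1.0 and q < 0.1
]
print(selected)
```

With only four replicates per group, a permutation test cannot produce a p-value below 2/70, which is why the cutoff in this sketch is looser than the usual 0.05. That coarse resolution is one reason moderated tests that borrow information across genes are generally preferred for small microarray experiments.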
