Suppose only gene expression data are available and we want to infer the mutation status of those samples/patients from their gene expression profiles. Is that possible? Is there any algorithm for it?
I am not sure I understand the question, but a classic example is the expression of lactase: when little lactase is expressed in the enterocytes, consuming lactose leads to the symptoms of lactose intolerance. If you consume dairy products to which lactase has been added, those symptoms do not occur.
Lactase is normally expressed in the intestine of suckling mammals, but upregulating mutations in the regulatory (promoter/enhancer) region of the lactase gene allow expression to persist beyond weaning and into adulthood, when no breast-feeding takes place. Such mutations appear to have been selected in populations that consume bovine milk or practice dairy farming.
If this example fits your question, there should be a good number of references linking lactase expression to mutations in the promoter/enhancer of the human lactase gene.
However, those data were obtained experimentally, not through algorithms or any other shortcut.
Only experiments (sequencing, Sanger or NGS) can give you the mutation status of genes in your samples. There is no bioinformatics tool that can do it: a change in expression can certainly be due to mutations (but which ones?), yet it can also be due to environmental influences.
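To illustrate that last point, here is a minimal toy simulation, not a real analysis. It assumes a single gene whose expression is shifted by a hypothetical mutation effect (`mutation_effect`) plus random environmental variation (`env_sd`), and it guesses the mutation status with a simple midpoint threshold. All names and numbers are made up for illustration; real expression data are far messier.

```python
import random

def simulate_accuracy(env_sd, n=2000, mutation_effect=2.0, seed=0):
    """Return the fraction of simulated samples whose mutation status is
    guessed correctly from expression alone (midpoint threshold rule)."""
    rng = random.Random(seed)
    threshold = mutation_effect / 2.0  # midpoint between the two group means
    correct = 0
    for _ in range(n):
        mutated = rng.random() < 0.5  # half of the samples carry the mutation
        # Expression = mutation effect (if present) + environmental variation
        expression = (mutation_effect if mutated else 0.0) + rng.gauss(0.0, env_sd)
        guess = expression > threshold
        correct += (guess == mutated)
    return correct / n

low_env = simulate_accuracy(env_sd=0.2)   # environment barely matters
high_env = simulate_accuracy(env_sd=3.0)  # environment dominates
print(f"accuracy, small environmental effect: {low_env:.2f}")
print(f"accuracy, large environmental effect: {high_env:.2f}")
```

In this toy setting the guess is reliable only while the environmental variation is small compared with the mutation effect; once the environment dominates, expression alone says little about the genotype, which is why sequencing is still needed.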
Hi. In fact, some databases such as SRA, ENA, ArrayExpress and GEO release sequencing and microarray data. You can download the data set you want and analyze it with tools such as PolyPhen and SIFT to evaluate the effect of mutations on gene function. After that, to assess the effect of a mutation on protein function, you can use molecular dynamics simulation, docking and residue scanning.
On the other hand, and as Mohammad Mahmoudi Gomari said, you can explore databases such as the UCSC website (http://genome.ucsc.edu), where you can visualize data (with the Genome Browser), retrieve information (with the Table Browser), or use some of the other tools to move your project forward.