I am trying to identify the promoter region of a gene by comparing transcriptome sequence data with the genome sequence. How do I go about identifying the TSS? I have tried Softberry and PLACE, but they give me different results.
Correct me if I'm wrong, but shouldn't we be looking at the upstream region of the gene rather than the other way around?
Once we have aligned the 5' RACE product against the genome data, is the first nucleotide at the upstream (5') end of the 5' RACE product considered the putative TSS? Or are there other factors that we need to consider as well?
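To make the question concrete, here is a minimal Python sketch of how the 5' end of an aligned 5' RACE product maps to a genomic coordinate. It assumes 1-based inclusive alignment coordinates and a known strand; the function names and the example coordinates are hypothetical. Because a truncated or degraded RNA can give a product that starts downstream of the real TSS, the sketch tallies the 5' ends of several clones rather than trusting a single one.

```python
from collections import Counter


def five_prime_end(start, end, strand):
    """Genomic coordinate of the 5' end of an aligned 5' RACE product.

    start, end: 1-based inclusive genomic coordinates with start <= end.
    strand: '+' or '-', the strand the transcript comes from.
    """
    if strand == "+":
        return start  # transcript runs left to right on the genome
    if strand == "-":
        return end    # transcript runs right to left on the genome
    raise ValueError("strand must be '+' or '-'")


def putative_tss(alignments):
    """Return (position, clone_count) for the most frequent 5' end.

    alignments: iterable of (start, end, strand) tuples, one per clone.
    """
    ends = Counter(five_prime_end(s, e, st) for s, e, st in alignments)
    return ends.most_common(1)[0]


# Hypothetical clones from the same gene on the + strand.
clones = [(1050, 1400, "+"), (1050, 1380, "+"), (1062, 1400, "+")]
position, support = putative_tss(clones)
```

Note that on the minus strand the 5' end is the larger genomic coordinate, which is an easy thing to get backwards when reading alignment output.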
Thanks again Chung Ma for your help. Will read the papers that you've suggested in a while.
Thank you Martin for your in-depth explanation. I'll definitely look for a kit that fits the criteria you've mentioned.
One more question: once the TSS has been identified, the region upstream of it would represent the promoter, am I right? Correct me if I'm wrong, but with the genome walking approach I think there is a limit to how far the genomic DNA can be amplified. If I were to use genome data to look for the promoter, how far upstream do you think the promoter would extend? What about the neighboring genes? I am asking especially about plant genomes that have not been as thoroughly explored and annotated as the Arabidopsis genome.
@Nurniwalis: You just stumbled upon the main problem with defining promoters: to my knowledge, there is no general definition. There are certain elements which you can (but will not necessarily) find upstream of a TSS. Promoter and enhancer elements can be many kb away, or inside the target gene itself (sometimes in introns). These distant elements are not easy to find; usually they are discovered when a SNP turns out to be associated with a gene far away.
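Since there is no general definition, any upstream window taken from genome data is only a working choice. The Python sketch below picks a candidate region upstream of a TSS and clips it at the nearest upstream gene, if one is annotated. The 2000 bp default, the function name and the coordinates are arbitrary assumptions for illustration, not a recommended promoter length, and distal elements outside the window will be missed.

```python
def upstream_window(tss, strand, chrom_len, max_len=2000, neighbor=None):
    """Candidate region upstream of a TSS, as (start, end), or None if empty.

    Coordinates are 1-based and inclusive.
    max_len: arbitrary working window size in bp (an assumption).
    neighbor: (start, end) of the nearest annotated upstream gene, or None.
    """
    if strand == "+":
        # Upstream means smaller coordinates on the + strand.
        start, end = max(1, tss - max_len), tss - 1
        if neighbor is not None:
            start = max(start, neighbor[1] + 1)  # stop after neighbor's end
    elif strand == "-":
        # Upstream means larger coordinates on the - strand.
        start, end = tss + 1, min(chrom_len, tss + max_len)
        if neighbor is not None:
            end = min(end, neighbor[0] - 1)      # stop before neighbor's start
    else:
        raise ValueError("strand must be '+' or '-'")
    return (start, end) if start <= end else None


# Hypothetical example: TSS at 5000 on the + strand of a 100 kb scaffold,
# with a neighboring gene annotated at 3200..4200.
window = upstream_window(5000, "+", 100_000, neighbor=(3200, 4200))
```

For a minus-strand gene, the extracted sequence would still need to be reverse-complemented before scanning it for motifs.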
Thank you very much Martin and Christian for this helpful information.
Sometimes I find it mind-boggling to think about how a promoter actually functions, especially TATA-less promoters, which have no TATA box for TBP binding and sometimes no CAAT box upstream of the TSS either. Of course there's the initiator element to consider, but some promoters do not match its consensus sequence, so it can be quite hard to identify.
I suppose that, at the end of the day, promoter:reporter studies need to be done to verify that the promoter is actually functional.
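As a first-pass screen before any reporter work, one can at least check whether an upstream sequence contains a match to a consensus motif. The Python sketch below converts an IUPAC consensus into a regular expression and reports all match positions, including overlapping ones. The consensus TATAWAW is used only as a commonly quoted TATA-box pattern; it is an assumption here and may not suit every plant species, and the example sequence is made up. A hit or a miss is only a hint and says nothing about function.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_motif(seq, consensus):
    """Return 0-based start positions of all matches to an IUPAC consensus.

    A lookahead is used so that overlapping matches are also reported.
    Only the given strand is scanned.
    """
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


# Made-up upstream sequence, scanned with an assumed TATA-box consensus.
hits = find_motif("GGCTATAAATAGGC", "TATAWAW")
```

An empty result for the TATA pattern would be consistent with a TATA-less promoter, but the same scan can be repeated with whatever initiator or CAAT consensus is appropriate for the species in question.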