Hi everyone, I have a question. I recently found the Ensembl database, and when I compare its sequences with those in NCBI, they are different for the same gene. Which one can I trust? I have always used NCBI, but now I have doubts. Thanks, all!
I quickly checked the NCBI, Ensembl and UCSC annotations, and it seems there are several transcripts for FCGR3B (CD16B), with all the differences occurring in the first exon. So you have to decide which transcript you want (consider the cell type in which it is expressed and the developmental stage).
Or maybe you don't need to be that precise. You mention amplification: if it is gene amplification, choose the longest annotation; if it is for mRNA detection, choose exons common to all transcripts.
Anyway, that is the limit of my knowledge. You can also contact the annotation teams directly, but since all the transcripts I've seen are validated, they will not choose for you.
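The "exons common to all transcripts" suggestion can be sketched in a few lines of Python. This is only an illustration: the transcript names and coordinates below are made up (they are not the real FCGR3B exons), and `intersect_two` and `common_exons` are hypothetical helper names. In practice you would fill the dictionary from the exon coordinates of whichever annotation you are using.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates on one strand


def intersect_two(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, non-overlapping interval lists."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def common_exons(transcripts: Dict[str, List[Interval]]) -> List[Interval]:
    """Return the regions covered by an exon in every transcript."""
    lists = [sorted(exons) for exons in transcripts.values()]
    if not lists:
        return []
    common = lists[0]
    for exons in lists[1:]:
        common = intersect_two(common, exons)
    return common


# Hypothetical coordinates: three transcripts differing mainly in exon 1.
transcripts = {
    "tx_A": [(100, 250), (400, 500), (700, 900)],
    "tx_B": [(150, 250), (400, 500), (700, 900)],
    "tx_C": [(180, 250), (400, 500), (700, 850)],
}

print(common_exons(transcripts))
```

Primers or probes for mRNA detection would then be placed inside the returned regions, so that they match every annotated transcript.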
They are both good. In Ensembl you can look specifically at the Havana annotation, which is manually curated by human annotators.
I had the same "trust problem" with the gene MC1R, which was merged with TUBB3 in Ensembl but not in NCBI.
In Ensembl I could access the detailed alignments that led to the decision to merge it with TUBB3 (only one sequence had a problem). I also emailed them, and they told me that when a gene has only one exon, their algorithm gives it a low confidence.
I have this problem with CD16A and CD16B because the genes are similar, but NCBI shows a shorter sequence for CD16B than Ensembl does, so I don't know which one to believe, or whether I should change all my annotations for a specific amplification cycle.
I did that, but for CD16B the RefSeq isn't reported yet, and NCBI shows a shorter sequence than Ensembl. That is why I would like to know why it is not reported identically in both databases.
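One quick way to see how a shorter and a longer record relate is to check whether the shorter sequence is contained in the longer one, and how many extra bases sit on each side. The sketch below is a minimal example with toy sequences (not the real CD16B records), and `locate_shorter` is a hypothetical helper name. It only detects an exact match, so a single mismatch makes it return `None`; in that case a proper pairwise alignment is needed.

```python
from typing import Optional, Tuple


def locate_shorter(seq_a: str, seq_b: str) -> Optional[Tuple[int, int]]:
    """Find the shorter sequence inside the longer one.

    Returns (extra bases before, extra bases after) for the first exact
    match, or None if the shorter sequence is not an exact substring.
    """
    # Normalise: drop whitespace and line breaks, ignore case.
    a = "".join(seq_a.split()).upper()
    b = "".join(seq_b.split()).upper()
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    offset = longer.find(shorter)
    if offset == -1:
        return None
    return offset, len(longer) - offset - len(shorter)


# Toy sequences, assumed for illustration only.
shorter_record = "ATGTGGCAGCTGCTC"
longer_record = "GGGCCC" + shorter_record + "TTTAAA"

print(locate_shorter(shorter_record, longer_record))
```

If the result shows extra bases only at one end, the two databases probably agree on the shared part and differ in where they place the start or end of the transcript.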
In my experience there are numerous such cases (the same gene differing in length between NCBI and EBI), so I would suggest you stick with NCBI. There have been many collaborative efforts by scientists from both institutes; one example is the CCDS database, where both parties (NCBI and EBI) have agreed on the annotation.
The CCDS database is used by many exome sequencing platforms.
mRNA and protein sequences from UniProtKB and NCBI RefSeq are aligned to the genome in the Ensembl annotation pipeline. Therefore, in principle, the sequences should not differ between NCBI and Ensembl. To find out why the sequence differs in your case, you can email Ensembl at [email protected]
Be aware that you also have the issue of versions. A reference sequence today can differ (slightly) from the one in the previous version. These reference sequence maps are mental constructs, and we should also be aware that more than one map truly coexists, depending on the individual. This is likely the case when you have duplicated genes (for example, the amylase gene).