01 January 2016

Hello,

When generating phylogenetic trees, how crucial is it that the nucleotide sequences being compared are the same length? I am NOT referring to gaps within the sequences, but rather to length differences at the ends of the sequences, e.g.:

sequence A is 1000 bp and sequence B is 1100 bp, with the extra sequence being 50 bp at either end.

Currently I am trimming to the shortest sequence in the analysis, but I'm wondering if this is too stringent. Does this 'extra' sequence at the ends (or lack thereof) get treated simply as missing data, or as a deletion/insertion?

Is the treatment of extra sequence model dependent? Or does it depend on how much extra sequence is present?
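To make the two options concrete, here is a minimal Python sketch of what I mean by trimming versus keeping the overhang and recoding the end gaps as missing data. The function names and the toy alignment are made up for illustration, and the use of '?' for missing data and '-' for gaps is an assumption about the input format; whether a given tree-building program actually distinguishes the two is part of what I'm asking.

```python
def mask_terminal_gaps(seq, gap="-", missing="?"):
    """Recode leading/trailing gap characters as missing data; keep internal gaps."""
    core = seq.strip(gap)
    if not core:
        # Sequence is all gaps: treat the whole thing as missing.
        return missing * len(seq)
    lead = len(seq) - len(seq.lstrip(gap))
    trail = len(seq) - len(seq.rstrip(gap))
    return missing * lead + core + missing * trail


def trim_to_shared_columns(aligned, gap="-"):
    """Keep only the columns where every sequence has started and none has ended."""
    start = max(len(s) - len(s.lstrip(gap)) for s in aligned.values())
    end = min(len(s.rstrip(gap)) for s in aligned.values())
    return {name: s[start:end] for name, s in aligned.items()}


# Toy alignment (hypothetical): B overhangs A by 2 columns at either end.
aligned = {
    "A": "--ACGT-ACGT--",
    "B": "TTACGTTACGTAA",
}

masked = {name: mask_terminal_gaps(s) for name, s in aligned.items()}
trimmed = trim_to_shared_columns(aligned)
```

In this sketch the internal gap in A is left alone in both cases; only the ends are either cut off (my current approach) or marked as missing.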

I'd be grateful for any help or opinions.

Thanks,

Adam
