I have two protein sequences with around 50% identity between them.
I want to study the phylogenetic relationship between them.
I came up with a method myself:
Step 1: BLAST each sequence against a protein database separately (possibly with less stringent thresholds)
Step 2: extract the subject sequences that appear as hits in both BLAST searches
Step 3: run a multiple sequence alignment on the common subject sequences plus the two query sequences
Step 4: build the phylogenetic tree from that alignment
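
To make Step 2 concrete, here is a minimal sketch of how the common hits could be collected. It assumes both searches were saved in BLAST tabular format (-outfmt 6) with the default column order, where the subject ID is column 2 and the E-value is column 11. The function names and the 1e-5 cutoff are illustrative choices of mine, not part of any BLAST tool.

```python
def subject_ids(blast_tab, max_evalue=1e-5):
    """Return the set of subject IDs in BLAST tabular text with E-value <= max_evalue.

    Assumes the default -outfmt 6 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    ids = set()
    for line in blast_tab.splitlines():
        # Skip blank lines and comment lines (present with -outfmt 7).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            ids.add(fields[1])
    return ids


def common_hits(tab_a, tab_b, max_evalue=1e-5):
    """Subject IDs that pass the cutoff in both searches (hypothetical helper for Step 2)."""
    return subject_ids(tab_a, max_evalue) & subject_ids(tab_b, max_evalue)
```

Each argument is the text of one result file, for example open("query_a.tsv").read(), where the file name is a placeholder. The returned IDs would then be used to pull the sequences from the database for Step 3.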
Could anyone comment on this method? If it is not ideal, what is the standard way of collecting homologous sequences for a phylogenetic analysis?
Thank you.