I have a multidomain protein that contains four copies of the same domain (alongside other domains). I presume that originally only one copy was present, then a second was attached to it, then a third was either added at the end or inserted between the first two, and finally the fourth was added somewhere.
How can I infer the history of how this protein was assembled?
The protein is human, so I could use comparative genomics.
I have two ideas in mind:
1) Run a BLAST search with the full-length protein against organisms that diverged very early ("basal" metazoans, unicellular eukaryotes, bacteria, archaea), then build a multiple alignment and a phylogenetic tree. I could then see whether one of the domains was missing at some point in the evolutionary history.
2) Run a BLAST search with a single domain against the same early-diverging organisms, then build a multiple alignment and a phylogenetic tree. I should be able to see when the domain duplicated and whether some copies are closer to a single-domain protein.
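To make idea 2 a bit more concrete, here is a minimal sketch of a first sanity check I could run before building a real tree: compare the four domain copies pairwise and rank them by percent identity, on the assumption that the most similar pair is a candidate for the most recent duplication. The sequences below are made-up toy strings, not my protein, and the names `percent_identity`, `rank_domain_pairs`, and `D1` to `D4` are only illustrative. It also assumes the copies have already been aligned to the same length by some alignment tool. Percent identity is only a crude proxy, so this would not replace the phylogenetic tree.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    # Keep only columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def rank_domain_pairs(aligned: dict) -> list:
    """Return ((name1, name2), identity) tuples, most similar pair first."""
    scores = {
        (m, n): percent_identity(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, invented aligned domain copies (hypothetical data, for illustration only).
toy_domains = {
    "D1": "MKTAYIAKQR",
    "D2": "MKTAYIAKQK",
    "D3": "MRTSYLAK-R",
    "D4": "LRSSFLGKER",
}

ranking = rank_domain_pairs(toy_domains)
for (first, second), identity in ranking:
    print(f"{first}-{second}: {identity:.1f}% identity")
```

With real data, the same comparison could be extended to include the single-domain hits from the early-diverging organisms, to see which copy they resemble most.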
Any other ideas, suggestions, methods, or comments?
Thanks,
Romain