Hello. I'm currently analyzing a plastid transcriptome produced for Karenia brevis, a fucoxanthin-pigmented dinoflagellate. My understanding of the plastid lineage for this organism is that it contains a tertiary plastid obtained by engulfing a haptophyte. As such, its plastid genes should group within, or as sister to, the haptophytes, which in turn should share a common lineage with red algae. The published studies I found that conducted similar analyses show this pattern.

Since I had very little experience with phylogenetic analysis, I chose MEGA6, as it was recommended for preliminary work, although it is not sufficiently configurable to produce trees for publication. After choosing a group of taxa and gathering and aligning their protein sequences for psaA, I built an ML tree with the recommended LG+G substitution model. The resulting tree is attached as a .docx file named "First tree".

I've also included a similar tree from a publication, under the name "Published comparison", which displays a much more realistic branching pattern. It's a DNA tree for psbA, but my results don't change significantly between the protein trees and the DNA trees that I've produced. The strange branching patterns also appear in all of my photosystem genes, but not in rbcL. The tree I built for rbcL has a branching pattern that agrees with the plastid lineage for Karenia, i.e., it is similar to the trees in "Published comparison".

I'm frankly at a loss as to how to interpret a tree like this, and I don't know why it's branching so strangely. If anyone can see what I've done wrong, or has any suggestions on topics I should read up on to solve this, I'd greatly appreciate the help. I feel like I'm making a naive error somewhere, but I just can't find where.
