I am so incredibly confused by PAML. Over the last week I've read what feels like half the internet to get a good grasp on how to use it. I'm still stuck, despite using a combination of different codeml.ctl file layouts to try and dig into my problem.

My problem is that I only have 4 sequences for each gene I am looking at. Two of these sequences are from the same species, and I want to look at whether their dN/dS ratios vary (either site-specific or branch-specific). The other two sequences are quite distantly related.

I have run both the null hypothesis and an alternative for the whole tree, and the two are significantly different, which I take to mean that omega varies over the tree.
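For reference, the usual way to make this kind of comparison is a likelihood ratio test on the lnL values reported by the two codeml runs. Below is a minimal sketch of that calculation. The lnL numbers are placeholders, not my real output, and the function name is just something I made up for illustration. It assumes the test statistic follows a chi-square distribution, with 1 degree of freedom for a one-ratio versus two-ratio branch comparison and 2 for site-model pairs such as M7 versus M8.

```python
import math

def lrt_pvalue(lnl_null, lnl_alt, df):
    """Likelihood ratio test for two nested codeml models.

    Returns (statistic, p_value). Only df = 1 or 2 is handled, because
    those cases have simple closed-form chi-square tail probabilities.
    """
    # Test statistic: twice the gain in log-likelihood.
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0:
        # The alternative fits no better than the null.
        return stat, 1.0
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p_value = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p_value

# Placeholder lnL values, NOT real results.
stat, p = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-998.0, df=1)
print(f"2*delta(lnL) = {stat:.3f}, p = {p:.4f}")
```

For other degrees of freedom, a statistics library's chi-square survival function would be the more general choice.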

I then tried to label the split between my species of interest as #1 and calculate a dN/dS specific to those two branches. The more I look at the results, the more I think I should be using an ML method to compare the two.

Is it simply the case that my original 4 sequences are too divergent, so I can't ask anything using PAML and will need to use a different method (such as HyPhy)?

Or am I overlooking something in my output that should be telling me what I want to know? Or am I asking the wrong questions in the first place?

I see suggestions to attach the codeml.ctl file you are using to help with troubleshooting, but honestly I have tried such a variety so far: setting versus estimating omega values, using different models (0, 1, 2), and varying NSsites (0, 2, 7, 8).
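To make the setup concrete, here is a rough sketch of the kind of branch-model run I mean, written as a small Python script that builds the tree string and the control file text. The file names, sequence names, and the helper function are all hypothetical placeholders, and the option values are just one of the combinations I tried (model = 2 with NSsites = 0, omega estimated), not a recommendation. The tree assumes the #1 label goes on the two terminal branches of my species of interest.

```python
def branch_model_ctl(seqfile, treefile, outfile):
    """Build codeml.ctl text for a two-ratio branch model (hypothetical helper)."""
    settings = {
        "seqfile": seqfile,    # codon alignment (placeholder name)
        "treefile": treefile,  # tree with #1 labels (placeholder name)
        "outfile": outfile,    # main results file (placeholder name)
        "runmode": 0,          # use the user-supplied tree
        "seqtype": 1,          # codon sequences
        "CodonFreq": 2,        # F3x4 codon frequencies
        "model": 2,            # separate omega for labelled branches
        "NSsites": 0,          # no variation in omega among sites
        "fix_omega": 0,        # estimate omega rather than fixing it
        "omega": 1,            # starting value for omega
    }
    return "\n".join(f"{key:>12} = {value}" for key, value in settings.items()) + "\n"

# Unrooted 4-taxon tree; the names are placeholders for my real sequences.
tree = "((sp1_copyA #1, sp1_copyB #1), outgroup1, outgroup2);"

ctl_text = branch_model_ctl("gene.phy", "gene.tree", "gene_branch.out")
print(tree)
print(ctl_text)
```

Setting model to 0 in the same file would give the one-ratio null run for comparison.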

My supervisor is suggesting that I follow the methods of two other papers, but both use more sequences, and their sequences are more closely related to the outgroups.

TIA
