I am using reversible-jump MCMC (rjMCMC) in BayesTraits v3 to estimate eight parameters: Pagel's evolutionary transition rates among the four possible combinations of states of two binary characters. Besides proposing new rate values at each step, the chain can propose four moves that change the number of parameters in the model: merge two parameters (reduces the number of parameters), split two previously merged parameters (increases it), set a parameter to 0 (reduces it), or give a zeroed parameter a value greater than 0 again (increases it). All in all, there are 21,146 possible models for the chain to explore. The analysis is described in detail in Pagel & Meade (2006, Am Nat 167: 808–825).
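As a sanity check on that model count, here is a minimal Python sketch. It assumes (my reading of the setup, not something taken from the BayesTraits code) that a model is any assignment of the eight rates to rate classes, where any subset of rates except all eight may be fixed at zero and the remaining rates are partitioned into classes of equal value. The helper names `stirling2` and `models_with_k_classes` are my own.

```python
from math import comb


def stirling2(n, k):
    """Stirling number of the second kind: ways to split n items into k non-empty groups."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


N_RATES = 8  # q12, q13, q21, q24, q31, q34, q42, q43


def models_with_k_classes(k, n_rates=N_RATES):
    """Count models with exactly k distinct non-zero rate classes.

    ASSUMPTION: z rates are fixed at zero (z = 0..n_rates-1, so at least one
    rate stays non-zero) and the other n_rates - z rates are split into k classes.
    """
    return sum(comb(n_rates, z) * stirling2(n_rates - z, k) for z in range(n_rates))


counts = {k: models_with_k_classes(k) for k in range(1, N_RATES + 1)}
total = sum(counts.values())

for k, n in counts.items():
    print(f"{k} non-zero rate class(es): {n} models")
print("total:", total)
```

Under this counting rule the total comes out to 21,146, matching the figure above, and only 255 of those models have a single non-zero rate class.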
When I run the analysis with my data (~460 species), I keep getting results that are highly inconsistent with those from maximum likelihood (ML) and regular MCMC analyses (Figure 1). In the unrestricted ML and MCMC analyses (8-parameter model), at least three groups of parameter values can be identified, one of which (q24) is much higher than the others. In the rjMCMC, on the other hand, a one-parameter model (q12 = 0 and all other parameters equal to each other) is sampled with a posterior probability of >70%.
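To put that 70% in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes a uniform prior over the 21,146 models (an assumption on my part; I have not verified which model prior BayesTraits actually applies in my run) and takes 0.70 as a lower bound on the posterior probability of that single model.

```python
N_MODELS = 21_146        # number of possible models, from the count above
posterior_prob = 0.70    # lower bound on the posterior of the q12 = 0, all-others-equal model

# ASSUMPTION: every model has the same prior probability.
prior_prob = 1 / N_MODELS

posterior_odds = posterior_prob / (1 - posterior_prob)
prior_odds = prior_prob / (1 - prior_prob)

# Bayes factor of this one model against all other models pooled together.
implied_bayes_factor = posterior_odds / prior_odds
print(f"implied Bayes factor: {implied_bayes_factor:.0f}")
```

If the uniform-prior assumption holds, the implied Bayes factor is on the order of 5 × 10^4 in favour of the one-parameter model, which is hard to reconcile with the ML and regular MCMC results.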
Figure 2 shows the rjMCMC likelihood trace plots for 15 different phylogenetic trees (they all show essentially the same pattern). Red points represent 1-parameter models, blue points 2-parameter models, and purple points 3-parameter models. The green dashed line represents the ML model. Notice how any increase in the number of parameters leads to a marked increase in likelihood, yet the chain returns to the 1-parameter model and stays there most of the time (quite far from the maximum likelihood). It seems as if something is penalizing the more heavily parameterized models too harshly, which prevents the chain from exploring higher-likelihood regions of the space of possible models.
Thus, I would like to ask: has anyone faced similar issues with rjMCMC? Is there any way to solve this in BayesTraits? Although I suspect this problem might be a bit too specific, I thought I would give ResearchGate a shot after trying all other resources without success. Needless to say, any help would be much appreciated!