I have been working on a study trying to estimate body mass in mammals using a skeletal proxy in R. My dataset has high phylogenetic signal (Pagel's Lambda = 0.9) and I am interested in removing the effects of phylogenetic signal to produce more accurate estimates of body mass.

I tried to do this using PGLS, but PGLS ended up producing higher error rates than OLS on the same data, despite the high phylogenetic signal. Part of this appears to be because several clades with long branches (e.g., Monotremata) form regression lines above or below the line of best fit, and their phylogenetic position distorts the resulting best-fit line. I had thought PGLS took phylogenetic covariance into account when estimating values, but it looks as though it only takes phylogenetic signal into account when minimizing the residuals of the best-fit line. It does not put the phylogenetic covariance back into the estimation of the y-value.

I.e., the function does not reason: "Taxon X is sister to Monotremata. Therefore it should deviate above or below the regression line to a degree that reflects its phylogenetic distance, and the mass estimate should be adjusted accordingly."
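
To illustrate what I mean, here is a minimal sketch in base R. The data and the covariance matrix `C` are made up for illustration (they are not from my dataset), and `predict_line` is a hypothetical helper, not a function from any package. It shows how the tree enters the coefficient estimates, while the prediction for a new taxon depends only on its x-value.

```r
# Hypothetical toy data: log skeletal proxy (x) and log body mass (y) for 4 taxa.
x <- c(1.0, 1.2, 2.0, 2.3)
y <- c(2.1, 2.6, 3.9, 4.8)

# Assumed phylogenetic covariance matrix under Brownian motion:
# diagonal = root-to-tip depth, off-diagonal = shared branch length.
# Taxa 1-2 and taxa 3-4 are treated as sister pairs.
C <- matrix(c(1.0, 0.8, 0.0, 0.0,
              0.8, 1.0, 0.0, 0.0,
              0.0, 0.0, 1.0, 0.6,
              0.0, 0.0, 0.6, 1.0), nrow = 4, byrow = TRUE)

X    <- cbind(1, x)   # design matrix with an intercept column
Cinv <- solve(C)

# GLS coefficients: the tree only enters here, through the weighting by C^-1.
beta_gls <- solve(t(X) %*% Cinv %*% X, t(X) %*% Cinv %*% y)

# OLS coefficients for comparison (equivalent to C = identity).
beta_ols <- solve(t(X) %*% X, t(X) %*% y)

# Prediction from the fitted line: note there is no argument for where
# the new taxon sits in the tree, only its x-value.
predict_line <- function(beta, x_new) as.numeric(beta[1] + beta[2] * x_new)

x_new    <- 1.5
pred_gls <- predict_line(beta_gls, x_new)
pred_ols <- predict_line(beta_ols, x_new)
```

In this sketch, `pred_gls` would be the same whether the unknown taxon is sister to taxon 1 or to taxon 4, which is the behavior I am describing.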

Given this, is there any way to use a topology to estimate a value that considers both the phylogenetic covariance and the phylogenetic position of the unknown taxon, rather than using the tree only to minimize the residuals when calculating the best-fit line?
