I'm not sure what you mean by "best". One measure is Hedges' g, which is a standardized difference between population means. But you can also look at this recent paper by Safran et al. (2012), which proposes an alternative: http://www.zoology.ubc.ca/~toews/graduate_student_files/Safran_current_zoology.pdf
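In case it helps, here is a minimal sketch of Hedges' g for two populations. It assumes the usual pooled-standard-deviation form with the approximate small-sample correction 1 - 3/(4(n1 + n2) - 9); the function name `hedges_g` and the example values are made up for illustration.

```python
import math
from statistics import mean, variance


def hedges_g(a, b):
    """Standardized mean difference between two samples (Hedges' g)."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two measurements")
    # Pooled variance, weighting each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; g is undefined")
    # Cohen's d: difference in means in pooled-SD units.
    d = (mean(a) - mean(b)) / math.sqrt(pooled_var)
    # Approximate small-sample bias correction that turns d into g.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction


# Hypothetical trait measurements from two populations.
pop_a = [1.0, 2.0, 3.0, 4.0, 5.0]
pop_b = [2.0, 3.0, 4.0, 5.0, 6.0]
g = hedges_g(pop_a, pop_b)
```

The sign of g only reflects which population you put first, so swapping the arguments flips the sign but not the magnitude.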
Nevertheless, if your ultimate interest is in the genetics of the quantitative trait (although you only have phenotypes), I would suggest using Qst, which is a standardized measure of the genetic differentiation of a quantitative trait among populations. You can use intra- and interpopulation phenotypic data, collected under an adequate design, to estimate the additive genetic variances needed to compute Qst.
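Once you have the variance components, Qst itself is a one-liner. The sketch below assumes the common definition for diploid, randomly mating populations, Qst = Vb / (Vb + 2 Vw), where Vb is the additive genetic variance among populations and Vw the additive genetic variance within populations. It does not estimate those components for you (that depends on your breeding design), and the names `qst`, `var_between` and `var_within` are only illustrative.

```python
def qst(var_between, var_within):
    """Qst from additive genetic variance among (Vb) and within (Vw) populations.

    Assumes the diploid, random-mating form: Vb / (Vb + 2 * Vw).
    """
    if var_between < 0 or var_within < 0:
        raise ValueError("variance components must be non-negative")
    total = var_between + 2 * var_within
    if total == 0:
        raise ValueError("both variance components are zero; Qst is undefined")
    return var_between / total


# Hypothetical variance components, just to show the call.
example = qst(var_between=1.0, var_within=2.0)  # 1 / (1 + 4) = 0.2
```

Values near 0 mean almost all additive genetic variance is within populations, and values near 1 mean almost all of it is among populations.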
The key question here is how accurately the phenotypes are measured. I agree that standardizing the data can help with the statistics, but if you cannot consistently capture the trait variation that is expressed, your analysis could fail to determine the true genetic associations among genetic units. In this case, replication can help you increase the resolution with which you measure trait variation, and it can also give you insight into sampling variation.
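As a rough illustration of what replication buys you, the sketch below summarizes repeated measurements of the same individual by their mean and the standard error of that mean, which shrinks as you add replicates. This is only one simple way to look at measurement noise, not a full repeatability analysis, and the function name and values are hypothetical.

```python
import math
from statistics import mean, stdev


def replicate_summary(measurements):
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two replicates to estimate the error")
    # The standard error falls with the square root of the number of replicates.
    return mean(measurements), stdev(measurements) / math.sqrt(n)


# Hypothetical: the same trait measured three times on one individual.
trait_mean, trait_sem = replicate_summary([10.0, 12.0, 14.0])
```

If the standard error within individuals is large relative to the differences you expect among populations, that is a sign the measurement protocol, rather than the statistic, is the limiting factor.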