For example, there are many self-report instruments for assessing depression. Can effect sizes generated from studies employing different measures of depression be pooled to provide an average effect, or should pooling be restricted to effect sizes generated by studies employing the exact same instrument?
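To make the question concrete, here is a minimal sketch of the mechanics involved when results from different instruments are pooled. It assumes each study reports group means, standard deviations, and sample sizes, that effects are expressed as standardized mean differences (Cohen's d), and that a fixed-effect inverse-variance average is used. The two studies, their scale ranges, and all numbers are hypothetical and chosen only for illustration; the function names are likewise invented for this sketch. The sketch shows that pooling is arithmetically possible. It does not settle whether the instruments measure the same construct.

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference and its approximate sampling variance."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    # Common large-sample approximation to the variance of d.
    var = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return d, var

def pool_fixed_effect(effects):
    """Inverse-variance weighted average of (d, variance) pairs."""
    weights = [1.0 / var for _, var in effects]
    pooled = sum(w * d for w, (d, _) in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se

# Hypothetical study A: instrument scored 0-63 (made-up numbers).
study_a = cohens_d(mean_t=14.0, mean_c=20.0, sd_t=8.0, sd_c=8.0, n_t=40, n_c=40)
# Hypothetical study B: a different instrument scored 0-27 (made-up numbers).
study_b = cohens_d(mean_t=9.0, mean_c=12.0, sd_t=4.0, sd_c=4.0, n_t=40, n_c=40)

pooled_d, pooled_se = pool_fixed_effect([study_a, study_b])
print(f"d_A={study_a[0]:.2f}, d_B={study_b[0]:.2f}, pooled d={pooled_d:.2f}")
```

Standardizing removes the difference in raw scale units, which is why the two hypothetical studies yield the same d despite raw differences of 6 and 3 points. Whether the resulting average is meaningful still depends on the substantive judgment raised above.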