Species distribution models (SDMs) are a widely used tool for predicting species occurrence, occupancy, or abundance across a landscape. There are many methods (Maxent, artificial neural networks, regression trees, etc.), and most authors validate their SDMs statistically (area under the curve, selection frequencies of bootstrapped runs, etc.). However, the gold standard would be:
1. Statistically validating
2. Calibrating the model using field data as related to conservation objectives (e.g. setting thresholds of 'good' habitat)
3. Verifying the model using a completely independent, field-collected dataset sampled according to the model's predictions. That is, you go out to sites predicted to be good/bad or high/low and measure whether this is indeed the case.
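To make the three steps concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the site scores and field observations are invented, the function names (`auc`, `calibrate_threshold`, `verify`) are hypothetical, and the 75% sensitivity target stands in for whatever conservation objective would actually set the threshold. It is not the method of any particular paper.

```python
def auc(scores, labels):
    """Step 1: rank-based AUC, i.e. the probability that a presence site
    scores higher than an absence site (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibrate_threshold(scores, labels, min_sensitivity):
    """Step 2: highest score cutoff that still classifies at least
    min_sensitivity of field-confirmed presences as 'good' habitat."""
    n_pos = sum(labels)
    for t in sorted(set(scores), reverse=True):
        captured = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        if captured / n_pos >= min_sensitivity:
            return t
    return min(scores)


def verify(scores, labels, threshold):
    """Step 3: fraction of independent sites where the predicted class
    (score >= threshold) matches what was observed in the field."""
    hits = sum(1 for s, y in zip(scores, labels)
               if (s >= threshold) == (y == 1))
    return hits / len(scores)


# Invented model-building data: predicted suitability and observed presence (1/0).
build_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
build_labels = [1, 1, 1, 0, 1, 0, 0, 0]

# Invented independent survey, visited after the model was fitted.
field_scores = [0.85, 0.75, 0.65, 0.5, 0.25, 0.15]
field_labels = [1, 1, 0, 0, 1, 0]

threshold = calibrate_threshold(build_scores, build_labels, min_sensitivity=0.75)
print("AUC:", auc(build_scores, build_labels))
print("Calibrated threshold:", threshold)
print("Independent agreement:", verify(field_scores, field_labels, threshold))
```

The point of keeping the third step separate is that `field_scores` and `field_labels` never touch the fitting or calibration; they only test predictions that were already fixed.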
I've found a couple of papers that do #2, and only one that does #3 (Johnson & Gillingham 2004 in Environmental Conservation). If you know of any papers that do #2 or #3, please let me know.
Spoiler alert: Soon, I'm going to publish an SDM that does #1, #2, and #3. But I'd like to cite other examples, if there are any.