For a forthcoming study on alpine taxa, I will use locality data (e.g. extracted from GBIF) combined with climate data on a ~1 km² grid (WorldClim). The goal is to determine whether a pair of closely related species occupy different climatic niches.
While cleaning the GBIF data (removing records lacking geographical precision), I realized that only a small portion of them (at most 20%) would be usable, i.e. precise to within 1 km². I still have 20,000 records to go through and wonder whether it is worth the trouble of checking them one by one. To my knowledge, there are no automatic filters precise enough for this.
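For reference, here is a minimal sketch (Python/pandas) of the kind of automatic filter I mean, assuming a standard GBIF Darwin Core occurrence download; the file name is a placeholder. It keeps only records that explicitly report an uncertainty of 1 km or less, which is part of why so few records survive:

```python
import pandas as pd

# Placeholder file name for a GBIF Darwin Core occurrence download (tab-separated).
occ = pd.read_csv("gbif_occurrences.tsv", sep="\t", low_memory=False)

# Keep only records with coordinates and an explicit uncertainty <= 1000 m,
# i.e. compatible with a 1 km^2 climate grid. Records with a missing
# coordinateUncertaintyInMeters cannot be assessed and are dropped here.
precise = occ[
    occ["decimalLatitude"].notna()
    & occ["decimalLongitude"].notna()
    & occ["coordinateUncertaintyInMeters"].notna()
    & (occ["coordinateUncertaintyInMeters"] <= 1000)
]

print(f"{len(precise)} of {len(occ)} records are precise to <= 1 km")
```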
Note that if the geolocation of an alpine record is off by only a few km, the climatic conditions at the "real" locality and at the locality fed into the analysis (the GBIF coordinates) may differ substantially, because of the highly variable topography of most mountain systems; this introduces errors. With at most 20% of the data precise enough, I question whether an automatic filtering approach is valid here.
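To make the size of that error concrete, one can sample the climate grid at a recorded point and at a point offset by a few kilometres. A hedged sketch using rasterio; the WorldClim file name, the coordinates, and the offset are placeholders for illustration only:

```python
import rasterio

# Illustrative comparison of annual mean temperature (WorldClim BIO1, 30 arc-second grid)
# at a recorded locality and at a point offset by roughly 2-3 km.
recorded = (7.658, 45.976)          # (lon, lat) of the GBIF record (placeholder)
offset   = (7.658, 45.976 + 0.025)  # ~2.8 km further north, e.g. up-slope

with rasterio.open("wc2.1_30s_bio_1.tif") as bio1:
    vals = list(bio1.sample([recorded, offset]))

print(f"BIO1 at recorded point: {vals[0][0]:.1f}")
print(f"BIO1 at offset point:   {vals[1][0]:.1f}")
# In steep terrain the two values can differ by several degrees C,
# which is the error introduced by imprecise coordinates.
```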
Thus, would you advise living with the errors (affecting possibly up to 80% of the data), or verifying the locality data one by one (a tremendous amount of work)?