I propose that many errors may arise from the incompatibility of different datasets. What can we do in this case? I suggest two approaches to solving this problem (you can find these articles on my ResearchGate page):
Eppelbaum, L., Eppelbaum, V. and Ben-Avraham, Z., 2003. Formalization and estimation of integrated geological investigations: Informational Approach. Geoinformatics, 14(3), 233-240.
Technological advances are delivering increasing amounts of geographic information and greater geometric precision in data collection. However, using these data may imply inadequate scales and forms of representation.
To resolve this problem with "geo big data", it is necessary to find ways to improve the management of metadata (open data), to use standard data formats (OGC), and even to carry out generalization with open-source libraries or algorithms able to generalize and adapt different data sets to the uses proposed by each user and each place (a very complex task...); see the short sketch after the reference below.
As an example, I propose a contribution to the study of precise measurement of changes in the occupation area of endemic and rare plant species, using methods of spatial sampling, scale change and cartographic generalisation to support sustainable urban planning:
Zaragozi, B., Gimenez, P., Navarro, J.T., Dong, P. and Ramón, A., 2012. Development of free and open-source GIS software for cartographic generalisation and occupancy area calculations. Ecological Informatics, 8, 48-54.
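To make the "standard data formats (OGC)" point a little more concrete, here is a minimal sketch (assuming GeoPandas is installed; the GeoPackage file name is hypothetical) of the kind of metadata checks and open-source generalisation steps such formats make straightforward:

```python
# A minimal sketch (assuming GeoPandas is installed; "species_plots.gpkg" is a
# hypothetical file name) of checking compatibility metadata and applying a
# simple open-source generalisation step.
import geopandas as gpd

gdf = gpd.read_file("species_plots.gpkg")   # an OGC GeoPackage

# Metadata that determines whether two data sets can be combined sensibly:
print(gdf.crs)            # coordinate reference system
print(gdf.total_bounds)   # spatial extent (minx, miny, maxx, maxy)

# A simple generalisation step (Douglas-Peucker, 10 map-unit tolerance):
gdf["geometry"] = gdf.geometry.simplify(tolerance=10, preserve_topology=True)
```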
Spatial data sets are produced at multiple scales to satisfy different requirements. We normally expect higher accuracy and a higher level of detail (LoD) from larger-scale (or higher-resolution) data sets. If heterogeneous data sets are combined in this respect, the results will be uncertain, for example in spatial analysis, in both geometric and semantic terms. For further information you may refer to the relevant books at the following links.
Data sets compiled at different spatial resolutions have different levels of spatial precision. If you compare, say, two rasters with different resolutions, you cannot be certain that what the finer raster describes in a given pixel actually relates to what the coarser raster describes in the same spot, because the coarser raster's values come from a different support (i.e., the particular size, shape and orientation of the sampling units). The coarser pixel might have a value that is heavily influenced by a statistical outlier that actually occurs elsewhere in the pixel rather than in the area where it overlaps with the finer pixel.
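To illustrate the support problem, here is a minimal sketch (NumPy only; the grid sizes, the 3x3 aggregation factor and the outlier value are invented for illustration):

```python
# A fine raster and a coarse raster covering the same area, where the coarse
# pixel is measured on a larger support and gets dominated by an outlier.
import numpy as np

rng = np.random.default_rng(42)

# "Fine" raster: 9x9 pixels with values around 10.
fine = rng.normal(loc=10.0, scale=0.5, size=(9, 9))

# A statistical outlier in one corner of the top-left 3x3 block.
fine[0, 2] = 100.0

# "Coarse" raster: each pixel averages a 3x3 block of fine pixels,
# i.e. it is measured on a different (larger) support.
coarse = fine.reshape(3, 3, 3, 3).mean(axis=(1, 3))

# The fine pixel at (1, 1) looks ordinary, but the coarse pixel covering it
# is inflated by the outlier that sits elsewhere inside the same block.
print("fine value at (1, 1):     ", round(float(fine[1, 1]), 2))   # ~10
print("coarse value covering it: ", round(float(coarse[0, 0]), 2)) # ~20
```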
Similar issues exist with vector data. They are compiled to different levels of spatial detail, which still translates into a spatial resolution (e.g., how close together, on average, the vertices in the data are). A given river will be a complex line if modeled at high resolution, and considerably simpler and straighter at low resolution. Both lines model the same real-world river, but they have very different lengths, sinuosities, etc., and different levels of uncertainty/error. Thus, if you measure something related to river length (e.g., a water discharge rate) from the high-resolution line and relate it to something from the low-resolution line (e.g., a historical discharge rate measured from older, coarser data), you will be comparing numbers whose error brackets may differ widely. The margin of error of one, for example, might be much bigger than the resolution, let alone the margin of error, of the other!
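A rough illustration with a synthetic line (assuming Shapely and NumPy are installed; the sine-wave "river" and the simplification tolerance are arbitrary choices):

```python
# The same feature stored at two levels of detail has very different lengths.
import numpy as np
from shapely.geometry import LineString

# A meandering river digitised densely (high-resolution capture).
t = np.linspace(0, 10, 500)
river_fine = LineString(np.column_stack([t, np.sin(3 * t)]))

# The same river as a coarser map might store it (Douglas-Peucker simplification).
river_coarse = river_fine.simplify(0.5)

print("vertices:", len(river_fine.coords), "->", len(river_coarse.coords))
print("length:  ", round(river_fine.length, 2), "->", round(river_coarse.length, 2))
# Anything derived from length (sinuosity, a discharge relationship, ...)
# inherits this scale-dependent difference.
```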
In general, the appropriate procedure when you must use data across different levels of resolution is to generalize the higher-resolution data to the level of detail of the coarsest dataset you must use, even though exactly how to do that is not always straightforward.
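For rasters, one common way to do that generalisation is to block-aggregate the finer grid to the coarser support before any comparison. A minimal sketch (NumPy only; the grid sizes, the factor of 3 and the mean as aggregation rule are assumptions, and a majority rule would suit categorical data better):

```python
# Generalise the finer raster to the coarser support, then compare like with like.
import numpy as np

def aggregate_to_coarse(fine: np.ndarray, factor: int) -> np.ndarray:
    """Block-average a fine raster so it matches a grid `factor` times coarser."""
    rows, cols = fine.shape
    assert rows % factor == 0 and cols % factor == 0, "the grids must nest"
    return fine.reshape(rows // factor, factor, cols // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(0)
fine = rng.normal(10.0, 1.0, size=(12, 12))   # e.g. a 10 m resolution raster
coarse = rng.normal(10.0, 1.0, size=(4, 4))   # e.g. a 30 m resolution raster

fine_generalised = aggregate_to_coarse(fine, factor=3)  # now on the 30 m support
difference = fine_generalised - coarse                  # a like-for-like comparison
print(difference.shape)                                 # (4, 4)
```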