Does anyone have any hints on spatial statistical analysis of poorly conditioned data?

The first question is: how many data locations do you have, and what is the geographical extent of the smallest region that contains all of them? Is it possible that you have measurement errors in the data and/or errors in the data location coordinates?

The sample variogram (semivariogram is outmoded terminology) does not directly estimate the variogram as a function; it only provides estimates of the values of the variogram for specified lag distances/directions. The sample variogram is computed by pairing data locations, and the total number of pairs is completely determined by the number of data locations. The pairs are then allocated to different distance classes, and here there is a conflict. Ideally you want a large number of pairs in each distance class, but then you can only consider a small number of distance classes. Fewer distance classes means fewer plotted points on the sample variogram, and so less information about the shape of the variogram.
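To make the pairing and binning concrete, here is a minimal pure-Python sketch of the classical (Matheron) estimator: half the mean squared difference of the values over all pairs in a distance class. The function names, the equal-width omnidirectional classes, and the toy transect are illustrative assumptions of mine, not anything from your data or from a particular package.

```python
import math
from itertools import combinations


def n_pairs(n_locations):
    """Number of distinct pairs; fixed entirely by the number of locations."""
    return n_locations * (n_locations - 1) // 2


def sample_variogram(coords, values, bin_width, n_bins):
    """Classical estimator per distance class (omnidirectional).

    Class k collects pairs with separation in [k*bin_width, (k+1)*bin_width).
    Returns (gamma, counts); gamma[k] is None when class k has no pairs.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(coords)), 2):
        h = math.dist(coords[i], coords[j])
        k = int(h // bin_width)
        if k < n_bins:  # pairs beyond the last class are ignored
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    gamma = [s / (2 * c) if c else None for s, c in zip(sums, counts)]
    return gamma, counts


# Hypothetical toy transect: four points at unit spacing
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma, counts = sample_variogram(coords, values, bin_width=1.0, n_bins=4)
```

Printing `counts` next to `gamma` is worth doing on real data: the total returned by `n_pairs` has to be shared among the classes, so narrowing the classes to get more plotted points necessarily leaves fewer pairs behind each point.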

The spatial pattern of the data locations has a big effect on the sample variogram. The sample variogram is only an unbiased estimator of the variogram values if the "drift" is constant.

Have you done any exploratory analysis on the data, e.g. made a coded plot of the data locations (each point coded by the data value), made a histogram of the data values, or fitted a trend surface to the data? What do you know about the particular phenomenon that generated the data? Have you checked the literature for papers analysing similar data?

Sometimes it is useful to use a log transform of the data (that leads to some complications, however). Other transformations are much more difficult to use, although an indicator transform is sometimes useful.
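As a rough illustration of the two transforms mentioned above, the sketch below applies a natural-log transform and an indicator transform to a small made-up sample and compares skewness before and after. The helper names, the sample values, and the threshold are hypothetical. One of the complications with the log route, as I understand it, is that estimates made on the log scale cannot simply be exponentiated back without bias.

```python
import math


def log_transform(values):
    """Natural log of each value; only defined for strictly positive data."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [math.log(v) for v in values]


def indicator_transform(values, threshold):
    """1 where the value is at or below the threshold, otherwise 0."""
    return [1 if v <= threshold else 0 for v in values]


def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample with one large value
raw = [1.0, 2.0, 3.0, 4.0, 50.0]
logged = log_transform(raw)
indicators = indicator_transform(raw, threshold=3.0)
```

The indicator version throws away magnitude and keeps only which side of the threshold each value falls on, which is why it is insensitive to a few extreme values.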

Finally, remember that the variogram must satisfy several important mathematical conditions, and the sample variogram usually will not satisfy these.
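The usual way around this is to fit one of the standard valid model forms to the sample variogram points instead of using the points directly. As one example, here is a sketch of the spherical model; the parameter names (`nugget`, `partial_sill`, `range_`) and the numbers in the usage lines are my own illustrative choices, and other valid families (exponential, Gaussian, and so on) could equally be used.

```python
def spherical_variogram(h, nugget, partial_sill, range_):
    """Spherical model: 0 at h = 0, rising to nugget + partial_sill at the range."""
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0
    if h >= range_:
        return nugget + partial_sill
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


# Hypothetical parameters, evaluated at a few lags
lags = [0.0, 5.0, 10.0, 20.0]
model = [spherical_variogram(h, nugget=0.1, partial_sill=0.9, range_=10.0)
         for h in lags]
```

Fitting then amounts to choosing the nugget, sill and range so the curve passes reasonably through the sample variogram points, ideally giving more weight to classes with more pairs.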

Andrew W. Rate

Thanks Donald, there is a lot of thoughtful advice here for which I'm most grateful. As a geochemist/soil chemist I'm a bit new to geostatistics!

I'm already transforming variables if exploratory analysis shows skewness (I'm using the 'geoR' package in R, which makes this pretty easy). I've checked for location and data errors thoroughly. Possibly the main issue with the data is lack of observations; in the case in question, I have only 77 unique locations.

I've done interpolations of the data already using a cubic spline (not sure if this qualifies as a trend surface, though). I'll try making sample variograms with fewer distance classes, and also using a model with a mean trend instead of the constant mean I'm using currently.

Mehmet Cüneyd Demirel

The Spatial Efficiency metric (SPAEF) has been shown to be robust when comparing two raster maps. Python and MATLAB code is available at: http://space.geus.dk/tools_products/index.html
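For readers who want a feel for what the metric computes before downloading the code, here is a pure-Python sketch based on my reading of the published definition: SPAEF = 1 - sqrt((alpha - 1)^2 + (beta - 1)^2 + (gamma - 1)^2), where alpha is the Pearson correlation between the two maps, beta is the ratio of their coefficients of variation, and gamma is the histogram overlap of the z-scored maps. The function names, the default of 100 histogram bins, and the flattened-list inputs are assumptions of this sketch. It does no missing-value masking and will fail for constant or zero-mean maps, so treat the code at the link above as the reference.

```python
import math


def _mean(x):
    return sum(x) / len(x)


def _std(x):
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def _histogram(z, lo, hi, n_bins):
    """Counts in n_bins equal-width bins on [lo, hi]; hi goes in the last bin."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in z:
        k = min(int((v - lo) / width), n_bins - 1)
        counts[k] += 1
    return counts


def spaef(obs, sim, n_bins=100):
    """SPAEF for two equally sized maps given as flat lists of cell values."""
    mo, ms = _mean(obs), _mean(sim)
    so, ss = _std(obs), _std(sim)
    n = len(obs)
    # alpha: Pearson correlation coefficient
    alpha = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (n * so * ss)
    # beta: ratio of coefficients of variation (simulated over observed)
    beta = (ss / ms) / (so / mo)
    # gamma: overlap of the histograms of the z-scored maps
    zo = [(o - mo) / so for o in obs]
    zs = [(s - ms) / ss for s in sim]
    lo, hi = min(zo + zs), max(zo + zs)
    ho = _histogram(zo, lo, hi, n_bins)
    hs = _histogram(zs, lo, hi, n_bins)
    gamma = sum(min(a, b) for a, b in zip(ho, hs)) / sum(ho)
    return 1 - math.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


# Hypothetical 6-cell "maps": identical, and spatially reversed
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = list(reversed(a))
same = spaef(a, a)
flipped = spaef(a, b)
```

The reversed map has exactly the same histogram and variability as the original, so only the correlation term penalises it. That is the point of combining the three components: a map can match the value distribution and still be wrong about where the values are.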

Two papers on the metric:

https://www.hydrol-earth-syst-sci.net/22/1299/2018/hess-22-1299-2018-assets.html

https://www.geosci-model-dev-discuss.net/gmd-2017-238/


Sounds very interesting! - is there a paper published based on this work?
