First question: how many data locations do you have, and what is the geographical extent of the smallest region that contains all of them? Is it possible that you have measurement errors in the data and/or errors in the data location coordinates?
The sample variogram ("semivariogram" is outmoded terminology) does not directly estimate the variogram as a function; it only provides estimates of the values of the variogram for specified lag distances/directions. The sample variogram is computed by pairing data locations, and the total number of pairs is completely determined by the number of data locations (n locations give n(n-1)/2 pairs). The pairs are then allocated to different distance classes, but there is a conflict here. Ideally you want a large number of pairs in each distance class, but then you can only consider a small number of distance classes, i.e. fewer plotted points on the sample variogram, which means you have less information about the shape of the variogram.
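As a rough illustration of the pairing and binning described above, here is a minimal Python sketch. It assumes the classical estimator (half the mean squared difference of the values in each distance class), ignores direction, and uses made-up random data; the function name `sample_variogram`, the 50 locations and the maximum lag of 70 are all hypothetical choices, not anything taken from your data.

```python
import math
import random


def sample_variogram(coords, values, bin_width, n_bins):
    """Classical estimator: half the mean squared difference per distance class.

    Returns a list of (mean lag, variogram estimate, number of pairs),
    one entry for each distance class that received at least one pair.
    Pairs farther apart than bin_width * n_bins are ignored.
    """
    sq_sums = [0.0] * n_bins
    lag_sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):
            h = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            k = int(h // bin_width)  # index of the distance class
            if k < n_bins:
                sq_sums[k] += (values[i] - values[j]) ** 2
                lag_sums[k] += h
                counts[k] += 1
    return [(lag_sums[k] / counts[k], sq_sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_bins) if counts[k] > 0]


# Hypothetical data: 50 random locations in a 100 x 100 square.
random.seed(0)
n = 50
coords = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(n)]
values = [random.gauss(0, 1) for _ in range(n)]

print("total pairs:", n * (n - 1) // 2)
max_lag = 70.0  # assumed cutoff, roughly half the diagonal of the square
for n_bins in (5, 15):
    result = sample_variogram(coords, values, max_lag / n_bins, n_bins)
    print(n_bins, "classes, pairs per class:", [c for _, _, c in result])
```

Running it with 5 and then 15 classes shows the trade-off directly: more classes give more plotted points, but each point rests on fewer pairs.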
The spatial pattern of the data locations has a big effect on the sample variogram. The sample variogram is only an unbiased estimator of the variogram values if the "drift" is constant.
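To see why the drift matters, suppose (purely as an illustrative assumption) that the data can be written as Z(x) = m(x) + R(x), where m is the drift and R is a zero-mean residual with variogram γ. Then the expected squared difference that the sample variogram averages is:

```latex
\[
\tfrac{1}{2}\,E\!\left[\bigl(Z(x+h) - Z(x)\bigr)^{2}\right]
  = \gamma(h) + \tfrac{1}{2}\bigl(m(x+h) - m(x)\bigr)^{2}
\]
```

The cross term vanishes because R has zero mean, so the second term is a non-negative bias that disappears only when m(x+h) = m(x), i.e. when the drift is constant.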
Have you done any exploratory analysis on the data, e.g. made a coded plot of the data locations (each point coded by the data value), made a histogram of the data values, or fitted a trend surface to the data? What do you know about the particular phenomenon that generated the data? Have you checked the literature for papers analysing similar data?
Sometimes it is useful to use a log transform of the data (that leads to some complications, however). Other transformations are much more difficult to use, although an indicator transform is sometimes useful.
Finally, remember that the variogram must satisfy several important mathematical conditions, and the sample variogram usually will not satisfy these.
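The usual way around this is to fit one of the standard models that are known to satisfy those conditions, rather than using the plotted points directly. As one example, here is a sketch of the spherical model with a nugget, which is valid in one, two and three dimensions. The function name and the parameter values are hypothetical and only for illustration; the right model and parameters depend on your data.

```python
def spherical_variogram(h, nugget, partial_sill, range_):
    """Spherical variogram model with a nugget effect.

    h is a lag distance (>= 0); range_ is the distance at which the
    model reaches its sill (nugget + partial_sill).
    """
    if h <= 0:
        return 0.0  # a variogram is zero at lag zero by definition
    if h >= range_:
        return nugget + partial_sill
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


# Hypothetical parameters, evaluated at a few lags.
for lag in (0, 2.5, 5, 10, 20):
    print(lag, spherical_variogram(lag, nugget=0.1, partial_sill=0.9, range_=10.0))
```

Packages such as geoR provide this and other valid models, so in practice you would pick one there and fit its parameters to the sample variogram.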
Thanks Donald, there is a lot of thoughtful advice here for which I'm most grateful. As a geochemist/soil chemist I'm a bit new to geostatistics!
I'm already transforming variables if exploratory analysis shows skewness (I'm using the 'geoR' package in R, which makes this pretty easy). I've checked for location and data errors thoroughly. Possibly the main issue with the data is lack of observations; in the case in question, I have only 77 unique locations.
I've done interpolations of the data already using a cubic spline (not sure if this qualifies as a trend surface, though). I'll try making sample variograms with fewer distance classes, and also using a model with a mean trend instead of the constant mean I'm using currently.
The Spatial Efficiency metric (SPAEF) has proven to be robust for comparing two raster maps. Python and Matlab code is available at: http://space.geus.dk/tools_products/index.html
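For orientation, here is a minimal Python sketch of the metric as I understand its published definition: SPAEF = 1 - sqrt((α-1)² + (β-1)² + (γ-1)²), where α is the Pearson correlation between the two maps, β is the ratio of their coefficients of variation, and γ is the overlap of the histograms of their z-scores. This is not the code from the link above; the function name, the flat lists of cell values and the default of 100 bins are my own assumptions, so check against the official implementation before relying on it.

```python
import math


def _mean(x):
    return sum(x) / len(x)


def _std(x):
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def spaef(obs, sim, n_bins=100):
    """Sketch of SPAEF for two maps given as equal-length lists of cell values.

    Assumes missing cells were removed beforehand, neither map is constant,
    and neither mean is zero (otherwise a division by zero occurs).
    """
    mo, ms = _mean(obs), _mean(sim)
    so, ss = _std(obs), _std(sim)
    # alpha: Pearson correlation between the two maps
    alpha = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (len(obs) * so * ss)
    # beta: ratio of the coefficients of variation
    beta = (ss / ms) / (so / mo)
    # gamma: histogram intersection of the z-scored maps on shared bins
    zo = [(v - mo) / so for v in obs]
    zs = [(v - ms) / ss for v in sim]
    lo = min(min(zo), min(zs))
    hi = max(max(zo), max(zs))
    width = (hi - lo) / n_bins

    def hist(z):
        counts = [0] * n_bins
        for v in z:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
        return counts

    ho, hs = hist(zo), hist(zs)
    gamma = sum(min(a, b) for a, b in zip(ho, hs)) / sum(ho)
    return 1 - math.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


# Hypothetical cell values from two small maps.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
simulated = [1.5, 1.8, 3.2, 4.1, 4.7, 6.3, 6.9, 8.4]
print(spaef(observed, simulated))
```

A value of 1 means the two maps agree perfectly on all three components; lower values mean poorer agreement.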