I am doing a geostatistical analysis (variogram + kriging) of "presence-only" data in a species distribution modeling context. When estimating the empirical variogram, the attribute is assumed to be a realization of a continuous random variable (although an attribute can also be a count). If the attribute is just presence, with no sub-categories, then the value is the same at every location (say 1, if a presence is denoted by 1). Every squared difference between pairs of observations is then zero, so the variogram cannot be meaningfully estimated, and neither can the indicator variogram.
Some papers, such as [1] and the references therein, use a grid-based approach. A grid of a certain cell size (e.g. 10 x 10 m) is superimposed on the sampling area and the number of species inside each cell is counted. This produces data in the form of a count/frequency table.
In the other approach, pseudo-absences or background data are generated using some algorithm, e.g. Maxent (see e.g. [2, 3]). The pseudo-absences are generated taking many factors into account and are stacked/combined with the actual data. This amounts to generating x, y coordinates and assigning them an absence status (say 0). The result is a binary data set with two categories: presence (1) and absence (0).
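To make the three situations above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the simulated coordinates, the 100 x 100 m plot, the 10 m cell size, the 1:1 ratio of pseudo-absences to presences, and above all the uniform random placement of pseudo-absences, which is only the simplest possible stand-in for the algorithms used in [2, 3].

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical presence-only records: x, y in metres inside a 100 x 100 m plot.
presence_xy = rng.uniform(0, 100, size=(200, 2))


def empirical_semivariance(xy, z, lag, tol):
    """Matheron estimator for one lag class [lag - tol, lag + tol]."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    in_class = np.triu((d >= lag - tol) & (d <= lag + tol), k=1)
    i, j = np.where(in_class)
    if i.size == 0:
        return np.nan
    return 0.5 * np.mean((z[i] - z[j]) ** 2)


# 1) Presence only: the attribute is 1 everywhere, so every squared
#    difference is zero and the semivariance is zero at every lag.
ones = np.ones(len(presence_xy))
gamma_const = empirical_semivariance(presence_xy, ones, lag=20.0, tol=5.0)

# 2) Grid approach: count the records falling in each cell.
cell = 10.0  # assumed cell size in metres
edges = np.arange(0, 100 + cell, cell)
counts, _, _ = np.histogram2d(
    presence_xy[:, 0], presence_xy[:, 1], bins=[edges, edges]
)

# 3) Pseudo-absence approach: generate coordinates, label them 0,
#    and stack them with the presences (labelled 1).
#    Uniform random placement is an assumption made for simplicity.
n_absence = 200
absence_xy = rng.uniform(0, 100, size=(n_absence, 2))
xy = np.vstack([presence_xy, absence_xy])
status = np.concatenate([np.ones(len(presence_xy)), np.zeros(n_absence)])
```

Here `counts` is the count table for the grid approach, and `xy` with `status` is the binary presence/absence data set to which an indicator variogram could be applied.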
The questions that are bothering me are:
1. For the grid-based approach, what is the optimal cell size, and how do I find and justify it? How should I then proceed with the variogram and kriging?
2. For the pseudo-absence/background approach, how many absences should be generated relative to the number of actual records, and how do I decide this? How should I then proceed with the variogram and kriging?
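For question 1, the mechanics of "proceeding with the variogram" on gridded counts can be sketched as follows: treat each cell count as an observation located at the cell centre and compute the empirical variogram for a few candidate cell sizes. This is only an exploratory sketch, not a rule for choosing the optimal cell size. The simulated records, the plot extent, the candidate cell sizes and the lag classes are all assumptions, and `grid_counts` and `empirical_variogram` are hypothetical helper names.

```python
import numpy as np


def grid_counts(xy, extent, cell):
    """Count points per cell; return cell-centre coordinates and counts.

    Assumes a square plot [0, extent] x [0, extent] with extent a multiple of cell.
    """
    edges = np.arange(0.0, extent + cell, cell)
    counts, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=[edges, edges])
    centers = 0.5 * (edges[:-1] + edges[1:])
    cx, cy = np.meshgrid(centers, centers, indexing="ij")
    return np.column_stack([cx.ravel(), cy.ravel()]), counts.ravel()


def empirical_variogram(xy, z, lag_edges):
    """Matheron estimator for lag classes (lag_edges[k], lag_edges[k + 1]]."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    sq = (z[:, None] - z[None, :]) ** 2
    iu = np.triu_indices(len(z), k=1)  # each pair once
    d, sq = d[iu], sq[iu]
    gamma = np.full(len(lag_edges) - 1, np.nan)
    for k in range(len(lag_edges) - 1):
        m = (d > lag_edges[k]) & (d <= lag_edges[k + 1])
        if m.any():
            gamma[k] = 0.5 * sq[m].mean()
    return gamma


rng = np.random.default_rng(1)
records = rng.uniform(0, 100, size=(200, 2))  # hypothetical presence records
lag_edges = np.array([0.0, 20.0, 40.0, 60.0])  # assumed lag classes in metres

results = {}
for cell in (5.0, 10.0, 20.0):  # assumed candidate cell sizes
    centres, z = grid_counts(records, extent=100.0, cell=cell)
    results[cell] = empirical_variogram(centres, z, lag_edges)
    print(cell, results[cell])
```

Because the counts scale with cell area, the variograms for different cell sizes are not directly comparable in magnitude; comparing their shape (nugget, range) is more meaningful than comparing raw values.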
References
1. Rossi, Richard E., et al. “Geostatistical Tools for Modeling and Interpreting Ecological Spatial Dependence.” Ecological Monographs, vol. 62, no. 2, 1992, pp. 277–314. www.jstor.org/stable/2937096.
2. Hengl, Tomislav, et al. “Spatial Prediction of Species’ Distributions from Occurrence-Only Records: Combining Point Pattern Analysis, ENFA and Regression-Kriging.” Ecological Modelling, vol. 220, no. 24, 2009, pp. 3499–3511.
3. https://cran.r-project.org/web/packages/dismo/vignettes/sdm.pdf