Geostatistics follows the principle of distance decay relationship among the continuous variables. But if the variables are discrete (I am not sure if it's appropriate) categorical or somehow generalized, there should be an application too.
Indicator geostatics is a suitable approach. But often, a "nearest neighbor" interpolation (if needed with different weights for different directions in the calculation of the distance) does just as fine. See: DM Tartakovsky, B Wohlberg, A Guadagnini: Nearest-neighbor classification for facies delineation, WATER RESOURCES RESEARCH 43(7): W07201 DOI: 10.1029/2007WR005968 Published: JUL 20 2007
You can perform interpolation on sample points of categorical variables using indicator geostatistical approaches. You can find the pdfs of the slides of a geostatistical course in this site: http://www.dpi.inpe.br/spring/english/manuals.html . The presentation number 8 of this course contains the theory to interpolate categorical samples with indicator kriging and simulation. In this course you have also laboratories with sample data and instructions to run geostatistical predictions using the SPRING GIS that is available freeware in www.dpi.inpe.br/spring
Thanks everyone for the very useful suggestions. Going through your answers and further literature review made me believe that Indicator Kriging is a highly suitable approach for interpolating categorical data. I haven't applied IK so far, but there have been a whole range of applications of it in determining climatic stages (for example dry-wet stages defined by precipitation) and land cover types. This also goes back to the Logistic Regression suggested by Duccio. If a logistic binary indicator can be set based on the category definition, IK should work. I will try to develop the algorithm and keep you all updated.
1. Discrete random variables and categorical variables are not the same thing, "discrete" refers to the set of possible values of the random variable, e.g Binomial is discrete. Categorical is really a different quantity and may vary in type. In some cases the categorical variable is simply yes, no and for those Indicator kriging can be a good choice. Note that "stationarity" would pertain to the indicator variable not to the categorical variables,
2. If you have both numerical valued variables and categorical then the categorical might be handled via "external drift:, i.e. essentially a regression on the categorical variables to model the non-stationary mean.
3. Another possibility for categorical variables is to use geographically weighted regression but again you are essentially modeling the non-stationary mean.
4. If your categorical variable has more than "yes no" possible values then the problem is not so simple. E.g suppose you consider the variable "land cover: which in most cases will have three or more possible values. One might suppose that if you transform to an indicator with values 1,2.3,... it would be okay but there is no sense of an ordering on the land cover values whereas the transformed values definitely do. Moreover the transformed values have numerical relationships that are likely not correct representations of the land cover values. Then there is the question of how to interpret the results after some form of interpolation on these transformed values. In the case of only two values the kriged indicator values can be interpreted as probabilities.
It depends a lot on what the application is and how much you understand of the phenomena that supposedly generates the data. With numerical data the idea of "support" is much clearer but not so for categorical variables.