I think you can check whether there is spatial autocorrelation in your study area before doing your analysis, using Moran's I index for example. If it is significant, you can use SAR models to take the spatial autocorrelation into account, or another suitable model for your data.
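A minimal sketch of that check, in plain Python. Everything specific here is an assumption made for illustration: the 6 x 7 layout of the 42 sites, the made-up concentration values, the binary "within 3.5 km" neighbour rule, and the function names. Significance is judged with a simple permutation test rather than the analytical formula.

```python
import math
import random


def neighbour_pairs(coords, bandwidth):
    """Directed pairs (i, j) of distinct sites no farther apart than bandwidth."""
    n = len(coords)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and math.dist(coords[i], coords[j]) <= bandwidth]


def morans_i(values, pairs):
    """Global Moran's I with binary weights (w_ij = 1 for each listed pair)."""
    if not pairs:
        raise ValueError("no neighbouring sites within the bandwidth")
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = sum(dev[i] * dev[j] for i, j in pairs)
    total = sum(d * d for d in dev)
    return (n / len(pairs)) * (cross / total)


def permutation_p_value(values, pairs, n_perm=999, seed=0):
    """Observed I and a one-sided pseudo p-value for positive autocorrelation."""
    rng = random.Random(seed)
    observed = morans_i(values, pairs)
    shuffled = list(values)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between value and location
        if morans_i(shuffled, pairs) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical layout: 6 x 7 = 42 sites on a 3 km grid, coordinates in km.
coords = [(3.0 * col, 3.0 * row) for row in range(7) for col in range(6)]
# Made-up concentrations following a smooth gradient across the area.
values = [x + 0.5 * y for x, y in coords]

pairs = neighbour_pairs(coords, bandwidth=3.5)  # rook neighbours only
moran, p_value = permutation_p_value(values, pairs)
print(f"Moran's I = {moran:.3f}, pseudo p-value = {p_value:.3f}")
```

With real data you would run this once per element, and the choice of neighbour rule (distance band, inverse distance, k nearest neighbours) can change the result, so it is worth trying more than one.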
Hi. First of all, you should let us know what you want to interpolate, i.e. what the response variable is. Also tell us what data you have for the response variable(s) and for the potential predictors.
Right, Cristian. My response variables are elemental concentrations (mainly heavy metals, 15 elements in total) in plant tissues sampled over a study area. Data are reported at sampling-unit level, the sampling units being located on a 3 km systematic grid, 42 sampling units in total. Basically, I'd simply be interested in interpolating each element over the area. I don't have many predictors, apart from the geographic coordinates, the elevation, and possibly other descriptors of the geomorphology/land use which I could obtain from thematic layers. Thanks in advance for any suggestions.
Well, heavy metals in plant tissues are not exactly my area of expertise. However, it seems to me that qualitative predictors, such as land use and maybe soil type, may also explain the spatial variability of the heavy metal concentrations, besides quantitative predictors such as elevation, topographic wetness index or maybe the distance from pollution sources. To integrate qualitative and quantitative predictors, you could try an analysis of covariance. XLSTAT has a good covariance module.
You have already given half of the actual answer: treat the study area as a basin, i.e. take a "basin-wise approach". In applying this approach to an area of complex geomorphology, try to establish what stage of "maturity" the basin has reached, so as to understand its evolution: whether it is in a youthful, premature or mature stage, and how much sediment has been transferred out of the basin. This sort of calculation can be done by plotting the area-versus-height curve, known as "hypsometric analysis".
A second approach is a purely statistical model. Because of the heterogeneity of the area, try to define a "raw-data matrix" containing all the dependent variables and the key independent variables. Take the highly correlated variables, with R greater than 65%, and run a multiple regression analysis using a step-wise technique, i.e. start with the smallest number of variables and add one at each step, while watching the standard error of the estimate and other diagnostics such as Student's t-test. Select the best equation and use it as the one that best fits your complex geomorphology.
I am assuming (or sure!) that by interpolation you meant spatial interpolation. I can give you some ideas from the geostatistical side, which I have expertise in. To recap what Samir wrote, a multivariate analysis of the spatial variables, and a selection of the limiting variables from it, is a prerequisite before you apply any spatial interpolation model. This also goes back to the complex geomorphology, which needs to be dealt with beforehand, so that the highly correlated variables can be determined. I would suggest that you run a simple multiple linear regression between the geomorphological predictors and your predictand (heavy metals). Once the pattern of spatial variability is clear, spatial interpolation should not be a problem. It would also be interesting to see which interpolation model gives the better cross-validation performance, so a preliminary evaluation is necessary.
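As a rough illustration of that preliminary evaluation, here is a leave-one-out cross-validation sketch in plain Python. The interpolators compared (inverse distance weighting with two powers, plus a mean-only baseline) are stand-ins chosen for brevity, not a recommendation, and the grid layout, values and function names are all hypothetical.

```python
import math


def idw_predict(train_xy, train_z, target, power=2.0):
    """Inverse-distance-weighted estimate at target from the training sites."""
    weighted_sum = 0.0
    weight_total = 0.0
    for site, z in zip(train_xy, train_z):
        d = math.dist(site, target)
        if d == 0.0:
            return z  # target coincides with a sampled site
        w = 1.0 / d ** power
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total


def mean_predict(train_xy, train_z, target):
    """Baseline that ignores location and returns the training mean."""
    return sum(train_z) / len(train_z)


def loo_rmse(xy, z, predictor):
    """Leave-one-out RMSE: predict each site from all the others."""
    squared = 0.0
    for k in range(len(z)):
        pred = predictor(xy[:k] + xy[k + 1:], z[:k] + z[k + 1:], xy[k])
        squared += (pred - z[k]) ** 2
    return math.sqrt(squared / len(z))


# Hypothetical 42-site layout on a 3 km grid (km), with made-up values.
xy = [(3.0 * col, 3.0 * row) for row in range(7) for col in range(6)]
z = [x + 0.5 * y for x, y in xy]

candidates = {
    "mean baseline": mean_predict,
    "IDW power 1": lambda s, v, t: idw_predict(s, v, t, power=1.0),
    "IDW power 2": lambda s, v, t: idw_predict(s, v, t, power=2.0),
}
scores = {name: loo_rmse(xy, z, f) for name, f in candidates.items()}
for name, rmse in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: LOO RMSE = {rmse:.3f}")
```

Any other interpolator (kriging, regression on predictors, and so on) can be compared the same way, as long as it is wrapped in a function that takes the training sites, the training values and a target location.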
It would be helpful (in trying to respond to your question) to know more about the problem you are trying to address. I suggest that you may want to do more exploratory analysis, e.g. histograms for each heavy metal concentration, Principal Component Analysis at least for the 15 elements, Trend Surface analyses for each element, i.e. a linear regression for each element vs a polynomial in the position coordinates (first degree and possibly a second degree polynomial), sample variograms (although with only 42 data locations the results are not likely to be very good).
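For the trend surface step, a minimal sketch in plain Python might look like the following. It fits a first- or second-degree polynomial in the coordinates by ordinary least squares (normal equations solved by Gaussian elimination) and reports R². The grid, the made-up values and all function names are assumptions for illustration. With real coordinates it helps to centre and scale them first, since raw powers of large coordinates make the normal equations badly conditioned.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def design_row(x, y, degree):
    """Polynomial terms in the coordinates for one site."""
    if degree == 1:
        return [1.0, x, y]
    if degree == 2:
        return [1.0, x, y, x * x, x * y, y * y]
    raise ValueError("only degree 1 or 2 is supported in this sketch")


def fit_trend_surface(coords, values, degree=1):
    """Least-squares polynomial trend; returns (coefficients, R squared)."""
    rows = [design_row(x, y, degree) for x, y in coords]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(p)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * t for b, t in zip(beta, r)) for r in rows]
    mean = sum(values) / len(values)
    ss_res = sum((v - f) ** 2 for v, f in zip(values, fitted))
    ss_tot = sum((v - mean) ** 2 for v in values)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical 42-site layout on a 3 km grid (km), with made-up values.
coords = [(3.0 * col, 3.0 * row) for row in range(7) for col in range(6)]
values = [2.0 + 0.5 * x - 0.25 * y for x, y in coords]

for degree in (1, 2):
    beta, r2 = fit_trend_surface(coords, values, degree)
    print(f"degree {degree}: R^2 = {r2:.4f}, coefficients = "
          + ", ".join(f"{b:.3f}" for b in beta))
```

Run once per element, this gives a quick picture of how much of the variability is a large-scale trend. The residuals from the fit are then the natural input for the sample variograms.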
If I understand you correctly, the data locations are spread rather far apart. This, together with the relatively small number of data locations, makes it unlikely that any interpolation method will produce very good results.
In doing the regression, what software did you use? Any decent software should produce a number of diagnostics that help in interpreting the results, or at least in judging their reliability. If you are not familiar with or not using R (free and open source), it has lots of very good tools. Search Google for "R project"; you can download and easily install the binaries for Windows, Mac, Unix and Linux. R also links very well with a couple of the open source GIS programs. You can find a lot of tutorial material for R with a Google search.
Do you have the option or opportunity of collecting more data?
It is tempting to want a simplistic answer, but I suggest that it would not give the best result. You want to obtain a better understanding of the spatial distribution of the heavy metals, and that can require several different kinds of analysis.
Here's my two cents: how do you think the heavy metals got into the plant tissue in the first place? Dust? Variation in soil parent material? The mechanism by which the plant accesses any particular element should inform the kind of model that you build. For example, if the plant is getting the element via water, you want your model to look like a hydrology model, but if it is carried in via dust, you want your model to be based on wind directions and seasonal components. I find the math is secondary to the function.
1. Classify the trees according to landforms, e.g. dunes, channel bars, point bars, pediments and so on.
2. As soon as you do this, the processes operating in the regions are classified as well.
3. You need to understand the buffer time (hiatus of the landform). This is the time phase when erosion and pedogenesis take place.
4. The growth response of the trees would be related to the environmental conditions during this buffer time.
5. Finally, these trees absorb heavy metals from 1-2 m of the sediment profile.
6. Characterise the sediment profiles (at least 2 m deep) from each landform.
7. Now you can carry out a statistical analysis to understand the relationship between the given landforms and the tree population (species-wise as well as total population).
I feel that adopting this approach should help you untangle the complex landforms and their relationship with the results you have obtained from analysing the trees.
Thank you everybody for your help. I got a lot of hints for handling what I'm doing better than I could have on my own.
@Donald: good hint, I can more or less hold my own with R. So, I'll review some of the methods you are all suggesting and see what I can produce. Above all, thank you indeed for your suggestion "not to be tempted to find a simplistic solution"!
@Manuel and Avit: thanks, I'll try to include the dummy variables to understand the variability before possibly trying to run a spatial interpolation.
@Susan: actually I'm studying the effect of a point source of gaseous (dust) emissions, so the hypothesis is that dry and wet deposition from the atmosphere are the main contributors.
@Dhananjay: I guess that my context is slightly different from erosion questions, but thank you for putting the "temporal" point on the table.