I am trying to see the trend in climatic variability in my study area. The problem is that there is no meteorological station within it. I therefore tried to collect temperature data from nearby stations, but I found only four stations that provide temperature data.
The first two questions I would ask are: what is the size of your study area, and how far are the neighboring stations from it? Embedded in the first question is another: how are you representing your study area, as a point or as a polygon (area)? The choice of a proper method depends on the answers. Don't forget that the realizations of the climatic phenomena at the four neighboring points represent the variability at those stations only. So, if your study area is a polygon and the neighboring stations are too far from its border, I would rather suggest extracting the climate signal for your study area from global climate model data. If you represent your study area as a point and the neighboring stations are at satisfactory spatial lags (or if you represent it as an area with the neighboring stations at its border), then spatial interpolation might help you. It can be a simple IDW (inverse distance weighting) interpolation or a more complicated kriging with external drift, in which you take into account the lapse-rate effect on temperature and the windward and leeward effects on precipitation. The simplest approach would be to compute a geographically weighted average of the station trends, either representing your study area as a point or treating that average as your regional average. Whatever method you choose, it is very important to be aware of the uncertainty associated with it. For example, a variogram estimated from four spatial points (only six station pairs) would definitely not be a precise one (or rather, there would be no usable variogram). I am not sure how long your time series is, but you can probably compensate for the lack of spatial points with the temporal dimension; please have a look at Christakos (2000), Modern Spatiotemporal Geostatistics, Oxford University Press, for details.
In a nutshell, there are a bunch of techniques available to solve your problem. You need to choose one based on the characteristics of your study region, while being aware of the uncertainties that come with the chosen method.
If you want to make a series of estimates over the years at a single point (the centroid or geometric centre) of the area using the records of the four stations, I think what Roland Kröbel has suggested is a good idea: use a deterministic method like IDW, which he has hinted at.
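To make the IDW idea concrete, here is a minimal Python sketch, not a definitive implementation. The station coordinates, the centroid, the annual means, and the function names `haversine_km` and `idw` are all hypothetical and only serve as illustration; the power of 2 is a common default, not something the data prescribes.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def idw(target, stations, values, power=2.0):
    """Inverse distance weighted estimate at target = (lat, lon).

    stations is a list of (lat, lon) pairs, values holds one value per station.
    """
    weights = []
    for (lat, lon), value in zip(stations, values):
        d = haversine_km(target[0], target[1], lat, lon)
        if d == 0.0:
            return value  # target coincides with a station
        weights.append(1.0 / d ** power)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical station positions and annual mean temperatures (deg C)
stations = [(10.0, 38.0), (10.5, 38.2), (9.8, 38.6), (10.3, 37.7)]
centroid = (10.2, 38.1)
annual_means = {
    2001: [18.2, 17.5, 19.0, 18.8],
    2002: [18.4, 17.9, 19.1, 18.7],
}

# One interpolated value per year at the centroid
series = {year: idw(centroid, stations, temps) for year, temps in annual_means.items()}
for year in sorted(series):
    print(year, round(series[year], 2))
```

Note that an IDW estimate always lies between the lowest and highest station value, so it cannot reproduce a site that is warmer or colder than all four stations.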
If the difference in elevation between the four stations and the location for which you want to interpolate is small (< 100 m), you can use distance as the weighting factor for interpolating temperature (as in inverse-square-distance interpolation or kriging). But if there are significant elevation differences, you also have to correct for them by applying a lapse-rate correction.
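A rough Python sketch of such a correction: each station temperature is first shifted to the elevation of the target location and then weighted by inverse squared distance. The lapse rate of 6.5 °C per km is the standard-atmosphere value and is only an assumption here; a locally fitted value may well differ. The function names and the example numbers are hypothetical.

```python
LAPSE_RATE = 0.0065  # deg C per metre; standard-atmosphere value, an assumption

def adjust_to_elevation(temp, station_elev_m, target_elev_m, lapse_rate=LAPSE_RATE):
    """Shift a station temperature to the elevation of the target location."""
    return temp - lapse_rate * (target_elev_m - station_elev_m)

def idw_with_lapse(temps, elevs_m, dists_km, target_elev_m, power=2.0):
    """Lapse-rate-corrected inverse distance weighting.

    dists_km are the (non-zero) distances from each station to the target.
    """
    adjusted = [adjust_to_elevation(t, z, target_elev_m) for t, z in zip(temps, elevs_m)]
    weights = [1.0 / d ** power for d in dists_km]
    return sum(w * t for w, t in zip(weights, adjusted)) / sum(weights)

# Hypothetical example: four stations, target location at 1500 m
temps = [21.4, 19.8, 17.2, 15.9]          # deg C
elevs_m = [800.0, 1100.0, 1600.0, 1900.0]
dists_km = [35.0, 22.0, 40.0, 58.0]
estimate = idw_with_lapse(temps, elevs_m, dists_km, target_elev_m=1500.0)
print(round(estimate, 2))
```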
I agree with everything written above. But then again, you should consider the loss of resolution when interpolating the data using (relatively) distant stations. The local temperature (on the scale of, say, 0-100 m) is controlled by micrometeorological factors, which only local measurements can "see". Thus, depending on the distance to the nearest station, you will be losing resolution: with increasing distance, only progressively longer averages (hourly, then daily, then weekly, etc.) would be representative.
Treat this as a sub-question to what Nani asked.
Can we really interpolate temperature data simply by considering the nearby stations? What Vinay Sehgal pointed out is one factor. But temperature also depends on a lot of other factors, such as the microclimatological factors Pavel has indicated: local rainfall (and hence vapour pressure and humidity), altitude, land use and land cover type, solar radiation and sunshine hours, and maybe wind as well. A great many factors determine the temperature.
So, unless we do some (simple) modelling to find the degree of correlation with these factors, can we really go ahead?
That depends on the type of usage. How accurate do you want to be? Yes, modelling would be best, but if someone wants a rough estimate, then simple interpolation will do.
We should understand that interpolation of data is no substitute for in-situ measurement. Interpolation entails error, and we have to minimize those errors. I agree with Ashok that it all depends on the purpose for which the data are to be interpolated. E.g. if I want to run a crop model with interpolated temperature, then I need to correct for local micrometeorological conditions. As the aim here is to study climatic variability, I suggest that the interpolation of temperature should take into account the distance of the stations from the location, and that the stations and the location should not be separated by any natural barrier (hill range, sea, lake, big city, etc.).
In my opinion, if you want to see the trend of long-term climate variability, you can easily go without interpolation, just by analysing the trends at your four reference stations. The trend depends mostly on global or mesoscale factors, and I think interpolation won't add any new information to it. On the other hand, if you need to analyse the impact of some local factors on the climatic conditions in your study area (e.g. the impact of land cover change), simple interpolation won't do and you should go with a more sophisticated model.
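As a sketch of what analysing the trends at the four stations could look like, the Python snippet below fits an ordinary least-squares line to each station's annual means and reports the slope per decade. The station names and temperature values are invented for illustration, and a plain least-squares slope is just one possible choice of trend estimator.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx

# Hypothetical annual mean temperatures (deg C) for four stations
years = list(range(2000, 2010))
station_series = {
    "station_A": [18.1, 18.3, 18.0, 18.4, 18.6, 18.5, 18.7, 18.6, 18.9, 19.0],
    "station_B": [17.4, 17.5, 17.3, 17.8, 17.7, 17.9, 18.0, 17.9, 18.2, 18.1],
    "station_C": [19.0, 19.2, 19.1, 19.1, 19.4, 19.3, 19.6, 19.5, 19.7, 19.8],
    "station_D": [18.7, 18.6, 18.9, 18.8, 19.0, 19.1, 19.0, 19.3, 19.2, 19.4],
}

# Slope per decade for each station
trends = {name: 10.0 * linear_trend(years, temps) for name, temps in station_series.items()}
for name in sorted(trends):
    print(name, round(trends[name], 2), "deg C per decade")
```

If the four slopes agree reasonably well, that supports treating the trend as a regional signal rather than a local one.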
Following the distinction suggested by Avit between area and point studies, in the second case one could proceed as follows. The climatic temperatures of a station, for a given season and latitude, depend on various geostatistical and physical parameters: the height above sea level, the distance from the sea, the orientation in the case of a slope, and the height relative to the valley bottom (this is important for minimum temperatures in the case of nocturnal inversions); the radiative properties of the soil, like the surface emissivity, could also be important. When using the 4 nearby meteorological stations, all of that must be considered. For instance, with flat orography an inverse distance weighting could be sufficient. With complex orography, on the contrary, a more sophisticated method is necessary, one that takes into account all the geostatistical variables. If the four stations differ sufficiently in height above sea level (a few hundred metres or more), you can try to establish a statistical relationship between temperature and height, for instance by linear regression, and then apply that relationship to the site. If the site is located in a valley bottom, the effect of nocturnal inversions on minimum temperature is important. If one of the four stations is also located in a valley bottom, but some of the others are not, you can estimate the effects of nocturnal inversions. An important step is to evaluate the interpolation errors. For instance, three stations could be used for interpolation and the fourth one to estimate the errors. It is very difficult (perhaps impossible) to achieve mean absolute interpolation errors below 0.5 °C; for minimum temperatures it is very difficult even to achieve mean absolute interpolation errors below 1 °C. It is then necessary to evaluate whether the interpolation errors are compatible with the trend value. Small trend values cannot be correctly estimated from interpolated data.
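The regression and error check described above could be sketched in Python roughly as follows. Each station in turn is held out, a temperature-height line is fitted to the other three, and the held-out station is used to estimate the error; repeating this for all four stations is my own extension of the "three plus one" idea. Elevations, temperatures, the site elevation, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x

def leave_one_out_mae(elevs, temps):
    """Mean absolute error when each station is predicted from the others."""
    errors = []
    for i in range(len(elevs)):
        x = elevs[:i] + elevs[i + 1:]   # elevations of the remaining stations
        y = temps[:i] + temps[i + 1:]   # temperatures of the remaining stations
        slope, intercept = fit_line(x, y)
        errors.append(abs(intercept + slope * elevs[i] - temps[i]))
    return sum(errors) / len(errors)

# Hypothetical station elevations (m) and mean temperatures (deg C)
elevs = [250.0, 600.0, 950.0, 1400.0]
temps = [22.1, 20.3, 17.8, 15.2]

slope, intercept = fit_line(elevs, temps)
site_elev = 800.0                        # hypothetical elevation of the study site
site_temp = intercept + slope * site_elev
mae = leave_one_out_mae(elevs, temps)
print(round(site_temp, 2), round(mae, 2))
```

The resulting mean absolute error can then be set against the total temperature change implied by the trend over the record; if the error is of the same size or larger, the interpolated series is unlikely to resolve the trend.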