The first question is: how many data locations do you have, and what is the geographical extent of the smallest region that contains all of them? Is it possible that you have measurement errors in the data and/or errors in the data location coordinates?
The sample variogram ("semivariogram" is outmoded terminology) does not directly estimate the variogram as a function; it only provides estimates of the values of the variogram for specified lag distances/directions. The sample variogram is computed by pairing data locations. The total number of pairs is completely determined by the number of data locations (n locations give n(n-1)/2 pairs), and the pairs are then allocated to different distance classes. There is a conflict here: ideally you want a large number of pairs in each distance class, but then you can only consider a small number of distance classes, i.e. plotted points on the sample variogram. Fewer plotted points means less information about the shape of the variogram.
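To make the pairing and binning concrete, here is a minimal sketch of the classical (Matheron, method-of-moments) estimator for the omnidirectional case. The function name `sample_variogram`, the `bin_edges` argument, and the toy data are assumptions for illustration, not part of any particular package.

```python
import numpy as np

def sample_variogram(coords, values, bin_edges):
    """Classical sample variogram per distance class (omnidirectional).

    Returns the mean lag, the variogram estimate, and the pair count
    for each class; classes with no pairs are left as NaN.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # every pair exactly once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq = 0.5 * (values[i] - values[j]) ** 2      # half squared differences
    n_bins = len(bin_edges) - 1
    # class k holds distances in (bin_edges[k], bin_edges[k + 1]]
    idx = np.digitize(dist, bin_edges, right=True) - 1
    lags = np.full(n_bins, np.nan)
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        mask = idx == k
        counts[k] = mask.sum()
        if counts[k] > 0:
            lags[k] = dist[mask].mean()
            gamma[k] = half_sq[mask].mean()
    return lags, gamma, counts

# Toy data (made up): four locations on a line, so 4 * 3 / 2 = 6 pairs.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 3.0, 2.0, 5.0]
lags, gamma, counts = sample_variogram(coords, values, [0.0, 1.5, 2.5, 3.5])
print(counts)   # pairs per distance class
print(gamma)    # one plotted point per class
```

With only six pairs, three classes already leave a single pair in the last class, which is the trade-off described above in miniature.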
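The spatial pattern of the data locations has a big effect on the sample variogram, because it determines which lag distances and directions are represented and by how many pairs. The sample variogram is only an unbiased estimator of the variogram values if the "drift" (the mean of the underlying random function) is constant.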
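The reason a non-constant drift causes bias can be seen in one line. As a sketch, assume the decomposition $Z(x) = m(x) + R(x)$, where $m$ is the drift and $R$ is a zero-mean residual with variogram $\gamma$ (the symbols $Z$, $m$, $R$ are notation introduced here, not taken from the question):

```latex
\tfrac{1}{2}\,E\!\left[\bigl(Z(x+h) - Z(x)\bigr)^2\right]
  = \tfrac{1}{2}\,E\!\left[\bigl(R(x+h) - R(x)\bigr)^2\right]
    + \tfrac{1}{2}\bigl(m(x+h) - m(x)\bigr)^2
  = \gamma(h) + \tfrac{1}{2}\bigl(m(x+h) - m(x)\bigr)^2 .
```

The cross term vanishes because $E[R(x+h) - R(x)] = 0$. The extra term is zero only when $m(x+h) = m(x)$ for the pairs used, so with a non-constant drift the sample variogram overestimates $\gamma(h)$, typically more so at longer lags.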
Have you done any exploratory analysis on the data, e.g. made a coded plot of the data locations (each point coded by the data value), made a histogram of the data values, or fitted a trend surface to the data? What do you know about the particular phenomenon that generated the data? Have you checked the literature for papers analysing similar data?
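As one example of such a check, a first-order trend surface can be fitted by ordinary least squares. This is only a sketch: the function name `fit_linear_trend` and the toy data are hypothetical, and a planar trend is an assumption that may or may not suit your data.

```python
import numpy as np

def fit_linear_trend(coords, values):
    """Fit values ~ b0 + b1*x + b2*y by ordinary least squares."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # design matrix: intercept column followed by the x and y coordinates
    design = np.column_stack([np.ones(len(values)), coords])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    residuals = values - design @ coef
    return coef, residuals

# Toy data (made up) lying exactly on the plane 2 + 0.5*x - 1.0*y.
xy = np.array([(0, 0), (1, 0), (0, 1), (1, 1), (2, 3)], dtype=float)
z = 2.0 + 0.5 * xy[:, 0] - 1.0 * xy[:, 1]
coef, residuals = fit_linear_trend(xy, z)
print(coef)        # fitted b0, b1, b2
print(residuals)   # what is left after removing the fitted plane
```

Clearly non-zero slope coefficients, or residuals with an obvious spatial pattern, would suggest that the constant-drift assumption deserves a closer look.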
Sometimes it is useful to use a log transform of the data, although that leads to some complications. Other transformations are much more difficult to use, although an indicator transform is sometimes useful.
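For reference, both transforms are one-liners. The data values and the choice of the median as the indicator threshold below are made up for illustration.

```python
import numpy as np

values = np.array([0.5, 2.0, 8.0, 1.0, 4.0])   # made-up, strictly positive data

# Log transform: only defined for strictly positive values.
log_values = np.log(values)

# Indicator transform: 1 where the value is at or below a threshold, else 0.
threshold = np.median(values)                   # hypothetical choice of cutoff
indicator = (values <= threshold).astype(int)

print(log_values)
print(indicator)
```

One of the complications with the log transform is that results obtained on the log scale cannot simply be exponentiated to get unbiased values on the original scale.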
Finally, remember that the variogram must satisfy several important mathematical conditions (most notably conditional negative definiteness), and the sample variogram usually will not satisfy these.
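The usual way around this is to fit a model from a family known to satisfy those conditions to the sample variogram points. Below is a minimal sketch using the exponential model with a nugget. The function names, the grid of candidate ranges, and the unweighted least-squares criterion are all assumptions made for illustration; other valid models and fitting criteria are equally possible.

```python
import numpy as np

def exponential_variogram(h, nugget, partial_sill, range_param):
    """Exponential model with nugget; equals 0 at h = 0 by definition."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + partial_sill * (1.0 - np.exp(-h / range_param))
    return np.where(h > 0, gamma, 0.0)

def fit_exponential(lags, gamma_hat, candidate_ranges):
    """Grid search over the range; nugget and partial sill by least squares.

    For a fixed range the model is linear in (nugget, partial_sill).
    Candidates giving a negative nugget or sill are skipped.
    Assumes all lags are strictly positive.
    """
    lags = np.asarray(lags, dtype=float)
    gamma_hat = np.asarray(gamma_hat, dtype=float)
    best = None
    for a in candidate_ranges:
        design = np.column_stack([np.ones_like(lags), 1.0 - np.exp(-lags / a)])
        coef, *_ = np.linalg.lstsq(design, gamma_hat, rcond=None)
        if coef.min() < 0:
            continue
        sse = float(np.sum((design @ coef - gamma_hat) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(coef[0]), float(coef[1]), float(a))
    if best is None:
        raise ValueError("no candidate range gave non-negative parameters")
    return best[1:]   # (nugget, partial_sill, range_param)

# Synthetic check (made up): points generated from a known model.
lags = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
gamma_hat = exponential_variogram(lags, 0.2, 1.0, 2.0)
nugget, partial_sill, range_param = fit_exponential(
    lags, gamma_hat, np.linspace(0.5, 5.0, 10)
)
print(nugget, partial_sill, range_param)
```

In practice the sample variogram points are noisy and based on unequal numbers of pairs, so weighting each point by its pair count is a common refinement.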