Assume you have lots of data measured by weather stations, but their temporal coverage is not sufficient to compute a full 30-year climatology (e.g. over the 1991-2020 period). You still want to compute some reference for each station that gives an idea of how much warmer/colder or drier/wetter a certain period (week, month, season, year...) was.
I came up with a cool trick, which I believe someone else must have used in the literature, although I could not find evidence of it anywhere. Using reanalysis (ERA5-Land) data over a period where I have coverage from both the model and the station, I can attempt to build a relationship that links the two. This can be anything from a simple linear regression to something more complicated, such as an SVM model that maps the values at the closest reanalysis grid points to the station data.
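To make the simplest variant concrete, here is a minimal sketch of the linear regression step, assuming a single nearest grid point and an ordinary least-squares fit. The function name `fit_linear` and all the numbers are made up for illustration; real input would be the paired ERA5-Land and station values from the overlap period, in the same units.

```python
def fit_linear(model_values, station_values):
    """Ordinary least-squares fit of station ~ slope * model + intercept."""
    n = len(model_values)
    if n != len(station_values) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_x = sum(model_values) / n
    mean_y = sum(station_values) / n
    # Sum of squares of the predictor and cross-product with the target
    sxx = sum((x - mean_x) ** 2 for x in model_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(model_values, station_values))
    if sxx == 0:
        raise ValueError("model values are constant, slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical overlap period: reanalysis vs. station temperature (degC).
# These values are invented for the example, not real observations.
era5_overlap = [2.0, 5.0, 9.0, 14.0, 18.0, 21.0]
station_overlap = [3.1, 5.8, 9.4, 13.9, 17.5, 20.2]

slope, intercept = fit_linear(era5_overlap, station_overlap)
print(f"station ~ {slope:.3f} * era5 + {intercept:.3f}")
```

Fitting one regression per calendar month or season instead of a single one would be a natural refinement if the bias between model and station changes through the year, but that is an assumption to check against the data.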
Once this "model" is found, I can apply it to the reanalysis data for the period 1991-2020 to get, as output, a "fictitious" climatology for the station. This works pretty well for temperature, which has a clear seasonal cycle and comparatively smooth day-to-day variability, but it fails completely to capture sudden changes in precipitation, maybe also because reanalyses hardly capture local precipitation features.
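The application step could look roughly like the sketch below, assuming a linear model (`slope`, `intercept`) has already been fitted and that the climatology is a simple per-month mean. The function names, the record layout `(year, month, value)` and the synthetic input series are hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def monthly_climatology(records, slope, intercept,
                        start_year=1991, end_year=2020):
    """Apply the fitted model to reanalysis values and average per month.

    records: iterable of (year, month, reanalysis_value) tuples.
    Returns {month: mean corrected value} for the reference period.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, value in records:
        if start_year <= year <= end_year:
            sums[month] += slope * value + intercept
            counts[month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}


def anomaly(observed, month, climatology):
    """Difference between a station value and the fictitious climatology."""
    return observed - climatology[month]


# Synthetic monthly reanalysis series with a seasonal cycle (invented data).
reanalysis = [
    (year, month, 10.0 - 10.0 * math.cos(2 * math.pi * (month - 1) / 12))
    for year in range(1991, 2021)
    for month in range(1, 13)
]

# Illustrative coefficients; in practice these come from the fitting step.
clim = monthly_climatology(reanalysis, slope=0.9, intercept=1.2)
print(anomaly(observed=22.5, month=7, climatology=clim))
```

The same structure works for weekly or seasonal references by changing the grouping key. For precipitation, a per-month mean of linearly corrected values is unlikely to be enough, for the reasons given above.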
Does anyone have literature suggestions that could help me improve the model?