Hi everyone,
I am trying to analyse the effect on housing prices in Aasane (Åsane), an area north of Bergen, Norway, once it gets a light rail line. I have transaction data from the area where the light rail already exists (Byparken to Flesland). The existing light rail was built in three phases. For all observations, both where the light rail exists and where it is coming, I also have data on health conditions, housing attributes such as square feet of living space, and walking distance to the light rail and to the city centre.
My initial thought was that I could predict housing prices in Aasane using a model trained on observations from the area that already has the light rail. The problem with this idea is that price levels differ between my training area (Byparken to Flesland) and my test area (Aasane) for reasons my model does not capture, since no observations from Aasane are included in the training set.
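To make the problem concrete, here is a toy sketch with made-up numbers (not my actual data). It assumes a single regressor, living area in square feet, and two hypothetical base price levels (1000 for the training area, 600 for Aasane, in thousands of NOK). A line fitted on the training area recovers the slope correctly but carries the training area's base level over to Aasane.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single regressor."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx
    return a, b

# Hypothetical values: living area in square feet, price in 1000 NOK.
sqft = [500, 700, 900, 1100, 1300]
train_prices = [1000 + 3 * s for s in sqft]  # stand-in for Byparken to Flesland
test_prices = [600 + 3 * s for s in sqft]    # stand-in for Aasane, lower base level

# Fit only on the training area, as in my current setup.
a, b = fit_line(sqft, train_prices)

# Predict for the test area and measure the average over-prediction.
predicted = [a + b * s for s in sqft]
mean_error = mean(p - t for p, t in zip(predicted, test_prices))
print(f"intercept={a:.1f}, slope={b:.1f}, mean error={mean_error:.1f}")
```

In this toy case the average prediction error equals the gap between the two base levels, which is the area-specific variation my model currently has no way of learning.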
The light rail was approved in 2000. The first construction phase was completed in 2010, the second in 2013 and the last in 2017. The light rail to Aasane is estimated to be completed in 2031.
My question is: is there any way to include the variation that is specific to the area I want to predict for (Aasane) in my model, so that the model can take it into account when predicting?
Best,
Kristoffer