Yes, the number of time units was set to 1. I am not sure whether dataset shift plays a role in this case. I am quite new to neural networks. I observed this behavior with linear regression, multilayer perceptron, and SMOreg. An example result is shown in the attached file below.
You need to change the value 1 to the number of time units that you intend to examine. The result obtained from the test set is the one that needs to be considered when evaluating the model; hence, make sure it shows a low RMSE.
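To make the test-set point concrete, here is a minimal Python sketch of computing RMSE on a held-out, chronologically later part of a series. It is only an illustration: the toy series, the helper names `rmse` and `chronological_split`, the 30% test fraction, and the "repeat the previous value" forecaster are all assumptions for the example, not the data or the tool settings discussed in this thread.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def chronological_split(series, test_fraction=0.3):
    """Keep time order: the last part of the series becomes the test set."""
    n_test = max(1, round(len(series) * test_fraction))
    cut = len(series) - n_test
    return series[:cut], series[cut:]


# Hypothetical toy series (not the data from the attached file).
series = [10.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0, 16.0, 15.0, 17.0]
train, test = chronological_split(series, test_fraction=0.3)

# Placeholder one-step-ahead forecaster: predict each test point as the
# previously observed value. A real model would be fitted on `train` only.
predictions = train[-1:] + test[:-1]

test_rmse = rmse(test, predictions)
print(f"test RMSE: {test_rmse:.4f}")
```

The number to report is `test_rmse`, since it is measured on points the model did not see during training. Comparing it against a simple baseline like the one above is one way to judge whether a value counts as "low" for a given series.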