Hi everyone,
I'm trying to fit data using nonlinear least squares, with a model of the following form:
y(t) = a1*X1(t) + a2*X2(t) + … + an*Xn(t) + b1*y(t-1) + … + bk*y(t-k)
where a1,…,an and b1,…,bk are the parameters I want to estimate from the training data. I have around 5000 sample points (X, Y) that vary over time.

The curve fits very well during training: my R-squared is 98%. The major issue is with validation. I validated the model with the same data I used for training. I know this is not the right way to validate a model, but I did it to check that I could reproduce the same R-squared on the same data. Instead, the R-squared drops drastically, by at least 20%, on that very same data.

The reason, as far as I can tell, is that during validation the error in past outputs is fed back: y(t) is computed from the predicted y(t-1), which already carries an error, whereas during training the feedback terms are the measured outputs and are therefore error-free. The error keeps growing and the R-squared drops. In other words, training is effectively open loop (one-step-ahead prediction), but validation is a real closed loop (free-run simulation).

Is there some way to accommodate this effect in the optimisation function during training, so that I get the same R-squared when I train and validate on the same data set?
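Just so my terminology is clear, here is a minimal sketch (Python/NumPy) of the two evaluation modes and of the kind of closed-loop objective I imagine. All the names in it (X, y, a, b, n, k, p0, predict_one_step, simulate_free_run, closed_loop_residuals) are placeholders I made up for illustration, not my actual code:

```python
import numpy as np
from scipy.optimize import least_squares

def predict_one_step(X, y, a, b):
    """Open loop: past *measured* outputs y(t-1)..y(t-k) feed back.
    This is what the least-squares objective sees during training."""
    k, T = len(b), len(y)
    y_hat = np.full(T, np.nan)          # first k points have no full history
    for t in range(k, T):
        y_hat[t] = X[t] @ a + y[t-k:t][::-1] @ b
    return y_hat

def simulate_free_run(X, y_init, a, b):
    """Closed loop: past *predicted* outputs feed back, so any error in
    y_hat(t-1) propagates into y_hat(t). This is what validation does."""
    k, T = len(b), X.shape[0]
    y_hat = np.empty(T)
    y_hat[:k] = y_init                  # seed with k known initial outputs
    for t in range(k, T):
        y_hat[t] = X[t] @ a + y_hat[t-k:t][::-1] @ b
    return y_hat

def closed_loop_residuals(params, X, y, n, k):
    """Residuals of the free-run simulation, so that training 'sees'
    the same closed loop as validation."""
    a, b = params[:n], params[n:]       # hypothetical packing: a1..an, b1..bk
    y_hat = simulate_free_run(X, y[:k], a, b)
    return y[k:] - y_hat[k:]

# e.g. fit = least_squares(closed_loop_residuals, p0, args=(X, y, n, k))
```

Is minimising the closed-loop residuals like this the right way to go? If I understand correctly, the free-run output is a nonlinear function of b1,…,bk (products of the b's appear as predictions are fed back), so this objective would genuinely need a nonlinear least-squares solver, unlike the one-step fit, which is linear in the parameters.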
Thanks!