I have about 10k subjects, each with 1000 observations at fixed time points (Y). I have built a model and made predictions for each subject (\hat{Y}). Now I would like to apply deep learning to reduce the gap between Y and \hat{Y} and so improve prediction performance. My idea is to use \hat{Y} as the input to a deep learning model that predicts the residuals Y - \hat{Y}.
In other words, for each subject the network's input is the 1000-dimensional \hat{Y} and its output is the 1000-dimensional residual Y - \hat{Y}; the corrected prediction would then be \hat{Y} plus the predicted residual.
I have searched for articles on how to do this but have not found anything useful. Does anyone have any suggestions?
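To make the setup concrete, here is a minimal sketch of what I have in mind, not a working solution. Everything in it is an assumption for illustration: the data are synthetic (2000 subjects and 20 time points instead of 10k and 1000), the bias added to \hat{Y} is made up, and the network is a single-hidden-layer MLP written in plain NumPy and trained with full-batch gradient descent. The names `predict_residual`, `W1`, `W2`, etc. are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): far smaller than the real 10k x 1000 problem.
n_subjects, n_time, n_hidden = 2000, 20, 64
Y_hat = rng.normal(size=(n_subjects, n_time))            # first-stage predictions
Y = Y_hat + 0.5 * np.tanh(Y_hat) + 0.1 * rng.normal(size=Y_hat.shape)  # made-up truth
R = Y - Y_hat                                            # residuals to be learned

# Hold out some subjects so the correction is judged on data it was not trained on.
n_train = 1600
X_tr, R_tr = Y_hat[:n_train], R[:n_train]
X_va, Y_va = Y_hat[n_train:], Y[n_train:]

# One-hidden-layer MLP: Y_hat (n_time values) -> residual (n_time values).
W1 = rng.normal(size=(n_time, n_hidden)) / np.sqrt(n_time)
b1 = np.zeros(n_hidden)
W2 = np.zeros((n_hidden, n_time))   # zero init: the correction starts at exactly 0
b2 = np.zeros(n_time)

def predict_residual(X):
    """Forward pass: predicted residual for each row (subject) of X."""
    return np.tanh(X @ W1 + b1) @ W2 + b2

lr = 0.05
for step in range(500):
    H = np.tanh(X_tr @ W1 + b1)                 # hidden activations
    # Gradient of 0.5 * (mean over subjects of summed squared error) w.r.t. the output.
    G = (H @ W2 + b2 - R_tr) / n_train
    dH = (G @ W2.T) * (1.0 - H ** 2)            # backprop through tanh
    gW2, gb2 = H.T @ G, G.sum(axis=0)
    gW1, gb1 = X_tr.T @ dH, dH.sum(axis=0)
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2

# Corrected prediction = original prediction + predicted residual.
Y_corrected = X_va + predict_residual(X_va)
mse_before = float(np.mean((Y_va - X_va) ** 2))
mse_after = float(np.mean((Y_va - Y_corrected) ** 2))
print(f"held-out MSE before: {mse_before:.4f}, after: {mse_after:.4f}")
```

With the real data I would presumably standardize \hat{Y} first and use a proper deep learning library with mini-batches, but the shape of the problem is the same: one 1000-dimensional input vector and one 1000-dimensional target vector per subject, with the comparison made on held-out subjects.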