I have a dataset of returns for the S&P 500 Index constituent stocks over a period, along with the index returns over the same period. It is a regression problem, and I am using a rolling window with 52 weeks of returns as the in-sample data and 12 weeks as the out-of-sample data. In the first experiment I fit various machine learning models on the in-sample data, forecast the out-of-sample period, and compare the results. In the second experiment I use Bayesian hyperparameter optimization to try to obtain a more accurate forecast. In a third experiment I would like to augment the in-sample data with synthetic observations drawn from its distribution and combine that with the optimized models to see whether forecast performance improves.
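
To make the setup concrete, here is a rough sketch of the rolling window and one possible form of the augmentation step. Everything in it is an assumption for illustration: the helper names `rolling_windows` and `augment_gaussian` are hypothetical, the window advances by 12 weeks each time, the features are taken to be the stock returns with the index return as the target, and the "distribution of the in-sample data" is approximated by a multivariate normal fitted to each window. The least-squares fit is only a stand-in for whichever model is being tested, and the data is random placeholder data, not real returns.

```python
import numpy as np

def rolling_windows(n_weeks, in_sample=52, out_sample=12, step=12):
    """Yield (train_idx, test_idx) index arrays for each rolling window."""
    start = 0
    while start + in_sample + out_sample <= n_weeks:
        train_idx = np.arange(start, start + in_sample)
        test_idx = np.arange(start + in_sample, start + in_sample + out_sample)
        yield train_idx, test_idx
        start += step

def augment_gaussian(X, y, n_new, rng):
    """Append n_new synthetic rows drawn from a multivariate normal
    fitted jointly to the in-sample features and target (an assumption)."""
    data = np.column_stack([X, y])
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    synth = rng.multivariate_normal(mean, cov, size=n_new)
    return np.vstack([X, synth[:, :-1]]), np.concatenate([y, synth[:, -1]])

# Random placeholder data: 116 weeks, 5 stocks (kept small on purpose).
rng = np.random.default_rng(0)
n_weeks, n_stocks = 116, 5
stock_returns = rng.normal(0.0, 0.02, size=(n_weeks, n_stocks))
index_returns = stock_returns.mean(axis=1) + rng.normal(0.0, 0.002, size=n_weeks)

for train_idx, test_idx in rolling_windows(n_weeks):
    X_tr, y_tr = stock_returns[train_idx], index_returns[train_idx]
    X_te, y_te = stock_returns[test_idx], index_returns[test_idx]

    # Augment using the in-sample window only, so nothing leaks from the test weeks.
    X_aug, y_aug = augment_gaussian(X_tr, y_tr, n_new=200, rng=rng)

    # Ordinary least squares with an intercept, as a stand-in for the real model.
    A = np.column_stack([np.ones(len(X_aug)), X_aug])
    coef, *_ = np.linalg.lstsq(A, y_aug, rcond=None)
    pred = np.column_stack([np.ones(len(X_te)), X_te]) @ coef

    rmse = np.sqrt(np.mean((pred - y_te) ** 2))
    print(f"weeks {test_idx[0]}-{test_idx[-1]}: out-of-sample RMSE = {rmse:.5f}")
```

With 52 weekly observations and several hundred stocks, the sample covariance would be rank-deficient, so a full-size version of this would probably need shrinkage, a factor model, or resampling of whole rows instead of a fitted Gaussian. Any hyperparameter search would also sit inside the loop and see only the in-sample window.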