I need to evaluate several deep learning models for a forecasting problem, and I have two questions:
First: how do I select the best configuration for a specific model?
Second: how do I compare the different models on the task at hand?
For the first: using the training and validation data, I train the model with checkpoints that save the weights only when the val_loss improves. Then I rebuild the model from the latest checkpoint and use it for prediction (roughly like the sketch below).
Is this a good practice?
Is there anything else I should consider?
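For concreteness, here is a minimal sketch of the workflow I mean, assuming Keras. The architecture, file name, and random data are all placeholders standing in for my real setup:

```python
import numpy as np
from tensorflow import keras

# Dummy data standing in for a real forecasting dataset:
# windows of 24 time steps, one feature each.
x_train, y_train = np.random.rand(200, 24, 1), np.random.rand(200, 1)
x_val, y_val = np.random.rand(50, 24, 1), np.random.rand(50, 1)

def build_model():
    # Placeholder architecture; substitute the actual forecasting model.
    model = keras.Sequential([
        keras.layers.LSTM(32, input_shape=(24, 1)),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

model = build_model()

# Save only the weights, and only when val_loss improves.
checkpoint = keras.callbacks.ModelCheckpoint(
    "best_weights.h5",
    monitor="val_loss",
    save_best_only=True,
    save_weights_only=True,
)

model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=10, callbacks=[checkpoint], verbose=0)

# Rebuild the same architecture and load the best weights for prediction.
best_model = build_model()
best_model.load_weights("best_weights.h5")
predictions = best_model.predict(x_val)
```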
For the second: I used the same grid search for all the models. However, I ran into two main problems:
I read about the randomness problem and followed the seed-setting solution suggested at https://opendatascience.com/properly-setting-the-random-seed-in-ml-experiments-not-as-simple-as-you-might-imagine/. But I am afraid that fixing the seed affects the weight initialization, and hence the model performance!
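For reference, the seed-setting from that article looks roughly like this, assuming TensorFlow 2; the seed value itself is arbitrary:

```python
import os
import random
import numpy as np
import tensorflow as tf

SEED = 42  # arbitrary fixed value

# Note: PYTHONHASHSEED only takes full effect if set before
# the Python process starts.
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)         # Python's built-in RNG
np.random.seed(SEED)      # NumPy RNG
tf.random.set_seed(SEED)  # TensorFlow ops, including weight initializers
```

As I understand it, this does not make the initialization worse, only reproducible: every run draws the same initial weights instead of different random ones.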
I am still not sure how to mitigate this. Is there any way, other than logging and plotting, to select the best candidate?
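One approach I have seen suggested is to repeat each configuration over several seeds and compare aggregate scores instead of single runs, so that one lucky or unlucky initialization cannot decide the ranking. A sketch, where `train_and_evaluate` is a hypothetical helper that trains one configuration under one seed and returns its best validation loss:

```python
import numpy as np

def train_and_evaluate(config, seed):
    # Hypothetical helper: set the seed, build and train the model
    # described by `config`, and return its best validation loss.
    ...

def compare_candidates(configs, seeds=(0, 1, 2, 3, 4)):
    results = {}
    for name, config in configs.items():
        # Repeat each configuration across several seeds.
        losses = [train_and_evaluate(config, seed) for seed in seeds]
        results[name] = (np.mean(losses), np.std(losses))
    # Rank by mean val_loss; the std shows sensitivity to the seed.
    for name, (mean, std) in sorted(results.items(), key=lambda kv: kv[1][0]):
        print(f"{name}: val_loss = {mean:.4f} +/- {std:.4f}")
    return results
```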
I am self-taught, so I apologize if these questions are silly or too basic.
Thanks for any help