I need to evaluate different deep learning models for a forecasting problem.

First: how do I select the best configuration for a specific model?

Second: how do I compare the different models for the task at hand?

For the first: using the training and validation data, I train the model with checkpoints (saving the weights only when the val_loss improves). Then I load the latest saved checkpoint into a new model for prediction.
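Concretely, this is roughly what I am doing (a Keras sketch; build_model, the data arrays, and the checkpoint path are just placeholders for my actual setup):

```python
import tensorflow as tf

# Placeholder builder: in my case it returns a compiled Keras model
def build_model():
    ...

# Save weights only when val_loss improves
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="best_weights.h5",       # placeholder path
    monitor="val_loss",
    save_best_only=True,
    save_weights_only=True,
)

model = build_model()
model.fit(
    x_train, y_train,                 # placeholder arrays
    validation_data=(x_val, y_val),
    epochs=100,
    callbacks=[checkpoint_cb],
)

# Rebuild the same architecture and load the best saved weights for prediction
best_model = build_model()
best_model.load_weights("best_weights.h5")
predictions = best_model.predict(x_test)
```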

Is this a good practice?

Is there something I should consider?

For the second: I used the same grid search for all the models (the loop is sketched below); however, I faced two main problems:

  • the randomness inherent in each model
  • the large number of hyperparameters I need to track
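Roughly, the search loop I run per model type looks like this (the builder, the data arrays, and the grid values are placeholders):

```python
import itertools

# Placeholder grid; the real one has many more entries, which is what makes tracking hard
param_grid = {
    "units": [32, 64],
    "learning_rate": [1e-3, 1e-4],
    "batch_size": [16, 32],
}

results = []
keys = list(param_grid)
for values in itertools.product(*(param_grid[k] for k in keys)):
    config = dict(zip(keys, values))
    model = build_model(units=config["units"],
                        learning_rate=config["learning_rate"])  # placeholder builder
    history = model.fit(
        x_train, y_train,                                       # placeholder arrays
        validation_data=(x_val, y_val),
        epochs=50,
        batch_size=config["batch_size"],
        verbose=0,
    )
    results.append({**config, "val_loss": min(history.history["val_loss"])})

# Lowest validation loss so far = current best candidate for this model type
best_config = min(results, key=lambda r: r["val_loss"])
```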
I read about the randomness problem and followed the seed solution suggested at https://opendatascience.com/properly-setting-the-random-seed-in-ml-experiments-not-as-simple-as-you-might-imagine/. But I am afraid that fixing the seed affects the weight initialization and hence the model performance!

I am still not sure how to mitigate this. Is there any way, other than logging and plotting, to select the best candidate?
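For reference, this is roughly the seed setup I ended up with after reading that article (TensorFlow 2 calls; older versions use tf.set_random_seed, and the seed value itself is arbitrary):

```python
import os
import random

import numpy as np
import tensorflow as tf

SEED = 42  # arbitrary fixed value

os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)         # Python's built-in RNG
np.random.seed(SEED)      # NumPy (data shuffling, some initialisation helpers)
tf.random.set_seed(SEED)  # TensorFlow/Keras ops, including weight initializers
```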

I am self-taught, so I am sorry if these questions are silly or too basic.

Thanks for any help.
