It is fairly disheartening to see unpredictable held-out set metrics on my task early in training. Is large, random variation in the early performance of CNNs with randomly initialised weights a sign that they are poorly suited to the task? If left to train for longer, would the randomly initialised networks converge?
