I was wondering whether training a neural network in the deep ensemble setting can yield a posterior over the network's parameters rather than a single point estimate.

Recently there have been discussions about whether deep ensembles can be interpreted as Bayesian models. This led me to wonder: can we actually learn a posterior by the end of training in such a scenario?
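
To make the question concrete, here is a minimal sketch of what I mean by the deep ensemble setting. It assumes a toy 1-D regression task, a tiny one-hidden-layer network, and plain gradient descent; the names `train_member`, `predict`, and `ensemble_predict` are hypothetical and only for illustration. Each member is still a point estimate, and the ensemble is just an equally weighted collection of them, which is what one would be treating as approximate posterior samples.

```python
import numpy as np

def train_member(x, y, seed, hidden=16, steps=500, lr=0.02):
    """Train one tiny MLP (tanh hidden layer) with full-batch gradient descent on MSE."""
    rng = np.random.default_rng(seed)  # a different seed gives a different initialization
    w1 = rng.normal(0.0, 1.0, (1, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1))
    b2 = np.zeros(1)
    n = len(x)
    for _ in range(steps):
        h = np.tanh(x @ w1 + b1)        # hidden activations, shape (n, hidden)
        pred = h @ w2 + b2              # network output, shape (n, 1)
        g_pred = 2.0 * (pred - y) / n   # gradient of the mean squared error w.r.t. pred
        g_w2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ w2.T) * (1.0 - h ** 2)  # backprop through tanh
        g_w1 = x.T @ g_z
        g_b1 = g_z.sum(axis=0)
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return w1, b1, w2, b2

def predict(params, x):
    """Forward pass of a single ensemble member."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2

def ensemble_predict(members, x):
    """Average the members' predictions; their spread is a rough uncertainty proxy."""
    preds = np.stack([predict(p, x) for p in members])  # shape (M, n, 1)
    return preds.mean(axis=0), preds.var(axis=0)

# Toy data (an assumption for illustration): noisy sine on [-1, 1].
data_rng = np.random.default_rng(0)
x_train = data_rng.uniform(-1.0, 1.0, (64, 1))
y_train = np.sin(3.0 * x_train) + 0.1 * data_rng.normal(size=(64, 1))

# Five members, each trained independently from its own random initialization.
members = [train_member(x_train, y_train, seed=s) for s in range(1, 6)]

x_test = np.linspace(-2.0, 2.0, 9).reshape(-1, 1)
mean, var = ensemble_predict(members, x_test)
print(np.hstack([x_test, mean, var]))  # columns: input, ensemble mean, ensemble variance
```

What comes out is a finite set of weight vectors and a predictive mean and variance computed from them, not an explicit density over the weights. My question is whether that set can legitimately be read as a learned posterior.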
