There are two ways to derive the Boltzmann (exponential) probability distribution of an ensemble:
1) Microcanonical ensemble: We assume a system S(E,V,N), where
E = internal energy, V = volume, N = number of molecules or entities.
The molecules can occupy different energy states, but the total energy E of the system is fixed. So whatever the distribution of molecules over the energy levels, the energy of the overall system stays the same. We then maximize the entropy of the system to find the equilibrium probability distribution of molecules over the energy levels. We introduce two Lagrange multipliers for the two constraints: the total probability is unity and the total energy is the constant E. What we get is an exponential distribution.
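For concreteness, the maximization described above can be sketched as follows. The symbols are my own notation and not taken from any particular text: p_j is the probability that a molecule is in level j with energy ε_j, α and β are the two Lagrange multipliers, k is Boltzmann's constant, and q is the normalization sum. I am also assuming the entropy has the form S = -Nk Σ p_j ln p_j.

```latex
% Maximize the entropy per molecule subject to the two constraints
\frac{S}{Nk} = -\sum_j p_j \ln p_j ,
\qquad \sum_j p_j = 1 ,
\qquad N \sum_j p_j \varepsilon_j = E

% Stationarity with multipliers \alpha (normalization) and \beta (energy)
\frac{\partial}{\partial p_j}
\left[ -\sum_i p_i \ln p_i
       - \alpha \Big( \sum_i p_i - 1 \Big)
       - \beta \Big( \sum_i p_i \varepsilon_i - \frac{E}{N} \Big) \right]
= -\ln p_j - 1 - \alpha - \beta \varepsilon_j = 0

% Solve for p_j and fix \alpha through normalization
p_j = e^{-1-\alpha} \, e^{-\beta \varepsilon_j}
\quad \Longrightarrow \quad
p_j = \frac{e^{-\beta \varepsilon_j}}{q} ,
\qquad q = \sum_j e^{-\beta \varepsilon_j}
```

Here β is fixed by the energy constraint; identifying it with 1/kT requires an additional thermodynamic argument that is not shown in this sketch.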
2) Canonical ensemble: We have a system with N molecules. The Helmholtz free energy is F = F(T,V,N), so this time the energy is not fixed but the temperature is. Instead of different energy states for the individual molecules, we now consider the different energy levels that the entire system can occupy. Minimizing F gives the equilibrium probability distribution of the system over these energy levels. This time the only constraint is that the total probability is unity. The distribution we get is again exponential.
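As a sketch of this second route, again in my own notation: P_i is the probability that the whole system is in a state with total energy E_i, λ is the single Lagrange multiplier, k is Boltzmann's constant, and Z is the normalization sum. I am assuming F = ⟨E⟩ - TS with S = -k Σ P_i ln P_i.

```latex
% Helmholtz free energy written in terms of the system-level probabilities
F = \langle E \rangle - TS
  = \sum_i P_i E_i + kT \sum_i P_i \ln P_i ,
\qquad \sum_i P_i = 1

% Stationarity with a single multiplier \lambda (normalization only)
\frac{\partial}{\partial P_i}
\left[ F - \lambda \Big( \sum_j P_j - 1 \Big) \right]
= E_i + kT \left( \ln P_i + 1 \right) - \lambda = 0

% Solve for P_i and fix \lambda through normalization
P_i = e^{(\lambda - kT)/kT} \, e^{-E_i / kT}
\quad \Longrightarrow \quad
P_i = \frac{e^{-E_i / kT}}{Z} ,
\qquad Z = \sum_i e^{-E_i / kT}
```

Note that E_i here is an energy of the entire N-molecule system, whereas the energies in the first derivation are those of a single molecule.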
Now the question is:
How can the probability distribution from the canonical ensemble give the population distribution of molecules over different energy states, which is what the microcanonical treatment gives?
In the book Molecular Driving Forces (Ken A. Dill), Chapter 10, Equation 10.11 says something similar.