Fitting a model means finding values for the model coefficients that are in some sense "good". Usually, "good" is measured by the likelihood of the observed data under the assumed model. For independent data, the likelihood is simply the product of the conditional probabilities of the observations under the assumed model. These probabilities come from the stochastic part of the model, i.e. from the assigned probability distribution. Many common distributions can be derived from very simple considerations and the basic probability laws. In this way we can derive the distributions of binomial outcomes, counts, proportions, concentrations, rates, etc. The normal distribution follows from the assumption of independent deviations of values from a common center, when no other information is available.
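To make "likelihood as a product of probabilities" concrete, here is a minimal sketch with a small hypothetical count dataset: we evaluate the Poisson likelihood over a grid of candidate means and see that it peaks at the sample mean (all data values and the grid are made up for illustration).

```python
import math

# Hypothetical count data, e.g. events observed in five equal intervals.
counts = [2, 4, 3, 5, 1]

def poisson_pmf(k, lam):
    """P(K = k) under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def likelihood(lam, data):
    """For independent observations, the likelihood is the product
    of the individual probabilities under the assumed model."""
    prod = 1.0
    for k in data:
        prod *= poisson_pmf(k, lam)
    return prod

# Evaluate the likelihood on a grid of candidate means (1.00 to 5.00).
grid = [i / 100 for i in range(100, 501)]
best = max(grid, key=lambda lam: likelihood(lam, counts))
sample_mean = sum(counts) / len(counts)
print(best, sample_mean)  # the grid maximum sits at the sample mean, 3.0
```

For the Poisson model the maximum-likelihood estimate of the mean is exactly the sample mean, which is what the grid search recovers.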
If we know that the data are counts, the conditional probability of the data (and hence the likelihood) should be determined using the Poisson distribution. This is actually the only difference. All other differences that are sometimes discussed are merely technical details irrelevant to the problem. For instance, when the normal probability model is used, the negative log-likelihood is, up to constants, proportional to the residual sum of squares, so the "best" values of the coefficients can be determined as "minimum variance estimates" (= "least-squares estimates") instead of determining the coefficient values for which the likelihood takes its maximum ("maximum likelihood estimates").
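The equivalence of the two criteria under the normal model can be checked numerically. In this sketch (with hypothetical data and candidate means), the candidate that minimizes the normal negative log-likelihood is the same one that minimizes the residual sum of squares:

```python
import math

# Hypothetical observations and candidate values for the model's mean.
data = [1.2, 0.8, 1.5, 1.1, 0.9]
candidates = [0.9, 1.0, 1.1, 1.2]
sigma = 1.0  # assumed known standard deviation

def rss(mu, xs):
    """Residual sum of squares around the candidate mean."""
    return sum((x - mu) ** 2 for x in xs)

def neg_log_likelihood(mu, xs, sigma):
    """Negative log-likelihood under a normal model: a constant
    plus RSS / (2 sigma^2), so minimizing it minimizes the RSS."""
    n = len(xs)
    const = n * math.log(sigma * math.sqrt(2 * math.pi))
    return const + rss(mu, xs) / (2 * sigma**2)

best_ml = min(candidates, key=lambda mu: neg_log_likelihood(mu, data, sigma))
best_ls = min(candidates, key=lambda mu: rss(mu, data))
print(best_ml, best_ls)  # both pick the same candidate (the sample mean, 1.1)
```

This is why least-squares estimates and maximum-likelihood estimates coincide under the normal model; for the Poisson model they generally do not.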
Simply because the dependent variable is a count, we look to model it through a count distribution like the Poisson. Of course, other considerations may point to other distributions, such as the negative binomial, the multinomial, etc.
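One such consideration is overdispersion: a Poisson model forces the variance to equal the mean, so if the sample variance is much larger than the sample mean, a negative binomial model is a better candidate. A minimal check, on a hypothetical dataset chosen to be overdispersed:

```python
# Hypothetical count data with more spread than a Poisson model allows.
counts = [0, 0, 1, 2, 2, 3, 7, 9, 12, 14]

n = len(counts)
mean = sum(counts) / n
var = sum((k - mean) ** 2 for k in counts) / (n - 1)  # sample variance

print(mean, var)
# Under a Poisson model, variance == mean. Here the sample variance is
# several times the mean, which points toward a negative binomial model.
overdispersed = var > 2 * mean
print(overdispersed)
```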
The Poisson distribution represents counts, and is therefore preferable to normal regression for count data. However, for large values of lambda, the normal distribution is a good approximation of the Poisson model.