A two-way ANOVA is usually based on a linear model of the following form:
y = mu + ax1 + bx2 + e
in which x1 and x2 are two qualitative explanatory variables (factors). Such a linear model is based on the usual assumptions of linearity, homoscedasticity, independence and normality of the errors.
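To make the model concrete, here is a minimal sketch of how the terms of the additive model can be estimated for a balanced design. It uses only the Python standard library. The function name `two_way_additive_anova` and the data are made up for illustration, and the shortcut of estimating each effect as "level mean minus grand mean" is only valid when every combination of levels has the same number of observations.

```python
from collections import Counter, defaultdict


def two_way_additive_anova(rows):
    """Fit y = mu + a[x1] + b[x2] + e to a balanced design.

    rows is a list of (level_of_x1, level_of_x2, y) tuples.
    (Hypothetical helper, not taken from any library.)
    """
    n = len(rows)
    grand = sum(y for _, _, y in rows) / n

    def level_means(index):
        groups = defaultdict(list)
        for row in rows:
            groups[row[index]].append(row[2])
        return {level: sum(v) / len(v) for level, v in groups.items()}

    mean1, mean2 = level_means(0), level_means(1)

    # The shortcut below assumes a balanced, fully crossed design.
    cells = Counter((l1, l2) for l1, l2, _ in rows)
    if len(set(cells.values())) != 1 or len(cells) != len(mean1) * len(mean2):
        raise ValueError("design must be balanced and fully crossed")

    # Effects as deviations of each level mean from the grand mean.
    a = {level: m - grand for level, m in mean1.items()}
    b = {level: m - grand for level, m in mean2.items()}

    # Sums of squares for each factor and for the residuals.
    ss1 = sum(a[l1] ** 2 for l1, _, _ in rows)
    ss2 = sum(b[l2] ** 2 for _, l2, _ in rows)
    ss_res = sum((y - grand - a[l1] - b[l2]) ** 2 for l1, l2, y in rows)

    df1, df2 = len(a) - 1, len(b) - 1
    df_res = n - 1 - df1 - df2
    ms_res = ss_res / df_res
    return {
        "mu": grand,
        "a": a,
        "b": b,
        "ss": (ss1, ss2, ss_res),
        "df": (df1, df2, df_res),
        "F": ((ss1 / df1) / ms_res, (ss2 / df2) / ms_res),
    }


# Made-up data: 2 x 2 design with two replicates per cell.
data = [
    ("low", "A", 10.0), ("low", "A", 12.0),
    ("low", "B", 14.0), ("low", "B", 16.0),
    ("high", "A", 20.0), ("high", "A", 22.0),
    ("high", "B", 25.0), ("high", "B", 25.0),
]
result = two_way_additive_anova(data)
print(result["ss"], result["df"], result["F"])
```

Each F ratio would then be compared with an F distribution on the corresponding degrees of freedom. That comparison is exactly where the assumptions listed above come into play.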
You could do some of the following:
1) apply the linear model to the data as they are, relying on the "robustness" of the linear model
2) transform the data to make them better suited to the assumptions of the linear model (e.g. a Box-Cox transformation)
3) relax the assumptions of the linear model and try a generalised linear model (with the appropriate link function), or a thoroughly non-linear modelling approach
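As an illustration of option 2, here is a rough sketch of the Box-Cox transformation with a grid search for the exponent lambda, using only the Python standard library. The function names, the data and the grid are assumptions made for the example. The likelihood is also simplified to a single sample with a constant mean. For an ANOVA you would normally compute it from the residuals of the fitted linear model instead.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of strictly positive values."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        # The limit of (y**lam - 1) / lam as lam -> 0 is log(y).
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def profile_loglik(y, lam):
    """Profile log-likelihood of lam, assuming a constant-mean normal model."""
    n = len(y)
    z = box_cox(y, lam)
    mean_z = sum(z) / n
    var_z = sum((v - mean_z) ** 2 for v in z) / n
    # Second term is the log-Jacobian of the transformation.
    return -0.5 * n * math.log(var_z) + (lam - 1.0) * sum(math.log(v) for v in y)


def best_lambda(y, grid):
    """Return the grid value with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(y, lam))


# Made-up, right-skewed data whose logarithms are symmetric around zero.
y = [math.exp(t) for t in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
lam_hat = best_lambda(y, grid)
transformed = box_cox(y, lam_hat)
print(lam_hat)
```

A coarse grid is often enough in practice, since "nice" values such as 0 (log), 0.5 (square root) or 1 (no transformation) are easier to interpret than a finely tuned exponent.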
Never rely on the "robustness" of the linear model.
Interestingly, simulations show that heteroscedasticity is the much bigger problem and often a by-product of non-normality. Transformation often does the job, as you discovered, as does a generalized linear model. I recommend familiarizing yourself with the latter because it's likely to become the norm in non-statistical fields over the next 10 years.