Hello,

I am running a fixed effects analysis on the 650 UK constituencies across 4 general election years (2010, 2015, 2017 and 2019). My outcome variable is Green Party vote share and my independent variable is PM10 particle count (a proxy for air pollution). I am using the percentage of residents with high education levels and median income as controls.

The fixed effects model I estimated turned out to be statistically insignificant. I then checked the distribution of the outcome variable and found that it is skewed (more values close to zero than not). I removed the rows with zero votes while keeping the rows for the other, non-zero years of the same constituency, so I now have an unbalanced panel.
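A minimal sketch of the skewness check described above, assuming made-up vote shares rather than the real constituency data. The function name `sample_skewness` and the values in `shares` are hypothetical and only illustrate how dropping zeros and taking logs changes the shape of the distribution.

```python
import math

def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical Green Party vote shares (%), not real data.
shares = [0.0, 0.0, 1.2, 1.5, 1.8, 2.1, 2.4, 3.0, 3.6, 4.5, 8.0, 25.0]

# Drop zero-vote rows first, because log(0) is undefined.
nonzero = [s for s in shares if s > 0]
log_shares = [math.log(s) for s in nonzero]

skew_raw = sample_skewness(nonzero)
skew_log = sample_skewness(log_shares)
print(f"skewness of raw shares: {skew_raw:.2f}")
print(f"skewness of log shares: {skew_log:.2f}")
```

A positive value indicates a long right tail. With these illustrative numbers the log-transformed shares are less skewed than the raw ones, but still not symmetric.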

I asked my supervisor whether log-transforming the outcome variable would be useful, since the model yields statistically significant results when I do, but he said it would be unnecessary. He suggested instead removing outliers (constituencies with an unusually high Green Party vote share) and reporting those results alongside the main model.
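To make the two specifications concrete, here is a minimal sketch of a fixed effects (within) estimator with a single regressor, applied once to the vote share in levels and once to its log. The data, the function name `within_slope`, and the omission of the education and income controls are all assumptions for illustration. A real analysis would use a panel estimation package and report standard errors.

```python
import math
from collections import defaultdict

def within_slope(rows):
    """One-regressor fixed effects (within) estimator.

    rows: iterable of (unit, x, y). Each variable is demeaned by unit,
    then the slope is sum(x~ * y~) / sum(x~ ** 2). Unbalanced panels are
    handled because each unit is demeaned over its own observed rows.
    """
    by_unit = defaultdict(list)
    for unit, x, y in rows:
        by_unit[unit].append((x, y))
    sxy = sxx = 0.0
    for obs in by_unit.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx

# Hypothetical (constituency, PM10, Green vote share %) rows, not real data.
# Constituency "B" has only three rows, mimicking a dropped zero-vote year.
panel = [
    ("A", 10.0, 1.0), ("A", 12.0, 1.5), ("A", 14.0, 2.4), ("A", 16.0, 3.5),
    ("B", 20.0, 2.0), ("B", 22.0, 3.1), ("B", 25.0, 6.0),
    ("C", 8.0, 5.0), ("C", 9.0, 5.8), ("C", 11.0, 7.9), ("C", 12.0, 9.0),
]

beta_level = within_slope(panel)
beta_log = within_slope([(u, x, math.log(y)) for u, x, y in panel])

# Level model: change in percentage points per unit of PM10.
print(f"level outcome: {beta_level:.3f} percentage points per unit of PM10")
# Log model: proportional change, converted with exp(beta) - 1.
print(f"log outcome:   {100 * math.expm1(beta_log):.1f}% per unit of PM10")
```

The two coefficients answer different questions: the level model measures an absolute change in vote share, while the log model measures a proportional change, so a switch in significance between them can reflect the change of functional form as much as the skewness itself.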

I am still confused, because a similar study by Ingmar Schumacher (2014), "An Empirical Study of the Determinants of Green Party Voting", log-transformed the outcome variable, reasoning that the bounded nature of the variable causes no trouble. Since I removed the zero-vote rows, I think the same approach could still work for my data.

Could you please explain why a log transformation is not necessary, and whether there is a better way to address the skewness?

Thanks in advance.
