I am conducting an econometric study on the impact of renewable and non-renewable energy consumption on green economic growth in 7 selected countries during the period 1990–2024.
The challenge is that, for 3 Arab Gulf countries (UAE, Saudi Arabia, Qatar), renewable energy consumption values are zero from 1990 to 2008, while positive values appear only from 2009 onwards. For the other 4 countries in my sample, renewable energy data is available for the full period without this issue.
My supervisor suggests that such long sequences of zero values should not be included in the model. However, I believe they reflect the historical reality, since renewable energy was not in use during that period.
I would greatly appreciate your advice on the following questions:
Should I keep the zero values and estimate the model for the full period (1990–2024), even if the coefficients for renewable energy may turn out insignificant?
Or should I restrict the analysis to the shorter period (2009–2024) for the Gulf countries?
Are there recommended econometric approaches to handle this type of data structure, such as log(1+x) transformations, dynamic panel ARDL, or even zero-inflated models?
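To make the transformation question concrete, here is a minimal sketch of my concern with log(1+x) and the inverse hyperbolic sine when the series is zero until 2009. The numbers are hypothetical (not my actual data), and the names `renew_twh`, `transform`, and `jump_at_adoption` are made up for illustration. It only shows that the size of the jump from zero to the first positive value depends on the measurement unit, which, as I understand it, is the scale-dependence problem discussed by Chen & Roth.

```python
import math

# Hypothetical renewable consumption series (illustrative numbers, NOT real data):
# zero for 1990-2008, positive from 2009 onwards, measured in TWh.
years = list(range(1990, 2025))
renew_twh = [0.0 if y < 2009 else 0.05 * (y - 2008) for y in years]


def transform(series, scale, func):
    """Convert units by multiplying with `scale`, then apply `func` element-wise."""
    return [func(x * scale) for x in series]


def jump_at_adoption(transformed, years, first_year=2009):
    """Change in the transformed variable between the last zero year
    and the first year with a positive value."""
    i = years.index(first_year)
    return transformed[i] - transformed[i - 1]


# Compare the same series expressed in TWh and in GWh (1 TWh = 1000 GWh).
jumps = {}
for label, scale in [("TWh", 1.0), ("GWh", 1000.0)]:
    jumps[label] = {
        "log1p": jump_at_adoption(transform(renew_twh, scale, math.log1p), years),
        "asinh": jump_at_adoption(transform(renew_twh, scale, math.asinh), years),
    }
    print(
        f"{label}: log(1+x) jump = {jumps[label]['log1p']:.3f}, "
        f"asinh(x) jump = {jumps[label]['asinh']:.3f}"
    )
```

If the jump at adoption changes this much with a simple change of units, I am unsure how to interpret the estimated coefficient on the transformed renewable energy variable for the Gulf countries.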
👉 Important: I am particularly looking for answers supported with evidence, references, or examples from published studies that have dealt with similar cases (for instance, renewable energy adoption in GCC countries or other regions where renewable energy data starts late and shows long zero periods).
References I have found so far include:
Chen & Roth (2023), Logs with zeros? Some problems and solutions (arXiv).
Bellemare & Wichman (2020), Elasticities and the inverse hyperbolic sine transformation, Oxford Bulletin of Economics and Statistics.
Santos Silva & Tenreyro (2006), The log of gravity, Review of Economics and Statistics.
Any suggestions with proper references would be highly valuable for my work.
Thank you very much for your time and guidance.