I am performing a cross-country regression analysis with a sample of 101 countries. Most of my variables are averages of annual data across a period of 7 years. Every one of my primary variables has data available in each of these 7 years. However, certain countries have data missing in certain years for variables used in my robustness checks.

How should I handle this missing data for each robustness variable? Here are a few ideas I have considered:

A. Average data for each country, regardless of missing years

B. Exclude any country that has any missing years for that variable

C. Exclude countries whose missing data exceed a certain threshold, for example countries missing more than 2 or 3 of the 7 years being averaged for that regressor

D. Only use robustness variables that have available data for every country in every year that is being averaged
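
To make the difference between these options concrete, here is a minimal sketch in plain Python. The country codes, the values, and the helper names (`available_mean`, `n_missing`, `average_with_threshold`) are all hypothetical and only illustrate the mechanics; this is not my actual data. A missing year is marked with `None`.

```python
from statistics import mean

N_YEARS = 7

# Hypothetical toy data for ONE robustness variable:
# 7 annual values per country, None marks a missing year.
panel = {
    "AAA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],        # complete
    "BBB": [2.0, None, 4.0, None, 6.0, 8.0, 10.0],     # 2 years missing
    "CCC": [None, None, None, None, 3.0, 5.0, None],   # 5 years missing
}

def n_missing(values):
    """Number of missing years for one country."""
    return sum(v is None for v in values)

def available_mean(values):
    """Mean over the non-missing years only."""
    return mean(v for v in values if v is not None)

def average_with_threshold(panel, max_missing):
    """Average each country over its available years, keeping only
    countries with at most `max_missing` missing years."""
    return {
        country: available_mean(values)
        for country, values in panel.items()
        if n_missing(values) <= max_missing
    }

# Option A: keep every country that has at least one observed year.
option_a = average_with_threshold(panel, max_missing=N_YEARS - 1)

# Option B: keep only countries with no missing years.
option_b = average_with_threshold(panel, max_missing=0)

# Option C: keep countries missing at most 2 of the 7 years.
option_c = average_with_threshold(panel, max_missing=2)

# Option D: the variable is usable only if no country has a missing year.
option_d_usable = all(n_missing(v) == 0 for v in panel.values())

print(option_a)
print(option_b)
print(option_c)
print(option_d_usable)
```

In this sketch, options A, B, and C differ only in the `max_missing` threshold, so the trade-off is between sample size and how many years each country's average is based on. Option D is a check on the variable itself rather than a filter on countries.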

Please offer the best solution and any other solutions that would be acceptable.
