Hello everyone:

I have a few questions about the choices I have to make with unbalanced panel data. I prepared this question for Statalist.org, so it is formatted in the way Stata users will be most familiar with (I am using Stata/MP 13 on a MacBook Pro).

I am dealing with a large dataset of 181 cases and 44 time periods (n=7486). When I run the xtset command to designate the panel and time variables, Stata reports that the panel is unbalanced. I know the data are unbalanced because my independent variables have randomly missing values. I now face a number of options and do not know how to choose among them.

1. I have read that panel-corrected standard errors are recommended for panel data because they are more reliable (Beck & Katz 1995)*. However, when I run my model with the xtpcse command I get the following error: "Number of gaps in sample: 70. No time periods are common to all panels, cannot estimate disturbance covariance matrix using casewise inclusion." I know what this means, but I don't know what to do about it. I have tried the pairwise option, which lets the model run successfully, but I don't know what estimation problems this may cause. I have also repeated the pairwise approach after removing all cases with fewer than 5 observations, but I am still not sure what the problems with this approach may be. If the pairwise approach is acceptable, what is the minimum number of observations per case, and do these observations need to be consecutive, e.g. 2001, 2002, 2003, 2004 as opposed to 2000, 2005, 2007, 2010?
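To check that I have understood the difference between the two inclusion rules, here is a minimal sketch in Python (not Stata) using three made-up cases. The case IDs, years, and function names are all hypothetical and are not taken from my dataset. As I understand it, casewise inclusion needs periods that are observed in every panel, while pairwise inclusion uses whatever periods each pair of panels happens to share.

```python
from itertools import combinations

# Hypothetical panels: each made-up case ID maps to the years with complete data.
observed = {
    "A": {2000, 2001, 2002, 2003},
    "B": {2002, 2003, 2004, 2005},
    "C": {2004, 2005, 2006, 2007},
}

def casewise_periods(observed):
    """Return the periods observed in every panel (needed for casewise inclusion)."""
    return set.intersection(*observed.values())

def pairwise_overlap(observed):
    """Return the number of periods shared by each pair of panels."""
    return {
        (a, b): len(observed[a] & observed[b])
        for a, b in combinations(sorted(observed), 2)
    }

common = casewise_periods(observed)
overlap = pairwise_overlap(observed)

print("Periods common to all panels:", sorted(common))
for pair, n_shared in overlap.items():
    print(pair, "share", n_shared, "periods")
```

In this toy example no year is common to all three cases, which mirrors the error message I get, and cases A and C share no years at all. That second situation is what worries me about the pairwise approach: some pairs of cases may contribute very few (or no) overlapping periods.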

2. The second option I have tried is the xtreg command. I am familiar with xtreg and the choice between fixed-effects and random-effects models, but I am not sure whether the unbalanced dataset causes problems here as well. My question is: which approach is better, xtpcse or xtreg, and why?

*Beck, N., & Katz, J. N. (1995). What to do (and not to do) with time-series cross-section data. American Political Science Review, 89(3), 634-647.

If you are interested in seeing the Stata output and a sample of my dataset, please follow the link below to Statalist:

https://www.statalist.org/forums/forum/general-stata-discussion/general/1555494-issues-with-unbalanced-panel-data
