I have a count-data model with panel data and would like to decide between fixed and random effects. But the value of the Hausman test statistic is negative (p-value = 1). How is this possible? Might it be due to the presence of outliers?
If I understand correctly, your problem is that the test statistic is negative, even though it is supposed to follow a chi-squared distribution, whose support is the non-negative reals.
If this is the case, the answer is simple: reverse the order of the two models in the command.
Thanks a lot, Attila! I have tried your suggestion, and the statistic is now positive. However, I'm not sure whether that is technically correct. According to the Stata help: "The order of computing the two estimators may be reversed. You have to be careful, though, to specify to hausman the models in the order "always consistent" first and "efficient under H0" second. It is possible to skip storing the second model and refer to the last estimation results by a period (.)."
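For what it's worth, a minimal sketch of the ordering the help file describes (y, x1, x2, panelvar, timevar, and the stored names fe_est and re_est are all placeholders):

    xtset panelvar timevar
    xtreg y x1 x2, fe
    estimates store fe_est     // the "always consistent" model, listed first
    xtreg y x1 x2, re
    estimates store re_est     // the model "efficient under H0", listed second
    hausman fe_est re_est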
According to the Hausman formula, the only way the test statistic can turn out negative is if the parameter estimate b1 has a larger variance than b0.
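In symbols, writing b0 for the always-consistent estimator and b1 for the one that is efficient under H0, the statistic is

    H = (\hat{b}_0 - \hat{b}_1)' \left[ \widehat{\mathrm{Var}}(\hat{b}_0) - \widehat{\mathrm{Var}}(\hat{b}_1) \right]^{-1} (\hat{b}_0 - \hat{b}_1),

which is a proper chi-squared quadratic form only when the estimated variance difference is positive semi-definite; when it is not, H can come out negative.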
Now the H0 of the whole test is that both b0 and b1 are consistent, with the alternative that b1 is not. You run this test when you have one estimator (b0; I am guessing that was the FE) which you know to be consistent, and another (b1, the RE) whose consistency you are not entirely sure about, but which is more efficient, so you would be better off using it if it were OK.
In your case the problem is that you are testing the consistency of b1 when, in your sample, b0 is not just consistent but also more efficient: it has the smaller variance, which is why Var(b0) - Var(b1) turned out to be negative.
With FGLS estimates, RE does not necessarily have the smaller variance, so such things may happen (although I would have expected a missing rather than a negative value for the statistic)...
Another thing is that, if your sample is rather small (and in particular the time dimension is short), the FE variance might be underestimated: the default in Stata for "xtreg" is the asymptotic estimate; you could try variance estimates with a small-sample correction instead.
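Relatedly, Stata's hausman command has sigmamore and sigmaless options that base both covariance matrices on a common estimate of the error variance, which in practice avoids the negative-statistic problem. A sketch (same placeholder names as above, using the period to refer to the last estimation results, as in the quoted help):

    xtreg y x1 x2, fe
    estimates store fe_est
    xtreg y x1 x2, re
    hausman fe_est ., sigmamore    // "." refers to the last estimation results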
The covariance-matrix difference in the Hausman test is only positive semi-definite under the null, and the test does not necessarily have the obvious degrees of freedom.

Take a simple example: adding an additional variable to an OLS regression. The initial coefficients are consistent and efficient if the coefficient on the additional variable is 0; they remain consistent as long as the new variable is orthogonal to the original variables. Under the null of a zero coefficient, the difference between the covariance matrix with the additional variable and the covariance matrix without it is positive semi-definite, but if, in fact, the additional variable has a lot of explanatory power, the standard errors on the original coefficients can come down.

Moreover, as this example shows, the difference matrix can be singular, so Hausman tests often have fewer degrees of freedom than the number of parameters, the matrix being positive semi-definite rather than positive definite even under the null. If memory serves, Hausman discusses this for IV in his original article. I think Paul Ruud had a good review article sometime in the early to mid 1980s, and the general issue is discussed in Davidson and MacKinnon's text.
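To make that OLS example concrete, here is a small simulation sketch (all variable names hypothetical): z is orthogonal to x by construction, so the short regression of y on x alone is still consistent, but because z has a lot of explanatory power, adding it shrinks the residual variance and the standard error on x falls.

    clear
    set seed 12345
    set obs 500
    gen x = rnormal()
    gen z = rnormal()                  // orthogonal to x by construction
    gen y = 1 + 2*x + 3*z + rnormal()
    regress y x                        // omits z: still consistent for the coefficient on x
    regress y x z                      // z soaks up residual variance, so se(x) typically falls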
I have this problem when running the Hausman test. I started with 5 independent variables, but the fixed effects model omitted one variable that is present in the random effects model, and that omitted variable is one of the most important among my independent variables. The following capture shows what I obtained from the Hausman test, and I am confused by it: the chi-squared statistic is negative. How can I interpret this? What is wrong, and how can I overcome this problem?
Pravesh Raghoo, what to do next depends on the purpose for which you are using the Hausman test. If it is for IV or endogeneity, you can estimate the first stage and obtain the residuals, then estimate the original regression with the first-stage residuals included as an additional regressor, and test the null that the coefficient on the residuals is zero.
The idea behind doing this is that if an explanatory variable is exogenous, then the first-stage residuals should have no predictive power for y in the original regression, because they are just noise in the determination of the exogenous variable.
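A sketch of that control-function procedure in Stata, with hypothetical names (y the outcome, x the suspect regressor, z the instrument, w an exogenous control):

    regress x z w               // first stage
    predict vhat, residuals     // first-stage residuals
    regress y x w vhat          // original regression augmented with the residuals
    test vhat                   // H0: coefficient on vhat = 0, i.e. x is exogenous

Rejecting the null suggests x is endogenous, in which case you would rely on the IV estimates rather than OLS.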