Hi everyone,
I am trying to estimate the probability that an individual becomes unemployed, using LFS micro-data. A selection-bias issue arises when the sample is restricted to individuals in the labour force (employed or unemployed). After consulting the literature, I found that those who are out of the labour force also have to be included in the analysis to avoid selection bias, as suggested by Heckman (1979). The model I am considering is therefore the following:
• LFP_i = X1i β1 + E1i (1) first stage
• UN_i = X2i β2 + E2i (2) second stage
• Where LFP is the probability that individual i participates in the labour force, UN is the probability that individual i is unemployed, X1i and X2i are the exogenous variables (demographic and socio-economic factors), β1 and β2 are the parameters to be estimated, and E1i and E2i are the error terms.
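To make the setup more concrete, here is a minimal sketch of the selection-correction term (the inverse Mills ratio) that Heckman's two-step approach derives from the first stage and adds as an extra regressor in the second stage. The coefficient values, the variables (constant, age, married) and the helper names are all hypothetical and only for illustration. In practice the first-stage coefficients would come from a probit fitted on the full sample. As far as I understand, with a binary outcome such as unemployment the two equations are usually estimated jointly rather than by plugging this term in, so this only illustrates the idea.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def inverse_mills(z: float) -> float:
    """Inverse Mills ratio phi(z) / Phi(z) for a participant.

    Simple version: not numerically robust for very negative z.
    """
    return normal_pdf(z) / normal_cdf(z)


def participation_index(x, coefs):
    """Linear index X1i * beta1 from the first-stage (participation) probit."""
    return sum(xi * ci for xi, ci in zip(x, coefs))


# Hypothetical first-stage coefficients: constant, age, married (assumed values).
beta1 = [-0.5, 0.02, 0.4]

# Hypothetical individuals: [constant, age, married].
individuals = [
    [1.0, 25.0, 0.0],
    [1.0, 40.0, 1.0],
    [1.0, 55.0, 0.0],
]

# One correction term per individual, to be added to the second-stage regressors.
imr = [inverse_mills(participation_index(x, beta1)) for x in individuals]

for x, value in zip(individuals, imr):
    print(x, round(value, 4))
```

The higher the predicted participation index, the smaller the correction term, which is why individuals who are very likely to participate contribute little selection correction.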
Now, the questions are:
1- Do I need an instrumental variable? If so, could you please suggest one? (I read that it should affect participation in the labour market, i.e. be statistically correlated with it, but should not affect the probability of unemployment in any way, i.e. be statistically uncorrelated with it.)
2- How does the inclusion and exclusion of variables work? I have come across some papers that include some variables and exclude others (I am not sure whether in the main model or in the probit, and I do not know why).
Any suggestions or answers to these questions would be greatly appreciated.