This is a technique for measuring the effect of a treatment using logistic regression with a binary dependent variable Y, where Y = 1 if the individual participates and Y = 0 otherwise. First, fit the logistic function:
(1) F(t) = 1 / (1 + exp(-t))
… where t is a linear function of the form:
(2) t = a + bX
… where a = y-axis intercept, b = slope, and X = independent variable. Note that t is not time in this equation; it plays the role of y in the conventional y = a + bX.
Note that F(t) produces an S-shaped (sigmoid) curve, also known as a logistic growth curve. To run the logistic regression above, one must first estimate the linear function (2).
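As a sketch of equations (1) and (2), the snippet below shows how F(t) maps the linear index t = a + bX to a probability between 0 and 1; the values of a and b are purely illustrative, not estimates:

```python
import math

def logistic(t):
    """Logistic function F(t) = 1 / (1 + exp(-t)); maps any real t to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

# Hypothetical intercept and slope for t = a + b*X
a, b = -2.0, 0.5

for X in (0, 4, 8):
    t = a + b * X          # the linear index, equation (2)
    print(X, round(logistic(t), 3))   # probability of participation
```

Plotting F(t) over a range of t values makes the S shape of the curve visible directly.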
Next, obtain the propensity score by calculating the probability of the event. Using Laplace's rule of succession, the probability of success is given by:
(3) P(s) = (s + 1) / (N + 2)
… where s = number of successes (participations) observed and N = total number of observed events. Once the propensity score is obtained, match it to the nearest non-participant score. Non-participation in this case is the probability of failure:
(4) P(f) = 1 – P(s)
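The Laplace probability of equation (3) and the nearest-score matching step can be sketched as follows; the individual identifiers and scores are hypothetical:

```python
def laplace_prob(s, N):
    """Laplace's rule of succession: P(s) = (s + 1) / (N + 2)."""
    return (s + 1) / (N + 2)

# Hypothetical propensity scores per individual
participants = {"p1": 0.71, "p2": 0.42}
nonparticipants = {"n1": 0.69, "n2": 0.40, "n3": 0.55}

# Match each participant to the non-participant with the closest score
matches = {
    pid: min(nonparticipants, key=lambda nid: abs(nonparticipants[nid] - score))
    for pid, score in participants.items()
}
print(matches)  # {'p1': 'n1', 'p2': 'n2'}
```

This is one-to-one nearest-neighbour matching with replacement; in practice a caliper (maximum allowed score distance) is often added so that poor matches are discarded.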
The next step is to use the difference-in-differences estimator's 2 × 2 table, described below.
DIFFERENCE IN DIFFERENCE (DID)
This is a technique used to measure the change induced by a treatment or stimulus through the use of a 2 × 2 table. The general model is given by:
(5) Y(ist) = gamma(s) + lambda(t) + delta*D(st) + e(ist)
… where Y(ist) = dependent (outcome) variable for individual i, indexed by s = state and t = time; gamma(s) = fixed effect of state s; lambda(t) = fixed effect of time t; delta = treatment effect; D(st) = treatment dummy variable; and e(ist) = error term.
Assume s = 1, 2 and t = 1, 2. The matrix of mean differences of the 2 × 2 table yields the estimator:
(6) d^ = (Ybar(2,2) - Ybar(2,1)) - (Ybar(1,2) - Ybar(1,1))
assuming that D22 = 1 and D11 = D12 = D21 = 0, where Ybar(s,t) is the mean outcome for state s at time t and d^ estimates the effect of the treatment D(st). This value of delta should be non-zero if the treatment has an effect.
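Under the assumption that only the (s = 2, t = 2) cell is treated, the estimator is just a double difference of the four group means. A minimal sketch with made-up means:

```python
# Hypothetical group means Y[(s, t)]: s = 1 control / 2 treated, t = 1 pre / 2 post
Y = {
    (1, 1): 10.0, (1, 2): 12.0,   # control group, before and after
    (2, 1): 11.0, (2, 2): 16.0,   # treated group, before and after
}

# d^ = (Ybar(2,2) - Ybar(2,1)) - (Ybar(1,2) - Ybar(1,1))
delta_hat = (Y[(2, 2)] - Y[(2, 1)]) - (Y[(1, 2)] - Y[(1, 1)])
print(delta_hat)  # 3.0
```

Here the treated group improved by 5 units and the control group by 2, so the estimated treatment effect is 3: the control group's change nets out the common time trend.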
If delta = 0, the treatment has no effect.
If delta > 0 or delta < 0, use a test statistic to determine the level of significance. Use the following guidelines when selecting an appropriate test statistic:
(a) t-test: small samples from an approximately normal population with unknown variance
(b) Z-test: large samples, or a normal population with known variance
(c) Chi-square test: tests on a variance, or on categorical (count) data
(d) F-test: comparing the variances of two populations or two samples.
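As one illustration of point (a), a two-sample t statistic (here in Welch's form, which allows unequal variances) can be computed from the outcomes of matched treated and control units; the data below are hypothetical:

```python
import math
from statistics import mean, variance

# Hypothetical outcomes for matched treated and control units
treated = [5.1, 4.8, 5.6, 4.9, 5.3]
control = [2.0, 2.4, 1.8, 2.2, 2.1]

# Welch's two-sample t statistic: difference in means over its standard error
n1, n2 = len(treated), len(control)
se = math.sqrt(variance(treated) / n1 + variance(control) / n2)
t_stat = (mean(treated) - mean(control)) / se
print(round(t_stat, 2))
```

The statistic is then compared against the t distribution (with Welch-adjusted degrees of freedom) to obtain a p-value.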
Hey Taofeek, what you need is a sample of individuals that have obtained microcredits and a sample that have not obtained microcredits but are eligible. Then, hope that you are able to explain the selection of individuals into microcredits (i.e., the propensity score should have explanatory power). Finally, compare the outcomes (whichever you like) for individuals that have obtained microcredits with similar individuals (based, e.g., on the propensity score) that have not obtained microcredits. Et voilà! (at least if the CIA holds, i.e., if there is only selection on observables). psmatch2 (in Stata) does this for you. However, under these conditions a regression will also be consistent (though not efficient). Best, Alfred