Hello,
I am a doctoral student currently working on a mediation model in which X is the treatment, M is the mediator, and Y is the outcome variable. I also use Z as the single instrumental variable in the model. I use the ivmediate command in Stata 14.2 (Dippel et al., 2020*).
Even though my initial results are promising, I am concerned about the high correlation between M and Y (significant at the 5% level). In another paper, which lays out the theory behind the ivmediate command, Dippel and colleagues (2019)** explain that their identification strategy allows the error terms of X and M, M and Y, and X and Y to be correlated conditional on the error term of M (implying that M is endogenous). So, the model does not assume away endogeneity in any of the three key relationships. They argue that the model involves three exclusion restrictions, one for each of the three targeted causal relationships, X->M, X->Y, and M->Y, which are estimated as three separate 2SLS regressions.
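To make the setup concrete, here is a minimal sketch of the three 2SLS regressions as I understand them, run on simulated placeholder data rather than my actual data. The variable names (x, m, y, z, u, v), the coefficients, and the data-generating process are my own assumptions for illustration and are not taken from the papers. In particular, treating X as an exogenous control in the third regression is my reading of the approach. The ivmediate call is left commented out because it is a user-written command that must be installed first, and the option names shown reflect my understanding of its syntax.

```stata
* Simulated placeholder data (assumed DGP, for illustration only)
clear
set seed 12345
set obs 5000
generate u = rnormal()                      // unobserved confounder shared by X and M
generate v = rnormal()                      // unobserved confounder shared by M and Y
generate z = rnormal()                      // single instrument
generate x = z + u + rnormal()              // treatment
generate m = 0.5*x + u + v + rnormal()      // mediator
generate y = 0.5*m + 0.3*x + v + rnormal()  // outcome

* Raw correlation between mediator and outcome
pwcorr m y, sig

* 1) X -> M: treatment instrumented by Z
ivregress 2sls m (x = z)

* 2) X -> Y (total effect): treatment instrumented by Z
ivregress 2sls y (x = z)

* 3) M -> Y: mediator instrumented by Z, controlling for X
ivregress 2sls y x (m = z)

* Same model in one call (requires ivmediate to be installed; syntax as I understand it)
* ivmediate y, mediator(m) treatment(x) instrument(z)
```

In this sketch, M and Y share the unobserved confounder v, so their raw correlation mixes the causal effect of M on Y with confounding. My question is whether the third regression is enough to address that.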
I would appreciate your opinions on whether this high and significant correlation between M and Y is a problem. In other words, do you think the use of the IV accounts for the endogeneity reflected in this correlation? I would also be glad to hear any suggestions for possible solutions. Thank you very much in advance for your attention and help.
Stay safe and healthy!
* Dippel, Christian, Andreas Ferrara, and Stephan Heblich. "Causal mediation analysis in instrumental-variables regressions." The Stata Journal 20, no. 3 (2020): 613-626.
** Dippel, Christian, Robert Gold, Stephan Heblich, and Rodrigo Pinto. "Mediation analysis in IV settings with a single instrument." Mimeo (2019).