Can we apply SEM to our data to discover a path model based on exploratory factor analysis or should we develop a proposed path for our exogenous and endogenous variables and use path analysis to test the "goodness" of the path model?
I am a bit confused, as you mix factor analysis with a path model (which suggests a full latent variable model). If you have no prior theory/set of hypotheses for the structural effects among factors, there is the option to use the open-source software TETRAD, which prints all models belonging to the same equivalence class (i.e., models with the same implications for the data). This at least provides some basis to develop the theory or to plan further studies. In some cases (depending on the data/structure), the output may even be quite clear regarding a causal effect.
If this is what you meant, I would start with this 2-hour video by Richard Scheines, and with these papers:
Eberhardt, F. (2017). Introduction to the foundations of causal discovery. International Journal of Data Science and Analytics, 3(2), 81-91. doi:10.1007/s41060-016-0038-6
Malinsky, D., & Danks, D. (2018). Causal discovery algorithms: A practical guide. Philosophy Compass, 13(1), 1-11. doi:10.1111/phc3.12470
Exploration of factor models, in contrast, is done with EFA, but I guess you know that, so I don't need to say anything about it. I will also resist my urge to highlight its problems :)
In general, you need to have a predetermined path model to specify an SEM. There is no "exploratory" method for developing a path model.
You might think of it this way: once you use an EFA (or a theoretically derived CFA) to develop a measurement model for the latent variables in your path model, that combination is an SEM.
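To make the combination concrete, here is a minimal simulation sketch in plain Python. All numbers are invented for illustration: a structural path `beta` links two latent variables, and a measurement model (loading `lam` plus unique error) links each latent to an observed indicator. The point is just that the structural effect between latents propagates into the correlation between their indicators, attenuated by the loadings.

```python
import random

random.seed(0)
N = 20000
beta = 0.6   # hypothetical structural path: eta1 -> eta2
lam = 0.8    # hypothetical loading, same for both indicators

x1, y1 = [], []
for _ in range(N):
    # structural model: exogenous and endogenous latents (unit variance)
    eta1 = random.gauss(0, 1)
    eta2 = beta * eta1 + random.gauss(0, (1 - beta**2) ** 0.5)
    # measurement model: indicator = loading * latent + unique error
    x1.append(lam * eta1 + random.gauss(0, (1 - lam**2) ** 0.5))
    y1.append(lam * eta2 + random.gauss(0, (1 - lam**2) ** 0.5))

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n
    va = sum((u - ma) ** 2 for u in a) / n
    vb = sum((v - mb) ** 2 for v in b) / n
    return cov / (va * vb) ** 0.5

# implied cross-correlation: lam * beta * lam = 0.8 * 0.6 * 0.8 = 0.384
print(round(corr(x1, y1), 2))
```

The cross-indicator correlation comes out near the implied value 0.384, which is exactly the kind of constraint an SEM imposes on the observed covariance matrix.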
If yes, you may argue (correctly) that in most cases you won't get a precise answer (only the equivalence class), but actually this is an important outcome, as it highlights that every (cleanly fitting) model has equivalent versions.
With an EFA, by contrast, you only get a seemingly clear answer because the model structure (i.e., the common-factor structure) is predefined, and that assumption may be wrong.
Yes, I read your post, but if an author doesn't have any path model at all, what does it mean to generate all possible models in an equivalence class? Isn't this just a matter of permuting the variables into all their possible orderings?
And of course there is always going to be difficulty in proving whether any given path model is the "best" fit to the data. So the goal is to determine whether the specified model does indeed fit the data (a classic instance of null hypothesis testing).
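For what it's worth, that null hypothesis test is usually the likelihood-ratio chi-square: T = (N - 1) * F_ML, where F_ML = ln|Sigma| - ln|S| + tr(S Sigma^-1) - p compares the model-implied covariance matrix Sigma to the sample matrix S. A toy sketch in plain Python (the 2x2 matrices and sample size are made up for illustration):

```python
import math

def fit_ml(S, Sigma):
    """ML discrepancy F = ln|Sigma| - ln|S| + tr(S Sigma^-1) - p, for 2x2 matrices."""
    p = 2
    det = lambda M: M[0][0] * M[1][1] - M[0][1] * M[1][0]
    d = det(Sigma)
    inv = [[Sigma[1][1] / d, -Sigma[0][1] / d],
           [-Sigma[1][0] / d, Sigma[0][0] / d]]
    trace = sum(S[i][k] * inv[k][i] for i in range(p) for k in range(p))
    return math.log(det(Sigma)) - math.log(det(S)) + trace - p

S = [[1.0, 0.5], [0.5, 1.0]]                     # hypothetical sample covariance
perfect = fit_ml(S, [[1.0, 0.5], [0.5, 1.0]])    # implied == observed -> F = 0
wrong   = fit_ml(S, [[1.0, 0.0], [0.0, 1.0]])    # model wrongly forces zero covariance

N = 100
print((N - 1) * perfect)  # chi-square statistic of 0: "perfect" fit
print((N - 1) * wrong)    # large positive statistic: the model is rejected
```

A non-significant chi-square means only that the data fail to reject the specified model, not that it is the best model; any equivalent model would produce the very same statistic.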
Plus, I'm trying to imagine my response as a reviewer if someone submits an article that says, "Rather than specifying a theoretical model for my path analysis, I used a program that generated all possible models within a given equivalence class." Sounds like a candidate for my wastebasket.
You just need the correlation matrix of the variables, and TETRAD will print a single path diagram that represents all equivalent models matching this particular correlation matrix. These models share the same "skeleton" (generic structure) but differ in their causal directions (marked by a special style of drawing the edges between two variables). Most edges, unfortunately, will be marked as totally ambiguous; that is, x may affect y, y may affect x, or both may result from an unobserved confounder. Others may be partially "oriented," and still others may have a fixed direction, meaning that in all models of the class the link between the two respective variables can only go in a certain direction.
And no, this is not based on simply permuting the variables but, as mentioned, on identifying the equivalence class by means of graphical principles (i.e., d-separation).
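The underlying principle (due to Verma and Pearl) is that two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures (colliders a -> c <- b with a and b non-adjacent). A minimal sketch in plain Python, with hypothetical three-variable models:

```python
def skeleton(dag):
    """Undirected version of the edge set."""
    return {frozenset(e) for e in dag}

def v_structures(dag):
    """Colliders a -> c <- b where a and b are non-adjacent (Verma & Pearl)."""
    skel = skeleton(dag)
    parents = {}
    for a, c in dag:
        parents.setdefault(c, set()).add(a)
    colliders = set()
    for c, ps in parents.items():
        for a in ps:
            for b in ps:
                if a < b and frozenset((a, b)) not in skel:
                    colliders.add((frozenset((a, b)), c))
    return colliders

def markov_equivalent(d1, d2):
    return skeleton(d1) == skeleton(d2) and v_structures(d1) == v_structures(d2)

chain          = {("X", "Y"), ("Y", "Z")}  # X -> Y -> Z
reversed_chain = {("Y", "X"), ("Z", "Y")}  # X <- Y <- Z
fork           = {("Y", "X"), ("Y", "Z")}  # X <- Y -> Z
collider       = {("X", "Y"), ("Z", "Y")}  # X -> Y <- Z

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, fork))            # True
print(markov_equivalent(chain, collider))        # False
```

This is why the chain, the reversed chain, and the fork all fit the same correlation matrices equally well (the X--Y direction is ambiguous), while the collider is distinguishable from them, and it is exactly this kind of information that the equivalence-class output encodes.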
With regard to the progression, you are (sorry) creating a strawman in a rather despicable manner. I would of course not recommend submitting this output as is, as that would probably result in an immediate rejection (unfortunately, since knowing the partial ambiguity of certain parts of a model is of high interest for the field, rather than wholesale accepting or rejecting the whole model). Some parts may be completely ambiguous (thus demanding future research) while others may have solid evidence.
My ideal progression would be to start with a confirmatory approach WHEN your theory is so solid that you can come up with a predefined model. In most applications of path models or SEM, this is not the case. Why not start with a pre-study using TETRAD and then, based on the result, enlarge or extend the model in such a way that the ambiguous parts become less ambiguous (e.g., by incorporating instruments or by planning an intervention)? Likewise, a failed confirmatory test of a predefined model could be followed by a TETRAD analysis to find possible errors. Science is always a cycle between deduction and abduction. Model testing vs. exploration is no different.
It is interesting that if you have a latent factor model and constrain some of the loadings to zero but freely estimate others, it is called confirmatory. It might be good to have terms for fully confirmatory (where you specify all the loadings), semi-exploratory or semi-confirmatory (where you fix some, probably to zero, but freely estimate others), and fully exploratory. You seldom have a fully confirmatory model; in CFA there are often many things not pre-specified. Sorry, this is a bit unrelated to the question, but I thought the thread of responses was interesting.