Selecting "candidates" for Latent Variables: linear PLS path modeling setting

Tom Booth

Hi Both,

A couple of things and maybe some suggestions. Components derived from PCA are closer to formative constructs in SEM. A standard (reflective) SEM latent variable is akin to a common factor model: it models the shared covariance between indicators. This distinction is important because the two make different claims about the associations between the indicators and the latent variable.
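A small numerical sketch of this distinction, assuming a hypothetical one-factor population with four indicators and equal loadings of 0.6 (values chosen only for illustration, not taken from any data in this thread). It builds the correlation matrix implied by the common factor model and then extracts the first principal component from it:

```python
import numpy as np


def pca_first_loadings(corr):
    """Loadings of the first principal component of a correlation matrix."""
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    vec = eigvecs[:, -1]
    if vec.sum() < 0:  # fix the arbitrary sign of the eigenvector
        vec = -vec
    return np.sqrt(eigvals[-1]) * vec


# Hypothetical reflective (common factor) loadings: an assumption for illustration.
true_loadings = np.array([0.6, 0.6, 0.6, 0.6])

# Correlations implied by the one-factor model: shared covariance off the
# diagonal, unit variances (shared + unique) on the diagonal.
corr = np.outer(true_loadings, true_loadings)
np.fill_diagonal(corr, 1.0)

pca_loadings = pca_first_loadings(corr)
print("factor loadings:", true_loadings)
print("PCA loadings:   ", np.round(pca_loadings, 3))
```

In this setup the PCA loadings come out at about 0.72 rather than 0.6, because the component summarises total variance (including each indicator's unique variance), whereas the common factor accounts only for the covariance the indicators share.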

However, on a more practical note for your issue, Alexander: you can be driven either by theory or by data. If driven by data, applying a data reduction method to the brain regions and modelling the results as variables in your SEM would be OK. Another option is to go with theory, perhaps by looking at network studies that suggest connections between regions known to be important for your particular sample/tests.

My other question would be: what cognitive measures do you have? This could be important for the overall modelling strategy. Do you also have comparable data from controls?

Tom

Mohammad Tahir

Dear Alexander V Lebedev!

I am as new to SEM as you are. However, there are two kinds of variables in SEM: those that are observed (measured variables) and those that are not observed (arbitrary variables, i.e. the latent variables). Variables that have paths pointing to them are called endogenous variables, while those with no paths pointing to them are exogenous variables.

I think PCA will already have given you some idea through the principal components, i.e. you will have named them. I think you can use them as your latent variables. The following excerpt from Wikipedia may be helpful.

....."In statistics, latent variables (as opposed to observable variables), are variables that are not directly observed but are rather inferred (through a mathematical model) from other variables that are observed (directly measured). Mathematical models that aim to explain observed variables in terms of latent variables are called latent variable models. Latent variable models are used in many disciplines, including psychology, economics, machine learning/artificial intelligence, bioinformatics, natural language processing,Management and the social sciences.

Sometimes latent variables correspond to aspects of physical reality, which could in principle be measured, but may not be for practical reasons. In this situation, the term hidden variables is commonly used (reflecting the fact that the variables are "really there", but hidden). Other times, latent variables correspond to abstract concepts, like categories, behavioral or mental states, or data structures. The terms hypothetical variables or hypothetical constructs may be used in these situations............"

http://en.wikipedia.org/wiki/Latent_variable.

Alexander V Lebedev

Hello Tahir,

Thank you for your response. I should have clarified this from the very beginning. I am looking for the best strategy to identify candidate indicators to form my LVs (they are going to be reflective constructs). For clinical data, I can use several scales to define one LV (for example, the NPId and MADRS depression assessment scales to define the LV "Depression"). For imaging data this is less clear. You can, say, define "striatum" as an LV and include the caudate nuclei and putamen (neuroanatomically and functionally related brain structures). On the other hand, you can define the "striato-pallidal" system as an LV and include the caudate and putamen, but also the globus pallidus. Therefore, I was wondering whether there are any unbiased solutions for doing this. Of course, I can try different combinations and afterwards just select the best one, but I think this is a very biased approach.
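To make the comparison of candidate blocks concrete, here is a minimal sketch of two checks often reported for reflective blocks: Cronbach's alpha and the first two eigenvalues of the block's correlation matrix. Everything in it is an assumption for illustration: the data are simulated, the variable names are hypothetical, and the weak relation of the third indicator is an arbitrary choice, not a claim about anatomy. Comparing blocks this way only makes the choice explicit; it does not by itself remove the selection bias described above.

```python
import numpy as np


def cronbach_alpha(block):
    """Cronbach's alpha for an (n_subjects, n_indicators) array."""
    k = block.shape[1]
    item_var = block.var(axis=0, ddof=1).sum()
    total_var = block.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var / total_var)


def top_two_eigenvalues(block):
    """Two largest eigenvalues of the block's correlation matrix."""
    corr = np.corrcoef(block, rowvar=False)
    eig = np.sort(np.linalg.eigvalsh(corr))[::-1]
    return eig[0], eig[1]


# Simulated, hypothetical regional measures (NOT real imaging data).
rng = np.random.default_rng(1)
n = 500
common = rng.normal(size=n)
caudate = 0.8 * common + 0.6 * rng.normal(size=n)
putamen = 0.8 * common + 0.6 * rng.normal(size=n)
pallidum = 0.2 * common + 0.98 * rng.normal(size=n)  # arbitrary weak relation

candidates = {
    "striatum": np.column_stack([caudate, putamen]),
    "striato_pallidal": np.column_stack([caudate, putamen, pallidum]),
}

# For each candidate block: (alpha, first eigenvalue, second eigenvalue).
report = {
    name: (cronbach_alpha(block), *top_two_eigenvalues(block))
    for name, block in candidates.items()
}
for name, (alpha, eig1, eig2) in report.items():
    print(f"{name}: alpha={alpha:.2f}, eig1={eig1:.2f}, eig2={eig2:.2f}")
```

A block is usually read as roughly unidimensional when the first eigenvalue is clearly above 1 and the second is below 1. Running such checks on one half of the sample and confirming the chosen block on the other half is one way to limit the bias of picking the best-looking combination.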

Garett N Howardson

Hi Alexander,

I think what you describe (PCA, then R^2) is a piecewise approach to answering your questions. You may want to check out exploratory causal modeling, which can help you model both the latent (i.e., PCA) and structural (i.e., R^2) aspects of your data. The TETRAD project is a good example of such methods, but there are many others.

http://www.phil.cmu.edu/projects/tetrad/

I think Mohammad also has a point: people often don't consider whether what they are trying to measure is actually latent. In your case, the imaging data truly are indicators of latent variables, because you are not measuring brain structures per se but the chemical/electrical signals of those structures (as far as my limited understanding of neuroimaging goes). Thus, it seems reasonable to treat your imaging data as manifest indicators of brain structures, and exploratory causal modeling might be helpful.
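For reference, the piecewise approach mentioned above (PCA first, then R^2) can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, and the names `regions` and `cognition` are hypothetical stand-ins for a block of imaging measures and a cognitive score.

```python
import numpy as np

# Simulated, hypothetical data (NOT real measurements).
rng = np.random.default_rng(0)
n = 300
latent = rng.normal(size=n)
regions = np.column_stack(
    [0.8 * latent + 0.6 * rng.normal(size=n) for _ in range(4)]
)
cognition = 0.5 * latent + np.sqrt(0.75) * rng.normal(size=n)


def first_pc_scores(x):
    """Step 1 (measurement part): scores on the first principal component."""
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[0]


def r_squared(predictor, outcome):
    """Step 2 (structural part): R^2 of an OLS regression with intercept."""
    design = np.column_stack([np.ones_like(predictor), predictor])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coef
    return 1.0 - residuals.var() / outcome.var()


scores = first_pc_scores(regions)
r2 = r_squared(scores, cognition)
print(f"R^2 of cognition on the first PC score: {r2:.3f}")
```

The two steps are estimated separately, so the component weights are fixed before the structural relation is fitted. That separation is what makes the approach piecewise, in contrast to methods that estimate the measurement and structural parts together.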

Alexander V Lebedev

Thank you very much for your response, Tom! This is exactly what I have been looking for. I have provided more details in a personal message.

Rakesh Pandey

I think it would be really difficult to use structural (anatomical) measures as indicators of latent variables in SEM, even though doing so may give a good model fit. The problem will be how to interpret or name a latent variable whose indicators are structural rather than functional. Distance-based multivariate methods, such as multidimensional scaling or cluster analysis, may be of more help. But this suggestion is based on my reasoning with very little knowledge of statistics, not on expertise in the area.

If your indicators are functional in nature, then I think the SEM approach may be more helpful, and you may label each latent variable as a specific neural circuit.
