Correlation coefficients from individuals as dependent variables; valid?

22 September 2020

Hi everyone,

I have been struggling with the analysis of a dataset I am working with, and I believe I may have thought of a possible solution. However, after a bit of searching I have not been able to find any published research using a similar analysis method, so I would appreciate your input on whether it is valid.

I am studying a species of frog that calls in bouts: a frog calls a number of times before falling silent, and then after a certain amount of time starts calling again (i.e. starts a new bout). I want to investigate whether characteristics of their calls change throughout bouts, and what factors influence this. E.g. does call loudness decrease during bouts, and is this effect more pronounced in longer bouts (bouts containing a greater number of calls)?

Unfortunately, I only have 1 bout per male (the dataset is old, and even getting/transcribing 1 bout per male was a laborious task back before digital recorders, or so I'm told), giving me 25 bouts/males in total. I had initially just thought I would include characteristics of all calls from each bout as a dependent variable in an LMM, and include an interaction term between my variable of interest and bout progress (representing the percentage of the way through the whole bout that each call occurred) to see if this interaction was significant. To avoid pseudo-replication (due to many calls from each male within their single bout), I had wanted to include male ID as a random effect, but I came to realize that because each male only has a single bout in the dataset, this is also, in effect, including factors I'm interested in looking at (e.g. bout length) as random effects too. Thus, I thought of the following solution:

For each bout, I could fit a simple LM of e.g. call loudness vs progress through the bout, and then store each male's individual beta (slope) in a dataframe, thus compressing the change in each male's calls over a bout into a single value. I could then use these betas as the dependent variable in an LM, and include the variables I am interested in as independent variables. Thus, if loudness decreases at a greater or lesser rate in longer bouts, I would be able to see this as a significant relationship between the males' betas and their bout lengths (e.g. if longer bouts correspond significantly to more strongly negative betas). Obviously, only having 1 bout per male is not ideal, as you can't be certain that idiosyncrasies among males aren't driving the differences, but by including factors known to influence male calling as covariates, I think this could still be a decent preliminary analysis.
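To make the two-stage idea concrete, here is a minimal sketch in Python using only the standard library. Everything in it is hypothetical: the male IDs, the loudness values, and the helper name `ols_fit` are invented for illustration and do not come from the real dataset.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: one bout per male; each call is (bout progress 0-1, loudness).
bouts = {
    "male_A": [(0.0, 80.0), (0.5, 79.0), (1.0, 78.0)],
    "male_B": [(0.0, 82.0), (0.25, 81.0), (0.5, 80.0), (0.75, 79.0), (1.0, 78.0)],
    "male_C": [(i / 6, 84.0 - 6.0 * (i / 6)) for i in range(7)],
}

# Stage 1: one regression per bout, keeping only the slope (beta) for each male.
stage1 = {}
for male, calls in bouts.items():
    progress = [p for p, _ in calls]
    loudness = [l for _, l in calls]
    _, beta = ols_fit(progress, loudness)
    stage1[male] = {"bout_length": len(calls), "beta": beta}

# Stage 2: regress the per-male betas on a bout-level predictor (here bout length).
lengths = [row["bout_length"] for row in stage1.values()]
betas = [row["beta"] for row in stage1.values()]
intercept2, slope2 = ols_fit(lengths, betas)

print(stage1)
print("stage-2 slope of beta on bout length:", slope2)
```

With these invented numbers the stage-2 slope comes out at about -1, i.e. each extra call in a bout goes with a steeper decline in loudness. In a real analysis the second stage would also carry the covariates and a significance test, which this sketch leaves out.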

Is this valid? Mainly: does it make sense to have a correlation coefficient as a dependent variable? I get that all of my individual LMs that generate betas for each male would have to meet assumptions etc. Would regular correlation coefficients (from a Pearson or Spearman correlation) be better than betas from an LM?
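One point that may help when weighing betas against correlation coefficients: a slope measures the rate of change, while Pearson's r also reflects how tightly the calls scatter around the line. The sketch below uses two invented bouts (the values and the helper name `slope_and_r` are hypothetical) that share the same slope but differ in scatter.

```python
from math import sqrt
from statistics import mean

def slope_and_r(x, y):
    """Return (OLS slope of y on x, Pearson correlation between x and y)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / sqrt(sxx * syy)

progress = [0.0, 0.25, 0.5, 0.75, 1.0]

# Hypothetical bout 1: loudness falls exactly linearly.
clean = [80.0, 79.0, 78.0, 77.0, 76.0]
# Hypothetical bout 2: same underlying decline plus scatter that leaves the slope unchanged.
noisy = [81.0, 78.0, 78.0, 76.0, 77.0]

slope_clean, r_clean = slope_and_r(progress, clean)
slope_noisy, r_noisy = slope_and_r(progress, noisy)

print("clean bout: slope", slope_clean, "r", r_clean)
print("noisy bout: slope", slope_noisy, "r", r_noisy)
```

Both bouts have the same slope, but r is weaker for the noisier one. So if the question is about how fast loudness changes, the slope seems the more direct summary; r mixes rate of change with within-bout noise.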

If you have seen similar analysis elsewhere, I'd appreciate the references.

Thanks for reading this, and thanks for any info you can provide!

David Eugene Booth

The problem, I think, is that you would likely have high multicollinearity among your IVs, which would affect your ability to predict bout length. Whether something like adaptive lasso or ridge regression would be helpful is probably an open question. Further, your correlation coefficients would also be measured with error that would not be constant across individuals, again probably leading to some open questions. I think the idea is worth trying if you are willing to put up with tons of questions from reviewers. Good luck, David. BTW, I know of no such papers, but that means little.
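One possible way to acknowledge that unequal measurement error, offered here as an assumption rather than something proposed in this reply, is to weight each male's slope by the inverse of its estimated sampling variance in the second-stage regression. The sketch below is standard-library Python with invented data and hypothetical helper names (`slope_and_variance`, `weighted_fit`). It ignores true between-male variation in slopes, so it is a rough illustration only.

```python
from statistics import mean

def slope_and_variance(x, y):
    """OLS slope of y on x and the estimated sampling variance of that slope."""
    n = len(x)  # needs at least 3 calls so that n - 2 > 0
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, (sse / (n - 2)) / sxx

def weighted_fit(x, y, w):
    """Weighted least-squares line y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    num = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = num / den
    return my - b * mx, b

# Hypothetical bouts: (progress values, loudness values), one bout per male.
bouts = [
    ([0.0, 0.5, 1.0], [80.0, 79.5, 78.0]),
    ([0.0, 0.25, 0.5, 0.75, 1.0], [82.0, 81.2, 79.8, 79.1, 78.0]),
    ([i / 6 for i in range(7)], [84.0, 83.2, 81.9, 81.1, 80.0, 78.8, 78.1]),
]

lengths, betas, weights = [], [], []
for progress, loudness in bouts:
    beta, var_beta = slope_and_variance(progress, loudness)
    lengths.append(len(progress))
    betas.append(beta)
    weights.append(1.0 / var_beta)  # more precisely estimated slopes count for more

a_w, b_w = weighted_fit(lengths, betas, weights)
print("weighted stage-2 slope of beta on bout length:", b_w)
```

Slopes from short or noisy bouts get less weight than slopes from long, clean ones. Whether this is adequate for the real data is exactly the kind of open question raised above.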

Abiodun Christian Ibiloye

@Luke, for a mixed model, R2beta is obviously the ideal choice (the r2beta function is available in the r2glmm package for R).

I don't know of similar research either, but yes, including factors known to influence male calling as covariates "could still be a decent preliminary analysis."

Good luck.

Jorge Ortiz Pinilla

Luke Larter,

I think it would be useful for you to read something about functional data analysis.

Functional Data Analysis may not be applicable to the correlation(?) DV. What type of correlation (information) should the DV data take? Perhaps we need to know his precise research question and objectives. Are you comparing the degree of causal impacts, or assessing "pairwise correlation" (looking at direction and strength) or "goodness of fit" for your DV (just one bout per male frog, and its length)?

Your long explanation is confusing. You wish to investigate whether characteristics of their calls (length, pitch, etc.) change throughout bouts (one for each of the n = 25 frogs), linearly adding a covariate. I am looking at a data frame of n = 25 with two factor columns: covariate (1 level) and bout (with as many levels as there are characteristics, 2 or more)... However, you know best what answers you need.
