Post-pre change or post scores in predicting treatment response give different cross-validated r-squared. Which should I consider?

Jochen Wilhelm

I think any flavour of r² is not a very useful metric in practice. A clinician would more likely want to know a credible range for the post score, or possibly for the score change. Depending on the application and on the other things the clinician has to consider, it may be good to have ranges for several degrees of credibility, such as 50%, 90%, and 99%. I think that would be most useful.
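As a rough illustration of reporting ranges at several levels, here is a minimal sketch. It swaps in a simpler technique than a Bayesian credible interval: it takes empirical quantiles of out-of-sample (cross-validated) residuals and adds them to a point prediction. The function name `residual_intervals`, the simulated residuals, and the point prediction of 20.0 are all hypothetical placeholders, not anything from the original question.

```python
import numpy as np

def residual_intervals(point_prediction, cv_residuals, levels=(0.50, 0.90, 0.99)):
    """Return {level: (lower, upper)} built from empirical quantiles of
    out-of-sample residuals (observed minus predicted)."""
    intervals = {}
    for level in levels:
        tail = (1.0 - level) / 2.0
        lo, hi = np.quantile(cv_residuals, [tail, 1.0 - tail])
        intervals[level] = (point_prediction + lo, point_prediction + hi)
    return intervals

# Placeholder data: in practice these would be the residuals collected
# from the outer loop of the cross-validation, not simulated values.
rng = np.random.default_rng(0)
cv_residuals = rng.normal(0.0, 5.0, size=500)

ranges = residual_intervals(20.0, cv_residuals)
for level, (lower, upper) in ranges.items():
    print(f"{level:.0%} range: {lower:.1f} to {upper:.1f}")
```

This assumes the residuals are roughly exchangeable across patients; if the error size depends on the predictors, a single set of quantiles will be too wide for some patients and too narrow for others.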

James Renwick Beattie

The difference between the post-pre score and the post score is that the former takes baseline variation into account, while the latter doesn't.

This means that the post score is a combination of two sources of variation: the baseline score and the change induced by the intervention.

A regression model of the effect of the intervention would be expected to give zero information on baseline status (unless some bias is present). This means that using the post score increases the total variance but adds zero explained variance, leading to a lower R² as you observed.

The same reasoning carries over to Jochen's answer. I am not sure why you are interested in R² either, but statistical tests are based on comparing variations, so the post score will always be noisier than the post-pre score because it mixes two (hopefully) independent sources of variation.
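The variance argument can be checked with a small simulation. This is only a sketch under stated assumptions: the baseline is independent of both the predictor and the change, the model is a simple linear regression, and all numbers (sample size, standard deviations, the slope of 5) are made up for illustration. The helper `cv_r2` is hypothetical and computes a k-fold cross-validated R² against the training-fold mean.

```python
import numpy as np

def cv_r2(x, y, n_folds=5, seed=0):
    """k-fold cross-validated R^2 for a simple linear regression of y on x."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    preds = np.empty(len(y))
    null_preds = np.empty(len(y))
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        slope, intercept = np.polyfit(x[train], y[train], 1)
        preds[test] = intercept + slope * x[test]
        null_preds[test] = y[train].mean()  # reference model: training mean
    sse = np.sum((y - preds) ** 2)
    sst = np.sum((y - null_preds) ** 2)
    return 1.0 - sse / sst

rng = np.random.default_rng(42)
n = 2000
predictor = rng.normal(size=n)
baseline = rng.normal(50.0, 10.0, size=n)               # pre score, unrelated to predictor
change = 5.0 * predictor + rng.normal(0.0, 5.0, size=n)  # post - pre
post = baseline + change

r2_change = cv_r2(predictor, change)
r2_post = cv_r2(predictor, post)
print(f"CV R^2, change score: {r2_change:.2f}")
print(f"CV R^2, post score:   {r2_post:.2f}")
```

With these settings the explained variance is 25 in both cases, but the total variance is about 50 for the change score and about 150 for the post score, so the expected R² is roughly 0.5 versus roughly 0.17. If the baseline were correlated with the predictor or with the change, the gap would look different.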

Massimiliano Grassi

Thank you Jochen! You are absolutely right from a clinical point of view.

And thank you James for explaining the difference between the two R² values!