Dear Iman, If you are calculating the R2 based on the observed data (and the values predicted by the model), it represents the fraction of the total variation that is explained by your model: R2 = SSmodel/SStotal, where SS = sum of squares. The RMSE is the square root of the sum of squared residuals divided by n, i.e. the square root of the average squared residual; in other words, a measure of the variation not explained by the model.
For linear models, R2 = SSmodel/SStotal = 1 - SSresidual/SStotal and RMSE = sqrt(SSresidual/n), so the higher the R2, the lower the RMSE.
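As a quick numeric check of these identities, here is a minimal Python sketch with made-up data (the slope, intercept, and noise level are arbitrary choices for illustration):

```python
import numpy as np

# Hypothetical data: a noisy linear relationship
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)

# Ordinary least-squares fit and predictions
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

ss_total = np.sum((y - y.mean()) ** 2)   # SStotal
ss_resid = np.sum((y - y_hat) ** 2)      # SSresidual
r2 = 1 - ss_resid / ss_total             # R2 = 1 - SSresidual/SStotal
rmse = np.sqrt(ss_resid / y.size)        # RMSE = sqrt(SSresidual/n)

print(r2, rmse)  # for a fixed SStotal, a higher R2 forces a lower RMSE
```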
I may be missing something, but I cannot figure out how R2 and RMSE could increase simultaneously.
Are you willing to share additional details (for instance, the observed and predicted data) as well as the reported R2 and RMSE values?
As you know, MSE and RMSE depend only on the suitability of the model: when the data are predicted accurately, the sum of squared differences between measured and predicted values will be low.
R-square depends on the suitability of the model and also on the total sum of squares, i.e. the variability of the data. Thus, when analyzing two data sets with different variability, the R-square may differ even when the RMSE is equal, as the sketch below shows.
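A minimal Python sketch of this point, assuming two hypothetical data sets that share the same noise (so the RMSE against the true line is essentially identical) but differ in spread:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
noise = rng.normal(scale=1.0, size=x.size)

# Same noise in both sets, but very different total variability (SStotal)
y_flat  = 0.2 * x + noise   # little spread: small SStotal
y_steep = 5.0 * x + noise   # large spread: large SStotal

for y, y_hat in [(y_flat, 0.2 * x), (y_steep, 5.0 * x)]:
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    print(rmse, r2)  # RMSE is ~1 in both cases; R2 is much lower for the flat set
```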
Other factors are the number of variables and the number of data points used in each model. When comparing two models, these are not necessarily equal; each model has its own conditions and limitations for retaining and using variables.
Dear Ali, I guess we are assuming the same data set but different models. Could you please clarify the point you made regarding different parameters in the models? It is not clear to me how the RMSE or R2 calculation depends on the number of parameters/variables in the model. Thanks in advance, Luis
Actually, I used the same data set for both models. The only thing I just realized is that I used different types of cross-validation for the two models: for PLSR the cross-validation was full (leave-one-out), but for the ANN it was random (the MATLAB nftool default).
Could this difference between cross-validation schemes cause the confusion?
Dear Iman, Your first question was misleading, as you are not dealing with R2 but with Q2 (the predictive ability of your models). In that case, Q2 and RMSE do not necessarily change in opposite directions; that is, you may have a higher Q2 and a higher RMSE, or vice-versa.
Additionally, the validation procedure used is obviously a factor to take into account when comparing the suitability of models, but it may not be the actual reason for the differences (or the only one). Did you check the residuals?
Lastly, I should make clear that I do not fully agree with Ali's answer (if I understood it properly), as neither the R2 nor the RMSE reflects the suitability of the model. I therefore consider the statement "As you know MSE and RMSE only depends on the suitability of models" incorrect, because they also depend on the precision of the data. In fact, RMSE should only be used once the fitted model has been judged "valid" or suitable, i.e. shows no lack-of-fit.
R2 and Q2 do not depend exclusively on the model but also on the data variability (precision). A proper/suitable model may lead to a "low" R2 or Q2 if the data show low precision, and a statistically wrong/unsuitable model can lead to a high R2/Q2. Take a look, for instance, at the "Anscombe quartet" (google it). If needed I can send you some references and/or examples.
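A short Python sketch of the Anscombe point, using two of the quartet's published data sets: set I is plausibly linear, while set II is clearly a curve, yet a straight-line fit gives essentially the same slope, intercept, and R2 for both:

```python
import numpy as np

# Anscombe's sets I and II share the same x values
x  = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])

for y in (y1, y2):
    slope, intercept = np.polyfit(x, y, 1)
    y_hat = slope * x + intercept
    r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    # both sets give roughly slope 0.5, intercept 3.0, R2 0.67,
    # even though a straight line is clearly wrong for set II
    print(round(slope, 3), round(intercept, 2), round(r2, 3))
```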
The first question was correct, and I am dealing with R2 and RMSE.
There might be another possible answer: the data preparation done by default in nftool (ANN, MATLAB). I think there is a min-max normalization step in nftool that changes the scale of the data, so the different RMSE values might be due to the different scales of the data in PLSR and ANN, while the R2 can be the same for both.
Do you think this could be the reason?
By the way, I would appreciate it if you could send me the references. Always nice to learn more.
In that case, the normalization step you mentioned is probably the reason (or at least one of the reasons). The RMSE is scale dependent but the R2 is not: if you multiply all the data by 10 you still get the same R2, but the RMSE will be 10-fold larger.
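A minimal Python sketch of that "multiply by 10" point, with arbitrary made-up data:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 50)
y = 3.0 * x + rng.normal(scale=1.5, size=x.size)

def fit_stats(y):
    """Fit a straight line and return (R2, RMSE)."""
    slope, intercept = np.polyfit(x, y, 1)
    y_hat = slope * x + intercept
    r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    return r2, rmse

r2, rmse = fit_stats(y)
r2_scaled, rmse_scaled = fit_stats(10 * y)
print(r2, r2_scaled)      # identical R2: it is scale invariant
print(rmse, rmse_scaled)  # the second RMSE is exactly 10 times the first
```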
Going back to your first question: if you are reporting the R2 (not Q2 or R2prediction), then my answer is the same as before. Using the same data, different models may (and most certainly will) give different R2 and RMSE values, and the models with larger R2 will show smaller RMSE (assuming no scaling was done).
However, you should be careful about how you interpret the estimated values. A larger or smaller R2 or RMSE will not tell you whether your model is suitable; unless you are merely describing what you already know, you will most certainly end up wrongly choosing an overfitted model as the "best one". Model validation (using an external set of data not used in building the model, using cross-validation, or any other approach) will give you an idea of how good your model's predictive ability is, and this is one of the critical aspects in multivariate analysis. I'll send you some references later on, as I'm out of office until next week.
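A minimal Python sketch of the overfitting trap described above, assuming made-up data from a simple linear truth and comparing a degree-1 fit against a deliberately over-flexible degree-10 polynomial:

```python
import numpy as np

rng = np.random.default_rng(3)

def r2(y, y_hat):
    return 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Hypothetical training and external (test) sets drawn from the same linear truth
x_train = np.linspace(0, 10, 15)
x_test  = np.linspace(0.3, 9.7, 15)
y_train = 2.0 * x_train + rng.normal(scale=2.0, size=x_train.size)
y_test  = 2.0 * x_test  + rng.normal(scale=2.0, size=x_test.size)

for degree in (1, 10):
    coefs = np.polyfit(x_train, y_train, degree)
    print(degree,
          round(r2(y_train, np.polyval(coefs, x_train)), 3),  # training R2
          round(r2(y_test,  np.polyval(coefs, x_test)), 3))   # external-set R2 (~Q2)
# The flexible model typically wins on training R2 (it also fits the noise)
# but loses on the external set, which is exactly why validation matters.
```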
Yes, I think this is the source of the problem. As you mentioned, the R2 stays the same, so that is why the model with the higher R2 also has the higher RMSE: its RMSE is reported on a larger scale. (If reported on the same scale, the RMSE would be smaller for the model with the higher R2.)
I did use an external test set of 90 samples (the training set had 400 samples) for both models, to check how practical the models are.
Thanks for the information and the very nice discussion.