You're asking the wrong question. As with all statistical tests, the context is key: the type of data, your hypotheses, the number of observations, what the data look like when plotted, and so on. And an r or R² value without an accompanying p-value is a bit useless (just like a t value or F ratio without a p-value).
Thank you, Paul, for your answer. I guess you are right, but I thought there might be a lower limit below which the value is not acceptable. Anyway, the p-value was 0.000.
Generally it is better to consider adjusted R-squared rather than R-squared, depending on the nature of your data and model.
The major shortcoming of R² is that, used alone, it only quantifies dispersion. A model which systematically over- or under-predicts all the time can still yield an R² close to one, even though every prediction is off.
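If it helps, here is a minimal numpy sketch of that shortcoming (my own illustration, not from any attachment in this thread): a model with a constant bias keeps the squared predicted-observed correlation near one, while the 1 - SSE/SST form of R² is penalized.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 2.0 * x + rng.normal(0, 0.5, size=x.size)   # true relationship

y_hat = 2.0 * x + 5.0   # model with a constant +5 bias: every prediction is too high

# R² computed as the squared correlation of predicted vs observed: blind to the bias
r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2
print(f"squared correlation: {r2_corr:.3f}")     # ~0.99 despite the bias

# R² computed as 1 - SSE/SST does punish the bias
sse = np.sum((y - y_hat) ** 2)
sst = np.sum((y - y.mean()) ** 2)
print(f"1 - SSE/SST: {1 - sse / sst:.3f}")       # much lower (can even be negative)
```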
You can find R and R² value ranges for evaluating model performance in the attachment, but keep in mind that the presented table is an example from a hydrology context (evaluation of hydrological model performance).
Be aware that translating such ranges, data, or model characteristics from one discipline to another is an open-ended discussion. As a rule of thumb, R² values greater than 0.5 are typically considered acceptable.
Both R² (adjusted or not) and the p-value are "composite measures"; that is, both are, in effect, ratios of some signal or effect to some noise. Neither has a particular general meaning, since it is not clear from a (say, "good") value alone whether there is a strong signal, very little noise, or some state in between. Any scientifically relevant interpretation of a result can only be based on the signal or effect size. Precision of the measurement/determination is a good thing, but it does not move science forward when we interpret effects only relative to the experimentally achieved precision.
PS: For the p-value there is one "exception": it can have a meaning in a decision-theoretic sense, but only if information about the relevant effect size has been used to determine the sample size. Only then can the p-value be used to decide between two alternative hypotheses with specified error rates, which, in turn, must be based on expected costs and benefits (otherwise the whole procedure is rather nonsensical). But despite this "meaning" or usability of the p-value (if the experiment was appropriately designed), it gives no information about the "scientific quality" of the finding (it is not even about a finding at all; it is about a decision!)
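To illustrate the kind of design described here, a small sketch using statsmodels' power module (a Neyman-Pearson style design: fix the smallest relevant effect size and both error rates up front, then solve for the sample size). The numbers below are placeholder assumptions, not recommendations.

```python
# Fix the relevant effect size and both error rates, then solve for n.
from statsmodels.stats.power import TTestIndPower

effect_size = 0.5   # smallest effect considered relevant (Cohen's d), assumed
alpha = 0.05        # acceptable type I error rate
power = 0.80        # 1 - acceptable type II error rate

n_per_group = TTestIndPower().solve_power(
    effect_size=effect_size, alpha=alpha, power=power, alternative="two-sided"
)
print(f"required n per group: {n_per_group:.0f}")   # ~64
```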
The R-squared statistic (or preferably the adjusted R-squared, as Raoof said) is not an absolute value that can be interpreted across datasets. Its only sensible use is for comparing models for the same response variable in the same dataset. This is because of what Jochen said: the statistic is a ratio of signal to noise, so it can take an arbitrarily high value for a noisy model if the signal is big enough. I have seen people suggest values of R-squared that are "acceptable" in some application, but there is no firm grounding for these, and different disciplines can suggest wildly different values as acceptable.
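A quick simulation of this point (entirely my own illustration): the same correct model, fitted to data with different noise levels or predictor spreads, produces R² values anywhere from near 0 to near 1.

```python
import numpy as np

rng = np.random.default_rng(1)

def r_squared(noise_sd, x_spread, n=500):
    """Fit the correct line y = 1 + 2x by least squares and return R²."""
    x = rng.uniform(0, x_spread, n)
    y = 1.0 + 2.0 * x + rng.normal(0, noise_sd, n)
    beta = np.polyfit(x, y, 1)
    resid = y - np.polyval(beta, x)
    return 1 - resid.var() / y.var()

# Identical (correct!) model, different data: R² ranges from ~0 to ~1
for noise_sd, x_spread in [(0.1, 10), (5.0, 10), (5.0, 1)]:
    print(f"noise={noise_sd}, spread={x_spread}: R² = {r_squared(noise_sd, x_spread):.2f}")
```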
Yes, and the "acceptability" depends not (only) on the discipline or "common habits", but on the specific characteristics of your experiment, your data, and your aim. Judging models based on R² values alone (without considering the details of the experiment, the data, and the research aim) is as stupid as judging results based on p-values alone (like "p < 0.05, so the effect is relevant").
As far as I'm concerned, R² (raw or adjusted) merely serves as an indicator of how "good" one's model is. I think the discussion above is appropriate and addresses several key considerations. Ultimately, this is a judgement call. I have seen some researchers make a very solid argument for why an R² of 0.10 was utterly uninteresting (relative to a much larger effect), but I've seen others make just as good arguments for why the same effect size is very interesting. Part of this has to do with the existing knowledge within the field, though: explaining 10% after 80% has already been accounted for is seen by some as a much 'bigger deal' than explaining 10% when nothing has been accounted for.
I would just like to add that investors/administrators/clients would rather have an unstandardized quantification of the effect (e.g., $X of spending on Y will result in Z fewer rehospitalizations).
R-squared (or adjusted R-squared) is one metric that can be used alongside many other regression diagnostics to arrive at meaningful results. It is an over-used statistic whose usefulness is frequently misunderstood.
I find that different scholars have different opinions on what constitutes a good R-squared (R²) value:
1) Falk and Miller (1992) recommended that R² values be equal to or greater than 0.10 for the variance explained of a particular endogenous construct to be deemed adequate.
2) Cohen (1988) suggested that R² values for endogenous latent variables be assessed as follows: 0.26 (substantial), 0.13 (moderate), 0.02 (weak).
3) Chin (1998) recommended R² values for endogenous latent variables of 0.67 (substantial), 0.33 (moderate), and 0.19 (weak).
4) Hair et al. (2011) and Hair et al. (2013) suggested that, in scholarly research focused on marketing issues, R² values of 0.75, 0.50, and 0.25 for endogenous latent variables can, as a rough rule of thumb, be described as substantial, moderate, and weak, respectively.
Thanks Han & Raid. This means that, according to Cohen, a model with F(8, 565) = 2.03, p = .041, R² = 0.03 is a weak predictor of the association between variables, while according to the others it's not an acceptable model??
I'm afraid this is just a list of four authors who made up some values on the basis of the data they worked with. The fact remains that R-squared is not an absolute measure of goodness of fit.
When will we ever learn that a statistic by itself (a mean, a variance, a t, F, or p value, r or R², AIC, BIC, etc.) is not a measure of relevance and does not provide an interpretation or conclusion when seen without an intimate relation to the research, the aim, the methods, and so on? There is no value of a statistic "in outer space". It needs to be judged in the scientific and experimental context. That is not something the statistics and the numbers tell us; we must take it from our expertise. If there is no expertise, then there is no sensible way to judge the relevance of any statistic (nevertheless, having the data and the statistics increases our experience). Sticking to "common rules" means neglecting scientific expertise (or the need for it), and that is a very bad thing, in my eyes.
I usually give my regression classes one exam problem in which I ask them to comment on R sq. Half the class will state "very strong..." and half will say "very weak...". Then I explain to the students why R sq is not by itself a "measure of relevance", as Jochen said above. They laugh, and they understand the concept. No points are taken off on that problem, and no points are given either! It is not worth a point.
I think of R sq as one of many statistics I obtain for a regression model. It does not mean much by itself. It is also over-used and over-emphasized by some researchers. R sq can be quite deceptive and misleading in variable selection procedures when there is multicollinearity, and also for the no-intercept model. Some practitioners replace SSE with PRESS in the R sq formula to get a measure that targets prediction; other variations exist.
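For the PRESS variant just mentioned, here is a minimal numpy sketch (my own implementation, using the standard leave-one-out shortcut via the hat matrix; `predicted_r_squared` is a hypothetical helper name):

```python
import numpy as np

def predicted_r_squared(X, y):
    """Predicted R² = 1 - PRESS/SST, where PRESS uses leave-one-out residuals.
    X is the design matrix *without* an intercept column; one is added here."""
    Xd = np.column_stack([np.ones(len(y)), X])
    hat = Xd @ np.linalg.pinv(Xd)                        # hat matrix H = X (X'X)^-1 X'
    resid = y - hat @ y                                  # ordinary residuals
    press = np.sum((resid / (1 - np.diag(hat))) ** 2)    # leave-one-out shortcut
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - press / sst

# Quick check on noisy data: predicted R² comes out below the ordinary R²
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 3))
y = X[:, 0] + rng.normal(0, 2, 30)
print(f"predicted R²: {predicted_r_squared(X, y):.2f}")
```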
Just to point out: "r sq" and "R sq" are different things: r² is the squared Pearson correlation between two variables, while R² is the coefficient of determination of a fitted model (the two coincide only in simple linear regression). Back to the original question: if the value of r² is 0.25 or higher, it can be considered to fall within the large effect size class. Nevertheless, I would recommend examining the practical significance as well.
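To make the distinction concrete, a small sketch with simulated data (entirely my own): each pairwise r² with the response is one thing, while the model R² is the squared multiple correlation between the response and the fitted values, which no single pairwise r² captures.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))
y = X[:, 0] + X[:, 1] + rng.normal(0, 1, 200)

# pairwise r² of each predictor with y
for j in range(2):
    r = np.corrcoef(X[:, j], y)[0, 1]
    print(f"r²(x{j+1}, y) = {r**2:.2f}")          # ~0.33 each

# model R²: squared correlation between y and the fitted values
Xd = np.column_stack([np.ones(200), X])
fitted = Xd @ np.linalg.lstsq(Xd, y, rcond=None)[0]
R = np.corrcoef(fitted, y)[0, 1]
print(f"model R² = {R**2:.2f}")                   # ~0.67, not any single pairwise r²
```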
Depending on theoretical importance, I had to consider some atmospheric variables with R² ≥ 0.04 to construct a multiple linear regression model for quantitative prediction of pre-monsoon rainfall. Using ANOVA and a backward selection procedure, I obtained a significant model equation.
I think this varies according to the field of study and the subject. For example, in laboratory studies a very high r-sq value is required, whereas in studies of natural ecosystems, as in ecology, lower r-sq values will suffice.
If there is an ecology study we can cite that expresses this, please suggest it.
For a quick answer for social science, see line 3 of Table 1 of Ferguson (2009; the article also includes a discussion, which I suggest you read in addition to the previous commenters' remarks): http://psychology.okstate.edu/faculty/jgrice/psyc3214/Ferguson_EffectSizes_2009.pdf
It depends on your research, but an R² above 50% together with a low RMSE is generally acceptable to the scientific research community. Results with an R² of 25% to 30% can still be valid, because they represent your findings.
I invite everyone to go through the attached file, where I explain why the standard error of the regression (S) is a better choice than R-squared for comparing regression models.
Regarding errors, we should also consider residual plots to check whether there is a systematic error in the model.
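A short sketch combining the last two suggestions (simulated data and identifiers are mine): compute the standard error of the regression, S, which is in the units of the response, and inspect residuals against fitted values for systematic patterns.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
x = np.linspace(0, 5, 100)
y = 1 + 2 * x + 0.5 * x**2 + rng.normal(0, 1, 100)   # the truth is curved

# Fit a straight line (deliberately misspecified)
beta = np.polyfit(x, y, 1)
fitted = np.polyval(beta, x)
resid = y - fitted

# Standard error of the regression: same units as y, unlike R²
n, p = len(y), 2                      # p = number of estimated coefficients
S = np.sqrt(np.sum(resid**2) / (n - p))
print(f"S = {S:.2f} (in the units of y)")

# Residuals vs fitted values: the U-shape exposes the missing quadratic term
plt.scatter(fitted, resid)
plt.axhline(0, color="grey")
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.show()
```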
What is an acceptable r-squared value in scientometrics? The answer may be of interest to all members of the RG community. As you know, the RG team has not published its formula for Research Interest (RI) or Total Research Interest (TRI) on the RG site. Estimating the value of researchers' works is a subject of scientometrics.
Recently, I published an article, "The meaning of Research Interest in ResearchGate" (https://www.researchgate.net/publication/356160597_The_meaning_of_Research_Interest_in_ResearchGate). In this work I "presented the model, which allows making sense of Research Interest (RI). I aimed to get my own formulas, which could provide an estimated RI close to an actual RI on the RG site in most situations."
One of my formulas has an R-squared value of 0.66, and the other 0.98. I would appreciate your opinion.
You can have a good model for the means, but not so much for predicted values, simply because the sigma of epsilon is large. See https://data.library.virginia.edu/is-r-squared-useless/ where they note that R-squared is not a measure of fit. I think they mean that the residual sigma can be large while you still have a good model for the means, even with a low R-squared: you may be accounting well for whatever can be accounted for. (A small sketch of this appears after the links below.)
For the difference between confidence intervals for means and prediction intervals, see the following from Penn State:
4.11 - Prediction Interval for a New Response | STAT 462
https://online.stat.psu.edu/stat462/node/127/
4.10 - Confidence Interval for the Mean Response | STAT 462
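As the promised sketch of this point (simulated data and names are mine; assuming statsmodels is available): with a large residual sigma the mean function is still estimated well, the confidence interval for the mean is narrow, yet the prediction interval for a new observation stays wide and R² stays low.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 500)
y = 1 + 2 * x + rng.normal(0, 8, 500)    # correct linear mean, large sigma

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()
print(f"R² = {res.rsquared:.2f}")                 # low, despite a correct mean model
print(res.params)                                 # close to the true (1, 2)

# Interval for the *mean response* vs interval for a *new observation* at x = 5
pred = res.get_prediction(np.array([[1.0, 5.0]])).summary_frame(alpha=0.05)
print(pred[["mean_ci_lower", "mean_ci_upper"]])   # narrow: the mean is well estimated
print(pred[["obs_ci_lower", "obs_ci_upper"]])     # wide: sigma for epsilon is large
```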
For example, in some fields (such as laboratory calibration work), the R-squared may need to be above 0.95 for a regression model to be considered reliable. In other domains, an R-squared of just 0.3 may be sufficient if there is extreme variability in the data.