R² is the proportion of the variance of the data that is "explained" by the model. There is no deeper interpretation. It roughly gives an impression of how closely the data points are located around the model curve, relative to the range spanned by the predictors. This is almost never a very useful measure. The only exception might be in analytical fields, where the quality of a calibration curve might be assessed by R² (R² must be very close to 1; otherwise the data are not suited). I would be happy to learn if there is some practical use for R² I am not aware of.
There are two things much, much more interesting:
(i) the estimated parameter values (e.g. in a simple linear regression: the slope of the regression line) and
(ii) the residual variance or a similar measure that tells us how close an observation can be expected to the model prediction.
(i) is often interesting because it tells us how strong the estimated effect of the predictor is. A confidence interval for the estimates can be interpreted as a range of parameter values that are compatible with the observed data. If only the direction of the effect is of interest, one may give a p-value instead, which would indicate whether the confidence interval is completely on one side of the "null" or whether the data are compatible with parameter values on either side of the "null".
(ii) is sometimes interesting, particularly when the model is used for predictions.
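As a toy sketch of the point above (made-up numbers, plain Python, not data from the paper), here is how (i) the slope, (ii) the residual standard deviation, and, for comparison, R² come out of a simple linear regression:

```python
import math

x = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0]        # e.g. soil pH (hypothetical)
y = [12.0, 15.0, 14.0, 18.0, 17.0, 21.0]  # e.g. species richness (hypothetical)

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

slope = sxy / sxx                    # (i) the estimated effect of the predictor
intercept = my - slope * mx
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(r ** 2 for r in resid)
resid_sd = math.sqrt(sse / (n - 2))  # (ii) typical distance of an observation from the line
sst = sum((yi - my) ** 2 for yi in y)
r_squared = 1 - sse / sst            # proportion of variance "explained"

print(slope, resid_sd, r_squared)
```

The slope and residual SD are in the units of the problem (richness per pH unit; richness), which is exactly why they carry more practical meaning than the unitless R².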
Looking back at the figure already reveals a problem with reporting R²:
In the lower left diagram, R² for Tem/Bor forest is 0.2, and for agriculture it is 0.19. These values seem similar. However, the slope for agriculture is steeper than that for Tem/Bor forest. Hence, changing the pH has a larger effect on species richness in agriculture. This is completely lost when looking only at the R² values. Of course, to make a point of this one would need to reject the hypothesis that the effects in both environments are equal. I don't know if this was the aim and/or done.
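A hedged illustration of this (invented numbers, not the figure's data): if one group's responses are simply three times the other's, the slope triples while R² is identical, so similar R² values say nothing about similar effects.

```python
def fit(x, y):
    """Simple OLS; returns (slope, R-squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    b0 = my - slope * mx
    sse = sum((yi - (b0 + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    return slope, 1 - sse / sst

x = [4, 5, 6, 7, 8]
y_a = [10, 13, 11, 15, 14]
y_b = [3 * v for v in y_a]  # same relative scatter, three times the effect

print(fit(x, y_a))  # modest slope
print(fit(x, y_b))  # triple the slope, identical R-squared
```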
Formal and informal guidelines for R-squared as a standardized effect size measure vary across fields. In some fields like mine (psychology), researchers are often quite happy when they find an R-squared around .20 (even a multiple R-squared derived from multiple predictor variables!). In other fields, a value of .20 may be seen as a small effect.
Cohen (1988) provided effect size guidelines for (bivariate) Pearson correlations r, according to which r = .1 would be a small effect, r around .3 a medium effect, and r around .5 or larger a large effect. According to Cohen's guidelines, an R-squared of .20 would constitute a medium to large effect (r = .447).
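The conversion used above is just the square root: in the bivariate case, R-squared equals r², so R² = .20 lands between Cohen's "medium" (.3) and "large" (.5) thresholds for r.

```python
import math

r_squared = 0.20
r = math.sqrt(r_squared)        # bivariate case: r is the square root of R-squared
print(round(r, 3))              # 0.447 -- between Cohen's medium (.3) and large (.5)
```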
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (pp. 77-83).
Experts and professors, thank you very much for your generous responses. The importance attached to particular statistics seems to vary by discipline and research direction. I have also recently come across research that goes beyond linear interpretation and challenges the idea that good fit (a higher R²) is evidence of causality, for example nonlinear time series analysis and empirical dynamic modeling, though I have not yet studied these in depth. Here are several articles for your reference and for later readers.
[1] Chang, C.-W., Miki, T., Ye, H., et al. Causal networks of phytoplankton diversity and biomass are modulated by environmental context. Nat Commun 13, 1140 (2022).
[2] Ye, L., Tan, L., Wu, X., Cai, Q., & Li, B. L. Nonlinear causal analysis reveals an effective water level regulation approach for phytoplankton blooms controlling in reservoirs. The Science of the Total Environment, 806 (2021).
[3] Review on Causality Detection Based on Empirical Dynamic Modeling