Correlations are a very poor measure of effect size, because they have no real life interpretation for the average reader. If I tell you that the correlation between the height of a mother and the weight of her baby is 0·36, can you interpret this?
If, on the other hand, I tell you that every one centimetre increase in maternal height predicts an extra 31 grammes of baby weight, this is easy to interpret. It amounts to a difference of about a kilo between the shortest and tallest mothers you would expect (given that female height varies over a range of about 30 cm).
The correlation problem is compounded by the lack of equivalence between correlation coefficients. Spearman's rho and Kendall's tau-b are different to r and to each other when run on the same data. rho is 0·37 and tau is 0·25 in this example.
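To see this lack of equivalence concretely, here is a small sketch comparing the three coefficients on the same simulated data (the data are illustrative only, not the maternal-height example above, and it assumes scipy and numpy are available):

```python
# Sketch: the three common correlation coefficients can differ
# noticeably when computed on the same data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.4 * x + rng.normal(size=200)      # a moderate linear relationship

r, _ = stats.pearsonr(x, y)             # Pearson's r
rho, _ = stats.spearmanr(x, y)          # Spearman's rho (ranks)
tau, _ = stats.kendalltau(x, y)         # Kendall's tau-b (concordant pairs)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```

Kendall's tau is typically noticeably smaller in magnitude than r or rho for the same data, which is why the three should never be compared as if they were on one scale.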
Finally, as a measure of effect size, correlations do not distinguish direction of causation.
My advice is to measure effect size using a directional measure, using the original scale units.
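A directional effect size in original units is just the slope of a simple regression. A minimal sketch, using simulated maternal-height data (the 31 g/cm figure is the example value from the text, not a real estimate):

```python
# Sketch: report effect size as a slope in original units
# (grammes of birth weight per cm of maternal height).
# Data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(1)
height_cm = rng.normal(165, 7, size=300)                       # maternal height
weight_g = 3300 + 31 * (height_cm - 165) + rng.normal(0, 450, size=300)

slope, intercept = np.polyfit(height_cm, weight_g, 1)          # least-squares fit
print(f"each extra cm of maternal height predicts ~{slope:.0f} g of baby weight")
```

Unlike a bare correlation, the slope answers the practical question directly: how much does the outcome change per unit of the predictor?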
If the correlation coefficient is an estimate (based on data), then this estimate is associated with some uncertainty. This uncertainty should be considered in an interpretation.
Consider an experiment to estimate r. The point estimate is the "best guess" based on the data used for the estimation. Say this best guess is r = 0.3. To interpret this, it makes a difference whether r = -0.1 and r = 0.6 are almost equally good guesses, given the data, or whether r = 0.29 and r = 0.31 are already drastically worse guesses.
A p-value encodes this uncertainty in some sense, but it is itself hard to interpret. A better way to express the uncertainty is to give an interval estimate, such as a confidence interval.
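A standard way to get such an interval is the Fisher z-transformation, a large-sample approximation. A minimal sketch (the function name is mine, not from any library):

```python
# Sketch: an approximate 95% confidence interval for a correlation
# coefficient via the Fisher z-transformation.
import math

def r_confidence_interval(r, n, z_crit=1.96):
    """95% CI for a sample correlation r from a sample of size n."""
    z = math.atanh(r)                      # Fisher transform: z is ~normal
    se = 1 / math.sqrt(n - 3)              # its approximate standard error
    lo_z, hi_z = z - z_crit * se, z + z_crit * se
    return math.tanh(lo_z), math.tanh(hi_z)  # back-transform to the r scale

lo, hi = r_confidence_interval(0.3, 50)
print(f"r = 0.3, n = 50: 95% CI ({lo:.2f}, {hi:.2f})")
```

Note how wide the interval is at n = 50: the point estimate 0.3 is compatible with anything from a near-zero to a fairly strong correlation.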
Simply, the smaller the sample size, the more the correlation that you observe will vary from study to study.
You can either work out the confidence interval as above, or the p value.
To say it simply: the correlation coefficient may be an effect size, but what you observe will vary from study to study. You should also estimate the range within which future observations of the correlation are likely to fall.
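The study-to-study variability is easy to demonstrate by simulation. A sketch (simulated data with a true correlation of about 0.3; the helper function is my own, for illustration):

```python
# Sketch: the observed correlation varies more from study to
# study when the sample is small.
import numpy as np

rng = np.random.default_rng(2)
true_r = 0.3

def spread_of_observed_r(n, reps=2000):
    """SD of the sample correlation across simulated studies of size n."""
    rs = []
    for _ in range(reps):
        x = rng.normal(size=n)
        y = true_r * x + np.sqrt(1 - true_r**2) * rng.normal(size=n)
        rs.append(np.corrcoef(x, y)[0, 1])
    return np.std(rs)

for n in (20, 80, 320):
    print(f"n = {n:3d}: SD of observed r ~ {spread_of_observed_r(n):.3f}")
```

The spread shrinks roughly as 1/sqrt(n): quadrupling the sample size halves the study-to-study variability of the observed correlation.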
The squared correlation coefficient is also called the coefficient of determination (r^2). It ranges from 0 to 1 and gives the proportion of variation in one variable that can be predicted from its relationship with the other. For example, if the correlation coefficient is r = 0.80, the coefficient of determination is 0.64, which means that 64% of the variation can be predicted from the relationship and 36% of the variation cannot be explained.
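The arithmetic behind that example, as a tiny worked snippet:

```python
# The coefficient of determination is simply r squared.
r = 0.80
r_squared = r ** 2
print(f"r = {r}: {r_squared:.0%} of the variance is shared, "
      f"{1 - r_squared:.0%} is not")
```

Note that r^2 falls off quickly: a "moderate" r of 0.3 corresponds to only 9% of shared variance.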
It is important to test a correlation for significance. The null hypothesis is that there is no relationship between the variables; statistical significance of the correlation is conventionally indicated by a p-value of less than 0.05. This means that the probability of obtaining such a correlation coefficient by chance alone, if the null hypothesis were true, is less than five in 100, so the result indicates the presence of a relationship.
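In practice the test comes for free: scipy's `pearsonr` returns the p-value alongside the coefficient. A sketch on simulated data (the data and threshold are illustrative, and it assumes scipy is available):

```python
# Sketch: testing a correlation against the null hypothesis of
# no linear relationship.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(size=200)
y = 0.4 * x + rng.normal(size=200)

r, p = stats.pearsonr(x, y)             # coefficient and two-sided p-value
print(f"r = {r:.2f}, p = {p:.4f}")
if p < 0.05:
    print("statistically significant at the 5% level")
```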
A correlation coefficient calculated for a single sample tells us about that sample only, and the result may have arisen by chance. To say anything about the population, we must turn to hypothesis testing and p-values.
Thanks for the answers. My dependent variable is parametric in nature and my independent variable is non-parametric in nature. Should I use Pearson's or Spearman's for calculating the correlation coefficient?
Mukunda, variables are not parametric or non-parametric, tests are. (And even there, the terminology is extremely suspect. See the link below, for example.) It would be more helpful if you told us what the variables are.
And bearing in mind Ronán's suggestion, is one of them conceptually an outcome variable, with the other being an explanatory variable?
"one of my variables is showing normal distribution and the other showing, non-normal distribution, as per Shapiro-Wilk test."
That means that for one variable the test had enough data to give a small p-value (so small that you might call it "significant"), and for the other variable there was not enough data to get a similarly small p-value.
The test demonstrates neither that the data are normally distributed nor that they are not. It can only show that the data seem unlikely under the hypothesis that they were sampled from a normal distribution. Whether the test comes up with a strikingly small p-value is largely a matter of sample size. Being unable to reject the tested hypothesis does not mean that the hypothesis makes sense. Worse still: if you reject the hypothesis because the p-value is small, you do not know what the reason for the low p-value is, or whether (or how) it is relevant to your problem.
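The sample-size dependence is easy to demonstrate: draw the same mildly non-normal distribution at two sample sizes and watch the Shapiro-Wilk p-value change. A sketch (simulated data; assumes scipy is available):

```python
# Sketch: the Shapiro-Wilk p-value depends heavily on sample size.
# The same mildly heavy-tailed (non-normal) distribution tends to
# "pass" at small n and be rejected at large n.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
p_values = {}
for n in (20, 2000):
    data = rng.standard_t(df=5, size=n)   # heavy-tailed, i.e. NOT normal
    _, p = stats.shapiro(data)
    p_values[n] = p
    print(f"n = {n:4d}: Shapiro-Wilk p = {p:.4f}")
```

Nothing about the distribution changed between the two runs; only the test's power to detect the (always-present) non-normality did.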
So what is your aim?
Do you want to show the "strength" of a linear relationship or that of a monotone relationship? And what would be the practical use of that? Are such correlation coefficients really helpful? Will others really understand what they mean and use their values in further research? Or can you think of other benchmark values that might be more useful?
The "strength" of a linear relationship is often measured as (Pearson's) r. So you may provide the estimate of r, ideally together with a confidence interval. I personally have a problem with the practical interpretation of r. Beyond the fact that an absolute value closer to 1 indicates a "stronger" and a value closer to zero a "weaker" linear relationship, I would not know what a particular value tells me. If you told me that r was 0.3, say with a confidence interval from 0.1 to 0.6, I simply could not judge whether this is good or not, whether the linearity of the relationship is of any use, or under what circumstances and conditions the linearity would be useful. But if you gave me a range of a predictor variable and the average and/or extreme residual (the difference between the observed response and the response predicted using a linear relationship), I might get an idea of where and whether the linear relationship is useful (given that I knew the purpose or the application).
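The residual-based summary suggested above can be sketched as follows (simulated data, purely for illustration):

```python
# Sketch: judging a linear fit by its residuals in original units,
# rather than by r alone.
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + rng.normal(0, 3, size=200)

slope, intercept = np.polyfit(x, y, 1)          # least-squares line
residuals = y - (slope * x + intercept)         # observed minus predicted
print(f"predictor range: {x.min():.1f} to {x.max():.1f}")
print(f"average |residual| = {np.mean(np.abs(residuals)):.2f}, "
      f"largest |residual| = {np.max(np.abs(residuals)):.2f}")
```

A reader who knows the application can immediately judge whether typical prediction errors of that size, over that predictor range, are acceptable; the same judgement is nearly impossible from r alone.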