I am reading about regression analysis. Can anyone give me a brief idea of when we use simple linear regression, and what the difference is with the correlation coefficient? Thanks in advance.
The two are closely linked. The standardized regression slope coefficient (b1_standardized) in a simple (bivariate) linear regression model is equal to the Pearson (product moment) correlation (r). Also, the p value for the t statistic for testing the statistical significance of the regression slope coefficient b1 (null hypothesis H0: b1 = 0) is the same as the p value you would get for a significance test of the Pearson correlation (H0: r = 0) in, for example, SPSS. The coefficient of determination (R squared) in simple regression is equal to the squared Pearson correlation (r^2).
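A quick way to see these equivalences (a minimal sketch in Stata, using the bundled auto dataset; the choice of variables is arbitrary):
* Pearson r between price and weight; stored by -correlate- in r(rho)
sysuse auto, clear
corr price weight
local r = r(rho)
* The standardized slope ("Beta" column) equals r; e(r2) equals r^2
regress price weight, beta
display "r = " `r' "   r^2 = " `r'^2 "   R-squared from regression = " e(r2)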
What is "unique" about linear regression relative to Pearson correlation analysis? Linear regression can be used to make predictions about individual scores based on the regression equation Y = b0 + b1*X + error. You can also derive a standard error of the estimate from linear regression which informs you about the average or "typical" amount of prediction error. You could not do that based on a correlation coefficient alone.
In a sense, regression contains (and provides) more information than the correlation r, because regression works with both unstandardized and standardized variables, whereas the correlation coefficient is by definition a standardized measure of association.
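To make that link concrete, a standard identity connects the two: b1 = r * (s_Y / s_X), where s_X and s_Y are the sample standard deviations of X and Y. Standardizing both variables sets s_X = s_Y = 1, so the standardized slope collapses to r itself, while the unstandardized b1 keeps the original units of X and Y.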
As Christian Geiser noted, when you estimate a simple linear regression model with variables X and Y, the slope of the standardized regression equation is equivalent to the Pearson correlation between X and Y.
However, the equivalence of the point estimates does not mean that the 95% CI for the standardized slope is the correct CI for Pearson r; it is not. The method advocated in this video is wrong:
https://www.youtube.com/watch?v=-dSoWqDyT4E
The conventional method of computing a CI for Pearson r yields CIs that are asymmetrical (with the longer side towards 0) except when r = 0 exactly (see the attached image). The method advocated in the video, on the other hand, yields a symmetrical CI for any value of r, and it can easily produce a CI with one limit falling outside the range -1 to 1. Here is an example (using Stata):
* Load Stata's bundled auto dataset
sysuse auto, clear
* Generate z-scores for weight and length
egen zwt = std(weight)
egen zlen = std(length)
* Regress weight on length and show standardized slope
regress weight length, beta
* Regress zwt on zlen
regress zwt zlen
* The slope from this model = Pearson r.
* But the 95% CI for the slope is 0.870 to 1.022.
* Use NJC's corrci package to get the correct CI for r
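For reference, corrci (Cox, Stata Journal) implements the conventional Fisher z approach; `corrci weight length` should be the packaged equivalent once it is installed. A minimal sketch of the underlying calculation, continuing the same session and using only built-in functions:
* Fisher z transform: the CI is symmetric on the z scale, then
* back-transformed to the r scale (limits stay within -1 to 1)
corr weight length
local z  = atanh(r(rho))
local se = 1/sqrt(r(N) - 3)
local lo = tanh(`z' - invnormal(0.975)*`se')
local hi = tanh(`z' + invnormal(0.975)*`se')
display "95% CI for r: " `lo' " to " `hi'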
A regression coefficient has the distinct advantage that it is an interpretable measure of effect. For example, you can use maternal height to predict baby weight and report the coefficient as the increase in expected baby weight associated with a one-centimetre increase in maternal height.
If you reported only the correlation, there would be no such real-life interpretation.
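As a sketch of that kind of reporting (hypothetical variable names; assumes a dataset with maternal height in cm and baby weight in g):
* bwt_g and hgt_cm are hypothetical variable names
regress bwt_g hgt_cm
* _b[hgt_cm]: expected increase in baby weight (g) per 1 cm of maternal height
display "Expected grams per extra cm of maternal height: " _b[hgt_cm]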