11 November 2014

This is a question for philosophical discussion. I saw the interesting example at the link below (an excerpt from East Tennessee State University, based on Freedman et al.). It is meant to be a simple example and has only 11 data points, so we can only speculate about what more data points, or other regressors if they were available, would show. Still, it appears that a number of phenomena could be at work: (1) leverage, (2) stratification, (3) heteroscedasticity, (4) multiple regression, (5) nonlinear regression. With OLS, a confidence band around the fitted line would be widest at each end of the x range. However, the line looks as though it might go through the origin, in which case the confidence bounds on the slope would appear as two straight lines forming a V-shaped wedge. But some people die of lung cancer without ever smoking, so one would expect a positive intercept. Regardless, larger x would generally mean larger residuals for y.
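To make the two band shapes concrete, here is a minimal sketch using textbook standard-error formulas for the fitted mean under OLS with an intercept and under regression through the origin. The data are synthetic, invented purely for illustration (11 points with noise that grows with x); they are not the values from the linked document, and the function names are my own.

```python
import numpy as np

def se_mean_with_intercept(x, y, x0):
    """Standard error of the fitted mean at x0 for OLS with intercept, y = a + b*x."""
    x, y, x0 = (np.asarray(v, dtype=float) for v in (x, y, x0))
    n = x.size
    xbar = x.mean()
    sxx = np.sum((x - xbar) ** 2)
    b = np.sum((x - xbar) * (y - y.mean())) / sxx      # slope estimate
    a = y.mean() - b * xbar                            # intercept estimate
    s = np.sqrt(np.sum((y - a - b * x) ** 2) / (n - 2))  # residual std. error
    # Smallest at xbar, growing toward both ends of the x range.
    return s * np.sqrt(1.0 / n + (x0 - xbar) ** 2 / sxx)

def se_mean_through_origin(x, y, x0):
    """Standard error of the fitted mean at x0 for regression through the origin, y = b*x."""
    x, y, x0 = (np.asarray(v, dtype=float) for v in (x, y, x0))
    sum_x2 = np.sum(x ** 2)
    b = np.sum(x * y) / sum_x2                         # slope estimate
    s = np.sqrt(np.sum((y - b * x) ** 2) / (x.size - 1))
    # Proportional to |x0|: zero at the origin, two straight lines (a wedge).
    return s * np.abs(x0) / np.sqrt(sum_x2)

# Synthetic, illustrative data only (assumption: not the Freedman et al. values).
rng = np.random.default_rng(0)
x = np.arange(1.0, 12.0)                     # 11 points, x = 1..11
y = 2.0 * x + rng.normal(scale=0.3 * x)      # noise spread grows with x

grid = np.array([0.0, 1.0, 6.0, 11.0])
print("with intercept:", se_mean_with_intercept(x, y, grid))
print("through origin:", se_mean_through_origin(x, y, grid))
```

Multiplying either standard error by the appropriate t critical value (n - 2 or n - 1 degrees of freedom) gives the half-width of the band. Both formulas assume constant residual variance, so with heteroscedasticity of the kind suggested above they are only a rough guide.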

What are your thoughts on this graph? What would you tell people who might ask you to evaluate what may be (or have been) happening?

http://math.etsu.edu/1530/Outliers.doc
