Specifically, I want to know if I can use multiple linear regression to predict a response value 'y' from its associated "treatment" or "group" averages in several variables. For example: predicting a person's weight (y) from the average weight of people in the same country (x1), the average weight of people of the same age (x2) and the average weight of people of the same ethnicity (x3). This should be an ANOVA problem, assuming we have data on 3 or 4 countries, 3 or 4 age ranges and 3 or 4 ethnicities.

But, can I also propose a linear model as:

y = A + B1x1 + B2x2 + B3x3 + ...?

...where x1 to x3 are the averages previously mentioned.

Can this be simply solved using a linear regression algorithm?
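
To make the proposal concrete, here is a minimal sketch on simulated data (not real weights). It fits the model above by ordinary least squares and compares it with the same main-effects ANOVA written as a regression on dummy variables. The design (3 countries x 3 age ranges x 3 ethnicities, 5 people per cell), the effect sizes and the helper names `group_mean_column` and `dummy_columns` are all assumptions made up for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical balanced design: 3 countries x 3 age ranges x 3 ethnicities,
# 5 people per cell. All effect sizes below are invented for illustration.
levels, reps = 3, 5
cells = np.array(list(itertools.product(range(levels), repeat=3)))
factors = np.repeat(cells, reps, axis=0)  # one row per person, one column per factor
effects = [np.array([-3.0, 0.0, 3.0]),
           np.array([-5.0, 1.0, 4.0]),
           np.array([-2.0, 0.5, 1.5])]
y = 70.0 + sum(eff[factors[:, j]] for j, eff in enumerate(effects))
y = y + rng.normal(scale=4.0, size=len(y))


def group_mean_column(y, codes):
    """Replace each level code by the mean of y within that level."""
    means = np.array([y[codes == k].mean() for k in range(codes.max() + 1)])
    return means[codes]


def dummy_columns(codes):
    """Indicator columns for every level except the first (reference) level."""
    return (codes[:, None] == np.arange(1, codes.max() + 1)).astype(float)


ones = np.ones((len(y), 1))

# Proposed model: y = A + B1*x1 + B2*x2 + B3*x3, with x_j the group means.
X_means = np.hstack(
    [ones] + [group_mean_column(y, factors[:, j])[:, None] for j in range(3)]
)
coef_means, *_ = np.linalg.lstsq(X_means, y, rcond=None)
fit_means = X_means @ coef_means

# Main-effects ANOVA expressed as a regression on dummy variables.
X_dummy = np.hstack([ones] + [dummy_columns(factors[:, j]) for j in range(3)])
coef_dummy, *_ = np.linalg.lstsq(X_dummy, y, rcond=None)
fit_dummy = X_dummy @ coef_dummy

rss_means = float(np.sum((y - fit_means) ** 2))
rss_dummy = float(np.sum((y - fit_dummy) ** 2))
max_fit_gap = float(np.max(np.abs(fit_means - fit_dummy)))

print("coefficients on the means (A, B1, B2, B3):", coef_means)
print("columns: means model", X_means.shape[1], "vs dummy model", X_dummy.shape[1])
print("RSS: means model", rss_means, "vs dummy model", rss_dummy)
print("largest difference in fitted values:", max_fit_gap)
```

Each column of group means is a linear combination of that factor's dummy columns, so the means model can never fit better than the dummy-coded model. In a balanced design like this simulated one, the two sets of fitted values coincide and B1 to B3 come out equal to 1. With unbalanced data they need not agree. Note also that the means model appears to estimate 4 parameters while the dummy-coded model estimates 7, because the group means were themselves computed from y. That difference is one place where a bias in the usual standard errors and tests could come from.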

Are there any statistical biases from doing this?

I'm also looking for literature references on this. All over the internet, people claim that ANOVA and linear regression are pretty much the same thing. However, I would like to read an academic article where multiple linear regression has actually been used to solve a multi-way ANOVA, just to know if somebody has used it the same way I propose.
