I think not really. Heteroscedasticity would be when the residuals show an increasingly broad variance along the x-axis, or an increasingly narrow variance along the x-axis.
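To make the "broadening" pattern concrete, here is a minimal sketch with simulated data (not the data from the question). The model, the sample size, and the error standard deviation growing as 0.5 * x are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(1.0, 10.0, size=n)
# Assumed for illustration: the error standard deviation grows with x.
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

# Ordinary least squares fit of y on x with an intercept.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Compare the residual spread in the lower and upper halves of x.
low = resid[x < np.median(x)]
high = resid[x >= np.median(x)]
sd_low, sd_high = low.std(ddof=1), high.std(ddof=1)
print(f"residual SD, low x: {sd_low:.2f}; high x: {sd_high:.2f}")
```

If the residuals were homoscedastic, the two standard deviations would be about the same; here the upper half is clearly wider, which is the fan shape described above.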
Agreed, Manfred. I cannot really make out the labeling on the graphs, Ann. If you could attach the scatterplots so they could be seen better, that would be helpful.
Heteroscedasticity, like other characteristics of the data, can be studied well graphically. (See attached for more extended graphics.)
Conference Paper Alternative to the Iterated Reweighted Least Squares Method ...
Heteroscedasticity is often modeled with the use of a "coefficient of heteroscedasticity," denoted 'gamma.' It is part of an exponent applied to a size measure, x. In the multiple linear regression you mentioned, the size measure could be the dominant regressor or, I think better, a linear combination of regressors. Best, I think, is a preliminary estimate of y, i.e., a predicted y, but not y itself. That size measure raised to the negative 2 gamma (or negative 1 gamma, depending upon format) is the regression weight. For example, I know that in SAS PROC REG, if you set that weight as w=1/x, you get the classical ratio estimator for one regressor and a zero intercept.
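The ratio-estimator remark can be checked numerically. This is a minimal sketch in Python with NumPy rather than SAS, on simulated data; it assumes the "negative 2 gamma" format, so gamma = 0.5 gives the weight w = 1/x. The data-generating values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(5.0, 50.0, size=200)          # size measure, strictly positive
y = 1.8 * x + rng.normal(0.0, np.sqrt(x))     # assumed: error variance proportional to x

gamma = 0.5                                   # coefficient of heteroscedasticity (assumed)
w = x ** (-2.0 * gamma)                       # regression weight, equal to 1/x here

# Weighted least squares through the origin minimises sum(w * (y - b*x)**2),
# which gives b = sum(w*x*y) / sum(w*x*x).
b_wls = np.sum(w * x * y) / np.sum(w * x * x)

# Classical ratio estimator.
b_ratio = y.sum() / x.sum()
print(b_wls, b_ratio)
```

With w = 1/x the x in the numerator cancels and one power of x cancels in the denominator, so the weighted slope reduces to sum(y)/sum(x) and the two numbers agree up to rounding.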
A good reference:
Särndal, C.-E., Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling, Springer-Verlag.
So the question is, for SPSS, don't you have to set up a size measure for your multiple regression as well, or is it automated somehow? What did you do?
jim
PS - I suppose you could have several error terms (estimated residuals), each with their own factor for accounting for heteroscedasticity with regard to each regressor.
I haven't learned anything about gammas at uni, so all I know is how to do this in SPSS, which is automated. You simply plot the residuals and it gives you a chart. Normally the spread would indeed have to get wider or narrower for heteroscedasticity. What I did was take the standardized and studentized residuals and plot them against each other (X and Y axis).
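For readers unsure how those two kinds of residuals relate, here is a minimal sketch using the common textbook definitions: standardized = e / s and (internally) studentized = e / (s * sqrt(1 - h)), with h the leverage. That SPSS uses exactly these definitions is an assumption here, and the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta                                  # raw residuals
p = X.shape[1]                                    # number of estimated coefficients
s = np.sqrt(e @ e / (n - p))                      # residual standard error
h = np.einsum("ij,ji->i", X, np.linalg.pinv(X))   # leverages: diagonal of the hat matrix

standardized = e / s
studentized = e / (s * np.sqrt(1.0 - h))
corr = np.corrcoef(standardized, studentized)[0, 1]
print(f"correlation between the two: {corr:.4f}")
```

Under these definitions each studentized residual is the standardized one divided by sqrt(1 - h), so a plot of one against the other mostly reflects leverage. Spread that widens or narrows is usually easier to see when residuals are plotted against predicted values or a regressor.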
I like to look at estimated residuals without first doing anything to them. Anyway, it occurs to me that SPSS may ignore the natural heteroscedasticity in data and that is why many transform. (A good book of interest is Carroll and Ruppert's Transformation and Weighting in Regression (1988).)
I think it best to attach a more legible set of graphs for all to view.
It looks similar to plots I have seen where bands are produced as a result of including categorical variables in the model. There are 13 bands, so there are 13 categories (maybe 12 treatments and a control).
My uncertainty in this answer is focused on the observation that there is no variability in each of the 13 bands in the figure. There are not that many things that I know of that always result in a perfect model fit.
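Here is a minimal sketch of how categorical predictors produce bands, using simulated data with 13 hypothetical categories. With only categorical predictors, the fitted values can take just one value per category, but the residuals within each band still vary.

```python
import numpy as np

rng = np.random.default_rng(3)
k, per_group = 13, 30                        # 13 categories, matching the band count above
group = np.repeat(np.arange(k), per_group)
effects = rng.normal(0.0, 3.0, size=k)       # hypothetical category effects
y = effects[group] + rng.normal(0.0, 1.0, size=group.size)

# Dummy coding: one indicator column per category, no separate intercept.
X = (group[:, None] == np.arange(k)).astype(float)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
resid = y - fitted

# Each category contributes one distinct fitted value, i.e. one band.
n_bands = np.unique(np.round(fitted, 8)).size
# Residual spread inside each band.
within_sd = np.array([resid[group == g].std(ddof=1) for g in range(k)])
print(n_bands, within_sd.round(2))
```

In a residuals-versus-fitted plot of this simulation there are 13 vertical bands, each with visible scatter. Bands with no scatter at all, as in the figure under discussion, are not what this mechanism alone produces.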
Check for errors. Start from the raw data and redo the analysis from the beginning. I once was exploring my data and started using ratios of different variables. I was excited to find a model with an R2 of 1. I was less excited when I discovered that after simplifying the equations I had proven that 1=1.
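The "1=1" trap is easy to reproduce. This is a hypothetical sketch (not the original analysis): a ratio of two variables is modeled with the same two variables on the log scale, so the fit is an identity rather than a finding.

```python
import numpy as np

rng = np.random.default_rng(4)
a = rng.uniform(1.0, 10.0, size=50)
b = rng.uniform(1.0, 10.0, size=50)
ratio = a / b                                   # derived variable

# log(a/b) = log(a) - log(b) holds by definition, so this "model" is an identity.
X = np.column_stack([np.ones(a.size), np.log(a), np.log(b)])
log_ratio = np.log(ratio)
beta, *_ = np.linalg.lstsq(X, log_ratio, rcond=None)
resid = log_ratio - X @ beta
r2 = 1.0 - resid @ resid / np.sum((log_ratio - log_ratio.mean()) ** 2)
print(beta.round(6), r2)
```

The coefficients come out as intercept 0, slope 1 for log(a), and slope -1 for log(b), with R2 equal to 1 up to rounding, even though a and b are unrelated random numbers.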
I would say that this is an example of heteroscedasticity because the residuals change systematically with a change in "treatment." Manfred gives one example of heteroscedasticity, but it is a term for a more general condition where error variance changes for select groups of individuals.
Probably the only way is to start over. It always feels like a stupid, busywork sort of task. Start with your raw data. Go through the SPSS user manual. Reconstruct your model. In theory, you should get the same answer the second time round. However, as an error-checking activity, you can't assume that it will. Every time you act on the impulse that "I know how to do this," you compromise the error-checking part.
The sort of thing I am thinking of here is something like what happened last month. A person asked a regression question because they had an unusual result. They had a simple model: one dependent variable and one independent variable. However, their estimated slope and intercept differed greatly from the published values. It took a while to convince them that they had switched the independent and dependent variables. Once they got that straightened out, everything was fine. Your problem, if it is caused by an error, will not be that easy.
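A minimal sketch of why switching the two variables changes the estimates, on simulated data with made-up coefficients. The two fitted slopes are not reciprocals of each other; their product equals the squared correlation.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(10.0, 2.0, size=500)
y = 4.0 + 0.5 * x + rng.normal(0.0, 2.0, size=500)   # hypothetical true line

slope_yx, intercept_yx = np.polyfit(x, y, 1)   # y regressed on x (intended model)
slope_xy, intercept_xy = np.polyfit(y, x, 1)   # x regressed on y (variables switched)
r = np.corrcoef(x, y)[0, 1]

print(f"y on x: slope {slope_yx:.3f}, intercept {intercept_yx:.3f}")
print(f"x on y: slope {slope_xy:.3f}, intercept {intercept_xy:.3f}")
print(f"product of slopes {slope_yx * slope_xy:.3f}, r squared {r**2:.3f}")
```

Unless the correlation is close to 1, inverting the switched fit gives a line noticeably steeper than the intended one, which is the kind of mismatch with a published value described above.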
Here is another (maybe simpler) option. Can you recode your categorical variables as continuous variables? If you do that, does the pattern in the residuals go away or change?
I am sure that you have already noted the distinct bands in Zeng's figure. I would bet that there are categorical variables in that analysis. There are probably 7 categories. Just because there is a banding pattern does not mean there are problems, but it is important to know why the patterns exist. This sort of banding is more what I expect to see. The bands are broad, indicating that there is some variability. In your case there is no apparent variability, and I am bothered by that.
Another option is to identify the data that results in all the residuals from one band. Can you identify the conditions that resulted in that band?
If you change the scale at which the graph is plotted, do you see variability within a line, or would a regression of all the data for one line have an R2 of 1.0? It is the perception that R2 is 1.0 for each line that really bothers me.
So there are lots of categorical variables. There might be a problem there. Are some of them highly correlated? Can you reduce the complexity of your model, and does that make the pattern go away? If there were two or three highly correlated independent variables ..... Could try stepwise regression, or a multivariate method.
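One quick way to look for highly correlated predictors is the correlation matrix together with variance inflation factors (VIF), which are the diagonal of the inverse correlation matrix. This is a sketch on simulated continuous predictors; the three variables are hypothetical, and categorical variables would first need to be dummy coded.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(0.0, 0.1, size=n)   # nearly a copy of x1 (assumed for illustration)
x3 = rng.normal(size=n)                  # unrelated predictor

X = np.column_stack([x1, x2, x3])
corr = np.corrcoef(X, rowvar=False)      # pairwise correlations between predictors
vif = np.diag(np.linalg.inv(corr))       # VIF_j = 1 / (1 - R_j^2)

print(corr.round(3))
print(vif.round(1))
```

Here x1 and x2 show very large VIFs while x3 stays near 1. Dropping one of a highly correlated pair and refitting is one way to see whether the residual pattern changes.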
The bands you see in the residual plot are due to the categorical nature of your dependent variable. If your DV is binary (the logistic regression case), an ordinary regression would give you a plot with two bands. There is nothing you can do, because it is in the nature of your DV.
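The two-band case can be sketched as follows, with a simulated binary DV and one hypothetical predictor fitted by ordinary least squares. Because residual = y - fitted and y is either 0 or 1, every point lies on one of two lines: residual = -fitted or residual = 1 - fitted.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
x = rng.normal(size=n)
prob = 1.0 / (1.0 + np.exp(-1.5 * x))             # assumed logistic relationship
y = (rng.uniform(size=n) < prob).astype(float)    # binary DV coded 0/1

# Ordinary (linear) regression applied to the binary DV.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
resid = y - fitted

# resid + fitted recovers y, so only two values are possible: 0 and 1.
bands = np.unique(np.round(resid + fitted, 10))
print(bands)
```

A DV with more categories, coded as numbers, gives one such line per distinct value in the same way.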