I have not done that kind of regression before. However, it sounds like it would fit the general format y = y* + e, where y* is predicted y. I have a way to find regression weights for that case. First, use a preliminary predicted y, say the OLS predicted y values, in place of the weighted least squares (WLS) predicted y values as the size variable; the two will be close enough for the purpose at hand. Then use that size variable to estimate the regression weights, and run a WLS regression with them. It isn't as complicated as that might sound: you just have to do an additional model run with the weights, as long as your software can do that. To know what to do, see https://www.researchgate.net/publication/333642828_Estimating_the_Coefficient_of_Heteroscedasticity. You can use https://www.researchgate.net/publication/333659087_Tool_for_estimating_coefficient_of_heteroscedasticityxlsx to do this with your data.
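For anyone who prefers code to a spreadsheet, here is a minimal Python sketch of the two-step idea, assuming a single numeric predictor and strictly positive OLS predictions. The function names (wls_line, two_step_wls) are made up for illustration. It also assumes the residual standard deviation is proportional to (predicted y)^gamma, so the regression weights are (predicted y)^(-2*gamma). When gamma is not supplied, the sketch uses a crude stand-in estimate (the slope of log|residual| on log(predicted y)), which is not necessarily the method in the linked paper or spreadsheet.

```python
import math

def wls_line(x, y, w=None):
    """Fit y = a + b*x by (weighted) least squares and return (a, b)."""
    if w is None:
        w = [1.0] * len(x)          # equal weights give ordinary least squares
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b

def two_step_wls(x, y, gamma=None):
    """OLS first, then WLS with weights yhat**(-2*gamma); returns (a, b, gamma)."""
    # Step 1: preliminary OLS fit; its predictions act as the size measure.
    a0, b0 = wls_line(x, y)
    yhat = [a0 + b0 * xi for xi in x]
    if any(p <= 0 for p in yhat):
        raise ValueError("this sketch assumes a strictly positive size measure")
    # Step 2 (assumption): if gamma is not given, estimate it as the slope of
    # log|residual| on log(yhat). This is a rough stand-in, not the linked tool.
    if gamma is None:
        pairs = [(math.log(p), math.log(abs(yi - p)))
                 for p, yi in zip(yhat, y) if yi != p]
        _, gamma = wls_line([lp for lp, _ in pairs], [lr for _, lr in pairs])
    # Step 3: the additional model run, now with the regression weights.
    w = [p ** (-2.0 * gamma) for p in yhat]
    a1, b1 = wls_line(x, y, w)
    return a1, b1, gamma

if __name__ == "__main__":
    # Hypothetical data whose scatter grows with the size of y.
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    noise = [0.2, -0.3, 0.5, -0.4, 0.9, -1.0, 1.1, -1.6]
    y = [2 + 3 * xi + e for xi, e in zip(x, noise)]
    print(two_step_wls(x, y))
```

If you have already settled on a coefficient of heteroscedasticity by other means, pass it in as gamma and only the weighted run is done. Setting gamma=0 reproduces the OLS fit.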
..............................
Below is something I have sent to people on this topic:
If you might be interested, the following is on the fundamental nature and magnitude of heteroscedasticity, for regressions of form y = y* + e, most useful in predictions for finite populations:
["Heteroscedasticity for estimated residuals in regression is not a bug, it's a feature."]
..............................
As I noted, I have not worked with categorical data, but I can see no reason why this would not work here. The paper "Essential Heteroscedasticity," which explains the reasoning, is easiest to understand when thinking about continuous data, but I don't see why this should not function for your application.
Count data can be looked at differently, though, and I'm not really familiar with that either.
I just know that I've worked with continuous data, but that my spreadsheet above would seem to work for any case where y = y* + e. If that fits your situation, you might want to use this. The paper "Estimating the Coefficient of Heteroscedasticity" gives some examples.
One way is simply adding terms to allow the variances to differ for the groups.
Chapter 7 of Wilcox ( https://www.sciencedirect.com/book/9780123869838/introduction-to-robust-estimation-and-hypothesis-testing ) includes some functions that use a different approach for higher-order ANOVA (which you can frame as linear regressions; with categorical variables these are all linear), and he has some articles that also address this. First, though, can you expand on why you feel the model should include heteroskedasticity? Often transformations can be useful to address this.
Hello Rubal Mistry. I was going to suggest something more straightforward: estimate an OLS model that treats all of the explanatory variables as factor variables, and include an option to get a robust estimate of the variance-covariance matrix. How to go about that depends on what software you have. In Stata, for example, it would be the -regress- command with i. prefixes on all of the explanatory variables (to treat them as factor variables), and with the vce() option to specify which of the robust covariance options you want (robust, hc2, hc3, etc.).
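To show what a robust option such as hc3 is doing, here is a small Python sketch for the simplest case only: a single factor fitted as a cell-means model, where the OLS coefficients are just the group means. This is an illustration under that simplifying assumption, not a replacement for the Stata command; the function name cell_means_hc3 and the data are hypothetical. In this special case every observation in a group of size n has leverage 1/n, so the HC3 sandwich reduces to a per-group formula.

```python
import math
from collections import defaultdict

def cell_means_hc3(groups, y):
    """Return {group: (mean, HC3 standard error)} for a one-factor cell-means model."""
    by_group = defaultdict(list)
    for g, yi in zip(groups, y):
        by_group[g].append(yi)
    result = {}
    for g, vals in by_group.items():
        n = len(vals)
        if n < 2:
            raise ValueError("HC3 needs at least two observations per group")
        mean = sum(vals) / n            # OLS coefficient for this group
        h = 1.0 / n                     # leverage of each observation in the group
        # HC3 inflates each squared residual by 1 / (1 - h)^2.
        meat = sum(((v - mean) / (1.0 - h)) ** 2 for v in vals)
        result[g] = (mean, math.sqrt(meat) / n)
    return result

if __name__ == "__main__":
    # Hypothetical data: two groups with visibly different spread.
    groups = ["a", "a", "b", "b", "b"]
    y = [1.0, 3.0, 10.0, 10.0, 13.0]
    for g, (m, se) in cell_means_hc3(groups, y).items():
        print(g, m, se)
```

Each group's standard error is built only from that group's own residuals, which is why unequal variances across groups do not distort it. With several factors the same sandwich idea applies, but you need the full design matrix, which is what the software option handles for you.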
Rubal Mistry, if your ultimate objective is to develop a model, ANOVA is inapplicable because it is for making comparisons.
Regression analysis is the appropriate statistical method for determining relationships. However, you do not do regression analysis with categorical variables in SPSS. You may wish to do log-linear regression using the R statistical package.
Following on from Bruce Weaver's point, is the lack of homoscedasticity a concern only because of the assumptions of ANOVA/OLS regression, or are you interested in differences among the variances for their own sake?
My spreadsheet is pretty easy to use if you've already done OLS regression, and your software accepts the regression weights you find from the coefficient of heteroscedasticity that you decide upon. Weighted least squares should be the norm as the heteroscedasticity is naturally in the error structure. The size measure is predicted y, and sigma for the estimated residuals should increase with larger predicted y. Please see https://www.researchgate.net/publication/320853387_Essential_Heteroscedasticity.