How can I force a least squares fit to preferentially fit one region?

If I understood your example correctly, your situation is similar to the one discussed here: https://stats.stackexchange.com/questions/137005/least-square-fit-with-uneven-distribution-of-data

Also, when you say you do not want to use weights, do you mean you do not want to use this:

https://en.wikipedia.org/wiki/Moving_least_squares

In that case, do you really want to use least squares? It might be better to use a kernel smoother, as suggested in the first link: https://en.wikipedia.org/wiki/Kernel_smoother#Nearest_neighbor_smoother
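To make the nearest-neighbor idea concrete, here is a minimal sketch in Python with NumPy. The function name `knn_smoother`, the choice of k, and the data are made up for illustration; this is just one simple form of kernel smoother, not a recommendation for your particular data.

```python
import numpy as np

def knn_smoother(x, y, x_query, k=5):
    """Average the y-values of the k data points nearest to each query point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    # Distance from every query point to every data point: shape (n_query, n_data)
    dist = np.abs(x_query[:, None] - x[None, :])
    # Indices of the k closest data points for each query point
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y[nearest].mean(axis=1)

# Made-up data: sparse on the left, dense on the right
x = np.array([0.0, 1.0, 2.0, 8.0, 8.5, 9.0, 9.5, 10.0])
y = 2.0 * x + 1.0
smoothed = knn_smoother(x, y, x_query=[1.0, 9.0], k=3)
```

Because each estimate uses only nearby points, the dense region cannot pull the fit around in the sparse region, which is the main difference from a single global least squares fit.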

Hope this helps.

James R Knaub

Peter -

Interesting. It sounds like your data are skewed in the opposite direction from the usual one. But heteroscedasticity is going to favor higher weights for smaller predictions anyway, because the variance of the residuals, and so of the prediction errors, tends to be larger for larger predictions. So I think you should particularly look into keeping heteroscedasticity in your model. When I began reading your question, I thought, "What about stratification into regions?" But reading further, I see that you are interested in heavier weighting exactly where naturally occurring heteroscedasticity would seem to want to put heavier regression weights.

This is explained here: https://www.researchgate.net/publication/320853387_Essential_Heteroscedasticity (Also see Brewer, K.R.W. (2002), Combined Survey Sampling Inference: Weighing Basu's Elephants, Arnold: London and Oxford University Press, especially page 111, and also pages 87, 130, 137, 142, and 203.)

Natural heteroscedasticity is modified by model specification and data issues, but I believe the result comes out close to homoscedasticity far less often, and to a lesser degree, than most take for granted.

You could try estimating the coefficient of heteroscedasticity, explained in the above paper. I looked into this, considering the form

y = y* + e

where y* is as used by G.S. Maddala, a weighted least squares (WLS) prediction, and e is factored into a random and a nonrandom factor, the latter factor leading to the regression weight expression. I was considering linear regression, which includes curved and multiple regression. If your curvature includes parameters truly making it technically "nonlinear," I can't think why this would not still work. Here is a paper on this, followed by a spreadsheet tool:

.............................................

https://www.researchgate.net/publication/333642828_Estimating_the_Coefficient_of_Heteroscedasticity

A spreadsheet tool for estimating or considering a default value for the coefficient of heteroscedasticity, developed for linear regression, is found here (with references):

https://www.researchgate.net/publication/333659087_Tool_for_estimating_coefficient_of_heteroscedasticityxlsx
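As a rough illustration of what estimating a coefficient of heteroscedasticity can look like: if you assume the standard deviation of e is proportional to (y*)^gamma, then the slope of log|e| against log(y*) gives a crude estimate of gamma. This is a generic shortcut, not necessarily the method used in the paper or the spreadsheet above. The function name `estimate_gamma`, the use of a preliminary OLS fit for y*, and the simulated data are all assumptions made for this sketch.

```python
import numpy as np

def estimate_gamma(x, y):
    """Crude estimate of gamma, assuming sd(e) is proportional to (y*)**gamma."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Preliminary OLS fit of y = a + b*x, used here as a stand-in for y*
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_star = X @ beta
    resid = y - y_star
    # Keep only points where both logarithms are defined
    ok = (y_star > 0) & (resid != 0)
    # Slope of log|e| regressed on log(y*) is the estimate of gamma
    gamma, _intercept = np.polyfit(np.log(y_star[ok]), np.log(np.abs(resid[ok])), 1)
    return gamma

# Simulated data with a known gamma of 0.5 (synthetic, for illustration only)
rng = np.random.default_rng(0)
x = np.linspace(10.0, 100.0, 5000)
true_mean = 5.0 + 3.0 * x
y = true_mean + true_mean**0.5 * rng.standard_normal(x.size)
gamma_hat = estimate_gamma(x, y)
```

With small or noisy data sets an estimate like this can be quite unstable, so it may be more sensible to compare it against a default value than to trust it outright.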

That leads to the regression weight expression, which can be entered as "w" in SAS PROC REG (via its WEIGHT statement). I assume other statistical software works similarly. Note that OLS regression is a special case of WLS (weighted least squares) regression, where the coefficient of heteroscedasticity is zero and the weights are all equal.
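For anyone not using SAS, here is a minimal sketch of the same idea in Python with NumPy. It assumes regression weights of the form w = (size measure)^(-2*gamma), where gamma is the coefficient of heteroscedasticity, and it uses x itself as a stand-in for a preliminary prediction. The function names and the data are made up for illustration.

```python
import numpy as np

def wls_line(x, y, w):
    """Fit y = a + b*x by minimising sum(w * (y - a - b*x)**2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))
    X = np.column_stack([np.ones_like(x), x])
    # Scaling each row by sqrt(w) turns WLS into an ordinary least squares problem
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta  # array [a, b]

def weights_from_gamma(size_measure, gamma):
    """Regression weights w = size_measure**(-2*gamma) (assumed form)."""
    return np.asarray(size_measure, dtype=float) ** (-2.0 * gamma)

# Made-up data; the last point sits far out in x
x = np.array([1.0, 2.0, 3.0, 4.0, 20.0])
y = np.array([1.1, 2.0, 2.9, 4.2, 30.0])

ols = wls_line(x, y, weights_from_gamma(x, 0.0))  # gamma = 0: equal weights, i.e. OLS
wls = wls_line(x, y, weights_from_gamma(x, 0.5))  # gamma = 0.5: w = 1/x
```

Setting gamma to zero gives equal weights and reproduces the OLS fit, while larger gamma shifts relative weight toward the points with smaller size measure.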

..........................................

Anyway, perhaps the above might be worth a try.

If the heteroscedasticity indicated by your data produces regression weights that also satisfy your concerns, that is, if the above works out well for your application, fine. If not, you might try stratification, if you have enough data. Otherwise, perhaps splines?

Best wishes - Jim
