1. In the normal lingo, very little. In robust regression, the standard errors are calculated differently, to make them "robust" against heteroscedasticity (or clustering).
2. You interpret the regression coefficient in the same manner: the average change in y given a one-unit increase in x.
3. I have written three textbooks on regression, but I have never heard the term "proper correlation" before. You do regression to find out if there is a (linear) association between x and y and, if so, to find out how large this association is.
4. I seldom use R, but I know for a fact that both plain-vanilla and robust regression are straightforward to estimate in R (see the sketch after this list).
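To make answers 1 and 4 concrete, here is a minimal sketch in R (my own illustration, not code from the answer; the simulated data, the injected outlier, and the choice of HC1 standard errors are all assumptions). It contrasts plain OLS, robust regression via an M-estimator, and OLS with heteroscedasticity-robust standard errors:

```r
# Minimal sketch: plain OLS vs. robust regression vs. robust standard errors.
library(MASS)      # rlm(): robust regression (M-estimation)
library(sandwich)  # vcovHC(): heteroscedasticity-consistent covariance
library(lmtest)    # coeftest(): coefficient tests with a supplied covariance

set.seed(1)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30)
y[30] <- 40                       # inject a single outlier

fit_ols <- lm(y ~ x)              # plain-vanilla OLS
fit_rob <- rlm(y ~ x)             # Huber M-estimator (rlm default)

coef(fit_ols)                     # slope pulled toward the outlier
coef(fit_rob)                     # slope closer to the true 0.5

# Same OLS coefficients, but "robust" (HC) standard errors, as in answer 1:
coeftest(fit_ols, vcov. = vcovHC(fit_ols, type = "HC1"))
```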
Robust regression puts less emphasis on y-values with larger estimated residuals, and thus a potential outlier would have less impact. It might be better for "dirty" data. But first I suggest you see if the larger estimated residuals aren't just associated with larger predicted y-values. If you are using OLS regression, that is just a special case of weighted least squares (WLS) regression, with coefficient of heteroscedasticity equal to zero, which is often not a good choice. See https://www.researchgate.net/project/OLS-Regression-Should-Not-Be-a-Default-for-WLS-Regression, and various updates.
I suggest you try WLS regression. It puts less emphasis on cases with larger predicted y; a sketch follows below.
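As a concrete (hypothetical) illustration of that kind of WLS, here is a minimal sketch in R. The simulated data and the value gamma = 0.5 are my assumptions; in practice gamma would be estimated rather than assumed:

```r
# Minimal WLS sketch: weights from a coefficient of heteroscedasticity gamma,
# w_i = 1 / yhat_i^(2 * gamma); gamma = 0 reduces to OLS.
set.seed(2)
x <- runif(50, 1, 10)
y <- 3 * x + rnorm(50, sd = x)      # residual spread grows with the prediction

gamma <- 0.5                        # assumed here; estimate it in practice
fit0  <- lm(y ~ x)                  # preliminary OLS fit to get predictions
w     <- 1 / fitted(fit0)^(2 * gamma)
fit_wls <- lm(y ~ x, weights = w)   # WLS with heteroscedasticity-based weights

summary(fit_wls)
```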
Graphical residual analyses are good to check fit (including heteroscedasticity), and cross-validations help check that you did not fit too closely to one particular sample.
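Continuing the WLS sketch above (it reuses fit0 from that block), a quick graphical residual check might look like this:

```r
# Residuals vs. fitted values: a funnel shape in the OLS residuals
# suggests heteroscedasticity.
plot(fitted(fit0), resid(fit0),
     xlab = "Fitted values", ylab = "OLS residuals",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)
```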
@James, weighted least squares was an early approach to solving this problem. There are more efficient methods today. My program, for example, is prehistoric, and I agree with Daniel that the MASS package in R is a reasonable way to go for most common problems. See the MASS book by Venables and Ripley. Other modern approaches are in the works of Peter Rousseeuw at KU Leuven in Belgium. All of this, as well as comprehensive robust algorithms in R by Rand R. Wilcox of USC, can be found in the Z-Library. I'm attaching one of my favorite papers on robust logistic regression, which we found useful. Our work can easily be found on Google Scholar. Best wishes to all, David Booth
No. That is not what I said. Weighted least squares is not particularly for robust regression; I don't know that anyone ever said it was. It should probably be used in most places people use OLS, as OLS is a special case of WLS that is overused.
What I'm saying is first see if you really need robust regression. You may not have an outlier problem at all. You may just think you do, when the real problem is that you set your coefficient of heteroscedasticity equal to zero (OLS) and that was not a good choice. See https://www.researchgate.net/publication/320853387_Essential_Heteroscedasticity, and again, https://www.researchgate.net/project/OLS-Regression-Should-Not-Be-a-Default-for-WLS-Regression.
Sorry if I was not clear. Hope this explanation was better.
Cheers - Jim
PS - Oh. I guess David was referring to using some kind of ad hoc weights for WLS. Nope. Not referring to that.
Maybe you can consider the recursive least squares algorithm (RLS) with a forgetting factor (RLS-FF). RLS is the recursive application of the least squares (LS) regression algorithm, so that each new data point is taken into account to modify (correct) a previous estimate of the parameters from some linear (or linearized) correlation thought to model the observed system. The method allows for the dynamic application of LS to time series acquired in real time. As with LS, there may be several correlation equations with the corresponding set of dependent (observed) variables. For the RLS-FF algorithm, acquired data is weighted according to its age, with increased weight given to the most recent data, so the correlation parameters are updated gradually. A sketch is given at the end of this post.
Application example ― I have applied the RLS-FF algorithm to estimate the parameters of the KLa correlation, used to predict O2 gas-liquid mass transfer, hence giving increased weight to the most recent data:
Thesis: Controlo do Oxigénio Dissolvido em Fermentadores para Minimi...
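To make the update rule concrete, here is a minimal sketch of RLS-FF in R (my own illustration, not code from the thesis; lambda = 0.98, the initialization P = 1000*I, and the simulated data are conventional assumptions):

```r
# Recursive least squares with forgetting factor lambda: each new observation
# (phi_t, y_t) corrects the previous parameter estimate theta, with older data
# down-weighted geometrically by lambda (lambda = 1 recovers ordinary RLS).
rls_ff <- function(Phi, y, lambda = 0.98, delta = 1000) {
  p     <- ncol(Phi)
  theta <- rep(0, p)           # initial parameter estimate
  P     <- delta * diag(p)     # large initial covariance ("uninformative")
  for (i in seq_along(y)) {
    phi <- Phi[i, ]
    k   <- (P %*% phi) / as.numeric(lambda + t(phi) %*% P %*% phi)  # gain
    err <- y[i] - sum(phi * theta)            # prediction error on new point
    theta <- theta + as.numeric(k) * err      # correct previous estimate
    P   <- (P - k %*% t(phi) %*% P) / lambda  # update covariance
  }
  theta
}

# Usage: recover intercept and slope from a simulated linear relation.
set.seed(3)
n   <- 200
x   <- runif(n)
Phi <- cbind(1, x)
y   <- 1 + 2 * x + rnorm(n, sd = 0.1)
rls_ff(Phi, y)   # should be close to c(1, 2)
```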
Robust regression may be used to handle noisy data, and especially if there are some real outliers, but as I said, I suggested WLS first because your real problem may just be that you are not accounting for natural heteroscedasticity. If you use robust regression when not needed, that is like using a nonparametric method such as looking at ranks instead of continuous data. It may solve a noisy data problem, but if not needed, it throws away good information!
**So I suggest you not throw away good information when you do not need to do so.**
The following may be of interest:
"When Would Heteroscedasticity in Regression Occur?" Preprint, June 2021, J. Knaub, https://www.researchgate.net/publication/352134279_When_Would_Heteroscedasticity_in_Regression_Occur
Please note that by WLS regression, I mean the weights are determined by an appropriate coefficient of heteroscedasticity, gamma, inherently present, NOT ad hoc weights used to handle individual suspected "outliers."
I think this gets back to a problem with the original question. The questioner asks folks to compare:
1. a particular estimation method used for regression.
with
2. an adjective about a regression.
Because of the popularity of MASS, when I hear robust regression in the context of R, I assume the person means a particular set of estimation procedures (those available in rlm). But this is of course just an assumption; I really don't know which procedure the person here means, so as it stands the question is meaningless unless the person clarifies it.