One reason for choosing a biased estimator is that it minimizes the mean squared error (MSE) of the estimator, at least approximately. However, it is also possible to minimize the variance of an unbiased estimator, a solution that resolves two problems at once: a centered estimate and minimum mean squared error.
Thank you, Subrata, but the higher efficiency is usually only approximate, and the comparison is with an unbiased estimator that does not exploit the same available information that the biased estimator uses.
Pierre, the intention is for the estimator to efficiently approximate a parametric function that depends on the values of the variable of interest, using a sample of units from the population with a given sample size.
Yes, sometimes we choose a biased estimator over an unbiased one when the mean squared error (MSE) of the biased estimator is smaller than that of the unbiased one. That is, the variance of the unbiased estimator exceeds the variance of the biased one by enough that the biased estimator's MSE, which is the sum of its variance and its squared bias, comes out lower.
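A small simulation can make this concrete. As a hedged sketch (the normal model, seed, and sample size are my own illustrative assumptions, not from the thread): for normal data, the variance estimator with divisor n is biased, yet its MSE is below that of the unbiased divisor-(n - 1) estimator, precisely because MSE = variance + bias².

```python
import random
import statistics

# Compare the unbiased variance estimator (divisor n - 1) with the biased
# one (divisor n) on repeated normal samples; MSE = variance + bias^2.
random.seed(1)
true_var = 4.0          # data drawn from N(0, sd = 2)
n, reps = 10, 200_000

err_unbiased, err_biased = [], []
for _ in range(reps):
    x = [random.gauss(0.0, 2.0) for _ in range(n)]
    m = statistics.fmean(x)
    ss = sum((xi - m) ** 2 for xi in x)
    err_unbiased.append((ss / (n - 1) - true_var) ** 2)  # unbiased estimator
    err_biased.append((ss / n - true_var) ** 2)          # biased estimator

mse_unbiased = statistics.fmean(err_unbiased)
mse_biased = statistics.fmean(err_biased)
print(mse_biased < mse_unbiased)  # the biased estimator wins on MSE
```

For n = 10 the theoretical MSEs are 2σ⁴/(n − 1) ≈ 3.56 (unbiased) versus (2n − 1)σ⁴/n² ≈ 3.04 (biased), and the simulation reproduces that ordering.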
1. There is no uniformly minimum-MSE estimator for a parametric function (Ruiz Espejo, 1987).
2. A biased estimator that uses auxiliary information is chosen over an unbiased one that does not use auxiliary information. The biased estimator then has approximately smaller MSE (or MAE, mean absolute error) than the corresponding MSE (or MAE) of the unbiased one (Cochran, 1977).
But it is also possible that an unbiased estimator which uses auxiliary information has smaller MSE (or MAE) over some region of the parameter space in R^N than the MSE (or MAE) of the biased estimator. This would make the unbiased estimator admissible against the biased one.
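The Cochran-style point about auxiliary information can be illustrated with a small finite-population simulation. This is only a sketch under my own assumptions (the population, the y = 2x + noise relation, and the sample size are invented for illustration): the ratio estimator (ȳ/x̄)·X̄ is biased, yet when y is roughly proportional to x its MSE falls well below that of the unbiased sample mean.

```python
import random
import statistics

# Finite population where y is strongly related to an auxiliary variable x
# whose population mean X_bar is known; compare the unbiased sample mean
# with the biased ratio estimator over repeated simple random samples.
random.seed(2)
N, n, reps = 1000, 30, 20_000
x_pop = [random.uniform(10.0, 20.0) for _ in range(N)]
y_pop = [2.0 * x + random.gauss(0.0, 1.0) for x in x_pop]  # strong correlation
X_bar = statistics.fmean(x_pop)
Y_bar = statistics.fmean(y_pop)  # the target: the population mean of y

err_mean, err_ratio = [], []
for _ in range(reps):
    idx = random.sample(range(N), n)  # SRS without replacement
    y_bar = statistics.fmean(y_pop[i] for i in idx)
    x_bar = statistics.fmean(x_pop[i] for i in idx)
    err_mean.append((y_bar - Y_bar) ** 2)                   # unbiased
    err_ratio.append((y_bar / x_bar * X_bar - Y_bar) ** 2)  # biased ratio estimator

print(statistics.fmean(err_ratio) < statistics.fmean(err_mean))
```

With weak correlation between y and x the ordering can reverse, which matches the caveat above that the comparison holds only over part of the parameter space.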
As many commentators state, the MSE has the advantage of taking into account both bias and variability. There is often a trade-off between these (see for example, http://statweb.stanford.edu/~tibs/ElemStatLearn/, which can be downloaded).
However, there are situations where you would choose a biased estimator over an unbiased one even if they have the same variability. It depends on how the estimator is used and what the costs/values of the different types of errors are. For example, suppose after extensive research I find that it takes on average 12 minutes to bake perfect chocolate chip cookies, but this time varies. If they are over-cooked, your dessert is ruined. If they are under-cooked, you just close the oven door and check again in a minute or so. Therefore, the estimate listed in the cookbook (if you were only allowed to use one number) would be lower than 12 minutes, because there is little cost if you under-estimate the time but high cost (if you like cookies) if you over-estimate it.
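The cookie argument can be stated as a decision problem under asymmetric loss. This is a hedged sketch: the 10:1 cost ratio and the normal(12, 1) model of baking times are hypothetical numbers I chose for illustration, not anything from the discussion.

```python
import statistics

# Over-estimating the baking time (burnt cookies) is assumed to cost 10x
# more per minute than under-estimating (just wait a bit longer).
mu, sigma = 12.0, 1.0
over_cost, under_cost = 10.0, 1.0

nd = statistics.NormalDist(mu, sigma)
# A fine grid of equally likely baking-time outcomes.
times = [nd.inv_cdf((k + 0.5) / 2000) for k in range(2000)]

def expected_loss(report):
    return statistics.fmean(
        over_cost * (report - t) if report > t else under_cost * (t - report)
        for t in times
    )

# Grid-search the reported time that minimizes expected loss.
candidates = [10.0 + 0.02 * k for k in range(200)]  # 10.00 .. 13.98 minutes
best = min(candidates, key=expected_loss)
print(best < mu)  # the optimal (biased) report is below the unbiased 12 minutes
```

Under this kind of piecewise-linear loss the optimal report is a low quantile of the distribution (here roughly the 1/11 quantile, about 10.7 minutes), which is exactly the downward-biased number the cookbook should print.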
Daniel, do you understand "same variability" as "same MSE"? In that case both estimators are equally efficient, equivalent in MSE. But then I would prefer the unbiased one, because it is also centered.
I'm not even saying the MSE of the biased estimator will be less, but I would still prefer to use it as my estimator.
Mariano, if you choose the unbiased estimate (12 minutes) for setting the oven alarm, your cookies will be ruined (burnt) more often than mine. True, I will have to get up from the couch more often (I'm assuming cookies are baked often) and wait a little, but I am assuming the cost of that is less than the cost of burnt cookies. For me, the value of good cookies is large!
Similarly, if the unbiased estimate of the drive to the train station is 1 hour, and it is important to get on that train, I would leave more than an hour before departure time. But if trains left every five minutes, so there was little cost to missing one, I might leave just an hour before the train I wanted.
The following is an older paper, but it is useful for describing how the cost of different outcomes can inform what you report.
Daniel, in my research I do not think about cooking but about estimating a defined parametric function, approximating its exact value with the data of a sample from the finite population.
Jochen, the aim is not "to determine the exact value" but "to estimate the exact value" of the parametric function for a concrete given finite population. I agree with your idea.
Jochen, but the bias of the estimator is usually another, known or unknown, parametric function that must be estimated too. This bias is not known before sampling the population.
To compare the efficiency of estimators you can use mean squared errors (MSEs) or the mean absolute percentage error (MAPE), also known as the mean absolute percentage deviation (MAPD).
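Both criteria are one-liners in practice. A minimal sketch, with invented numbers (the two sets of estimates and the true value 10 are purely illustrative):

```python
import statistics

def mse(estimates, truth):
    """Mean squared error of a collection of estimates around the truth."""
    return statistics.fmean((e - truth) ** 2 for e in estimates)

def mape(estimates, truth):
    """Mean absolute percentage error, in percent."""
    return statistics.fmean(abs(e - truth) / abs(truth) for e in estimates) * 100.0

est_a = [9.8, 10.4, 10.1, 9.7]    # hypothetical estimates from estimator A
est_b = [10.6, 10.5, 10.7, 10.4]  # hypothetical estimates from estimator B
truth = 10.0
print(mse(est_a, truth), mse(est_b, truth))    # A has the smaller MSE
print(mape(est_a, truth), mape(est_b, truth))  # and the smaller MAPE
```

Note that MAPE is only meaningful when the true value is bounded away from zero; otherwise the division blows up.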
In another sense, we should also look at the robustness of an estimator. Efficiency and robustness are two opposite poles: when efficiency increases, robustness decreases, and vice versa. Do biased estimators also trade off efficiency against robustness?
Prediction and robustness are based on assumed models; they are subjective forms of statistical treatment. When the model is objective, and with sufficient resources we would be able to measure all the units of the finite population, prediction and robustness are replaced by estimation and efficiency, based on realities that could be objectively measured with sufficient resources.
My surnames are Ruiz Espejo (from my father and mother). I suggest this article on optimal unbiased estimation in distribution-free settings. Its Theorem 4.1 was corrected in a later publication (in Spanish):
Ruiz Espejo, M. (2015). Sobre estimación insesgada óptima del cuarto momento central poblacional. Estadística Española, 57(188), 287-290.
I am usually interested in estimation because there it is possible to study objectivity and the objective properties of estimators. In prediction, everything depends on the assumed model, which could be true with probability zero, almost surely, or simply false. The same occurs in statistical modeling, when the researcher assigns an assumed model to an ideal population, for example the "a priori" distribution of a parameter in Bayesian statistics. In my opinion these are subjective studies.
In this article, I show that the usual linear regression estimator (which is not unbiased) can be improved upon by another, unbiased, linear regression estimator.
The traditional criterion for optimizing estimators is "unbiasedness and uniformly minimum variance".
The criterion "uniformly minimum mean squared error" reduces to "unbiasedness and uniformly minimum variance" for the parametric function "expectation of the estimator".
Another article on "unbiased multivariate regression estimation" (in Spanish) was published recently (October 2016). You can see it in the Contributions section of my RG profile.
Thank you for the invitation. I am not a professional in this field and thus will not risk an answer to the question. Nevertheless, the discussion thread will be a good opportunity for me to start learning about this interesting subject.
In some circumstances, when there is no known unbiased estimator for the parametric function, it may be possible to use a biased estimator with good accuracy properties.
It is not statistically proper to use a biased estimator.
The estimator to be used should have at minimum four qualities, namely consistency, unbiasedness, efficiency, and sufficiency.
However, if in a given situation no unbiased estimator exists, then a biased estimator can be used, provided that it satisfies the other qualities.
The major purpose of introducing new estimators is to minimize variability in the estimates. If the MSE of a biased estimator is less than the variance of an unbiased estimator, we may prefer the biased estimator for better estimation.
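A classic way this trade can pay off is shrinkage. As a hedged sketch (mu, sigma, n, and the shrinkage factor c = 0.8 are illustrative choices, not a recommendation): the estimator c·x̄ is biased toward zero, yet for a suitable c its MSE drops below the variance of the unbiased sample mean.

```python
import random
import statistics

# Compare the unbiased sample mean with a shrunk version c * x_bar on
# repeated normal samples; shrinkage trades a little bias for less variance.
random.seed(3)
mu, sigma, n, c, reps = 0.5, 1.0, 5, 0.8, 200_000

err_mean, err_shrunk = [], []
for _ in range(reps):
    x_bar = statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    err_mean.append((x_bar - mu) ** 2)        # unbiased: MSE = sigma^2/n = 0.2
    err_shrunk.append((c * x_bar - mu) ** 2)  # biased: c^2*0.2 + (1-c)^2*mu^2 = 0.138

print(statistics.fmean(err_shrunk) < statistics.fmean(err_mean))
```

The gain depends on the unknown mean: for large |mu| the squared-bias term (1 − c)²·mu² dominates and the unbiased estimator wins again, which echoes the admissibility caveat raised earlier in the thread.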
An estimator to be used should satisfy four basic properties/criteria: consistency, unbiasedness, efficiency, and sufficiency. Other properties include completeness, compactness, etc.
However, consistency is an essential property, while unbiasedness is a desirable one.
Consistency cannot be sacrificed. Unbiasedness is usually not sacrificed either; it may be given up only in unavoidable situations, and even then consistency must remain intact.
Thus, it is scientifically incorrect to use a biased estimator for a parameter when an unbiased estimator for it exists.
Only in unavoidable situations, when there is no unbiased estimator for a parameter, is it advisable to use a biased estimator (which must be consistent).
Here is an article presenting an optimal unbiased estimator in a distribution-free setting which is better than the linear regression estimator, the ratio estimator, and the product estimator, among all others based on an auxiliary variable: "Recientes frutos en bioestadística" (2018), in the Articles section of my Research.
"Sobre la optimización del estimador de regresión lineal" provides an unbiased estimator based on the same conditions of linear regression estimator and it is optimal (uniformly of minimum variance) for estimating the finite population mean of an interest variable. You can see it in my profile too (year 2022).
I have not found any reason for choosing a biased estimator over an unbiased estimator when the latter exists. Only in unavoidable situations, when there is no unbiased estimator for a parameter, is it advisable to use a biased estimator (which must be consistent).