In many probability-statistics textbooks and statistical contributions, the standard deviation of a random variable is proposed to be estimated by the square root of the unbiased estimator of the variance, i.e. by dividing the sum of squared deviations by n-1, where n is the size of a random sample. Does anybody know why such an estimator is anchored in the statistical literature? Is there any technical reason supporting the use of this estimator?
Both conventions provide "consistent" estimators of the true population variance of a normal distribution. The "1/n" version is the maximum likelihood estimate of the population variance; however, it is also mathematically biased. The "1/(n-1)" convention provides an unbiased estimate of the true population variance. This is the main reason the 1/(n-1) convention is used, particularly for modest to small sample sizes.
But all this is somewhat uninteresting for large n as both estimators become negligibly different as sample size increases.
(n-1) is called the degrees of freedom, and if we divide by n-1 the estimator of the variance will be unbiased.
Following ref. [1], consider x_1, x_2, ..., x_n a random sample from a distribution with mean \mu and variance \sigma^2. Then
S_{n-1}^2 = \sum_{i=1}^n (x_i - \bar{x})^2 / (n-1)
is the unbiased estimator of \sigma^2. To show this, we have to prove that
E[S_{n-1}^2] = \sigma^2.
Taking the sum of squared deviations without dividing by (n-1), it is easy to show that
E[\sum_{i=1}^n (x_i - \bar{x})^2] = (n-1)\sigma^2, where \bar{x} = \sum_{i=1}^n x_i / n. Hence, dividing by n-1 gives E[S_{n-1}^2] = \sigma^2 = Var(x_i).
[1] F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester, A Modern Introduction to Probability and Statistics, Springer.
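A quick simulation sketch (in R; the values of n and sigma below are arbitrary illustration choices, not taken from the reference) shows the effect of the two divisors:
set.seed(1)
n <- 5; sigma <- 2
sims <- replicate(100000, {
  x <- rnorm(n, mean = 0, sd = sigma)
  c(var(x),                    # divisor n - 1
    sum((x - mean(x))^2) / n)  # divisor n
})
rowMeans(sims)  # first entry is close to sigma^2 = 4; second is close to (n-1)/n * 4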
It is an issue of degrees of freedom.
The degrees of freedom are the number of values in a calculation that are free to vary.
E.g., to estimate the mean, there are N observations in an experiment but one parameter to be estimated. That leaves N-1 degrees of freedom for estimating variability.
Consider the following question:
Freely assign values to a dataset containing three values (n=3) with a mean of 10.
Given the variables X1, X2 and X3, values can be freely assigned to only two of them. Since we need to satisfy the mean, our third variable has to be fixed, not free. Suppose you have assigned X1 = 10 and X2 = 5. The value of X3 invariably becomes 15 because:
(X1 + X2 + X3)/n = 10
(10 + 5 + X3)/3 = 10
X3 = 30 - 15 = 15, so X3 is fixed by the values of n and the mean.
Our degrees of freedom are therefore (n-1), or (3-1) = 2.
In general, degree of freedom = n – no. of parameters estimated
Now, you need to estimate the standard deviation, so n-1 is the degrees of freedom and you need to divide the sum of squared deviations by n-1, while for the population standard deviation you divide by n instead of n-1.
Thank you all. I agree with most of your comments. Unfortunately, the question is a bit more involved. The unbiased estimator of the variance is
S_{n-1}^2 = (1/(n-1)) \sum_{i=1}^n ( x_i- \bar{x} )^2
and then E[S_{n-1}^2] = \sigma^2. That is OK.
However, taking the square root of S_{n-1}^2, i.e. S_{n-1}, we obtain a BIASED estimator of the standard deviation due to Jensen's inequality. As the square root is a strictly concave function, Jensen's inequality implies that
\sigma = \sqrt{\sigma^2} > E[ \sqrt{S_{n-1}^2} ]
The inequality is strict because the square root is not a linear function, i.e. we cannot move the \sqrt out of the integral represented by E[...].
Therefore, S_{n-1} is a BIASED estimator of the standard deviation, and the only property that seemed to justify the use of S_{n-1} fails, especially for small n, the sample sizes for which the estimator is commonly recommended!
On the other hand, for a normally distributed sample, the maximum likelihood estimator of the variance \sigma^2 is
S_n^2 = (1/n) \sum_{i=1}^n ( x_i- \bar{x} )^2
as J. Kern told us. The invariance property of maximum likelihood estimators assures that S_n (not S_{n-1}) is the maximum likelihood estimator of \sigma. This is only valid if the distribution is normal, and in exploratory statistics this is not necessarily true. I would add that the mean squared error of S_n^2 is less than that of S_{n-1}^2, which makes the use of n-1 even stranger. And finally, as is evident, the numerical values of both estimators become very similar when n is large.
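A small simulation sketch (in R; the sample size and sigma are arbitrary illustration values) can be used to check both claims, the downward bias of S_{n-1} as an estimator of \sigma and the smaller mean squared error of S_n^2 under normality:
set.seed(2)
n <- 5; sigma <- 1
sims <- replicate(200000, {
  x <- rnorm(n, 0, sigma)
  s2_n1 <- var(x)                # divisor n - 1
  s2_n  <- (n - 1) / n * s2_n1   # divisor n
  c(sqrt(s2_n1), (s2_n1 - sigma^2)^2, (s2_n - sigma^2)^2)
})
mean(sims[1, ])   # E[S_{n-1}] comes out below sigma = 1 (Jensen's inequality)
mean(sims[2, ])   # estimated MSE of S_{n-1}^2 ...
mean(sims[3, ])   # ... exceeds the estimated MSE of S_n^2 under normality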
So, I must insist on my original question. Thank you all!
Dear Juan José
Your question is a very interesting one. I think that you have the answer right before your eyes.
You say:
\sigma = \sqrt{\sigma^2} > E[ \sqrt{S_{n-1}^2} ]
and you may see that E[ \sqrt{S_{n-1}^2} ] > E[ \sqrt{S_n^2} ], so
\sigma = \sqrt{\sigma^2} > E[ \sqrt{S_{n-1}^2} ] > E[ \sqrt{S_n^2} ]
Then there is a greater bias using \sqrt{S_n^2} than using \sqrt{S_{n-1}^2}, so this last estimator is better from this point of view; that is the same reason why S_{n-1}^2 is preferred over S_n^2.
I found useful the explanation in http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation
There, it is shown that when the sample of X's is independent and normal, you can obtain an unbiased estimator using \sqrt{S_{n-1}^2}/c4, where c4 is a function of n. An approximation is to use \sqrt{S_{n-1.5}^2}, i.e. to divide the sum of squared deviations by n - 1.5.
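For concreteness, here is a short R sketch of the c4 correction described on that page, next to the rough n - 1.5 divisor (n and the simulated data are arbitrary illustration choices):
n  <- 10
c4 <- sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)  # the c4(n) constant
x  <- rnorm(n, 0, 1)
sd(x) / c4                               # unbiased for sigma under normality
sqrt(sum((x - mean(x))^2) / (n - 1.5))   # the "divide by n - 1.5" approximation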
I am highly appreciative of all the scholarly responses, but I appreciate more the response of Abdelrahman, which to me is nearest to a justification. The mean is used to calculate the standard deviation. If the mean is multiplied by n we have the sum of the variable. In this way, with n-1 values of the variable, one can get the nth value. So one loses a degree of freedom and has to work out the standard deviation using n-1 instead of n.
There is another good reason to prefer the usual standard deviation estimator, S_{n-1}, over the other alternatives, especially when the sample is small:
Many times we estimate the standard deviation (\sigma) as a means to construct hypothesis tests or confidence intervals for the population mean using Student's t distribution. In those cases the use of S_{n-1} is indicated because of the genesis of the t distribution as the ratio of a standard normal and the square root of a chi-square distribution divided by its degrees of freedom: t = Z / \sqrt{S_{n-1}^2 / \sigma^2}
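A simulation sketch in R (the values of n, mu and sigma are arbitrary) illustrates this genesis: the Studentized mean built with S_{n-1} in the denominator matches the t distribution with n - 1 degrees of freedom:
set.seed(3)
n <- 6; mu <- 0; sigma <- 2
tstat <- replicate(100000, {
  x <- rnorm(n, mu, sigma)
  (mean(x) - mu) / (sd(x) / sqrt(n))   # sd() divides by n - 1
})
qqplot(qt(ppoints(1000), df = n - 1), quantile(tstat, ppoints(1000)),
       xlab = "t quantiles (df = n - 1)", ylab = "simulated quantiles")
abline(0, 1)   # points lie near the line: t_{n-1} is the right reference distribution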
Dear Guillermo,
I appreciate your answers, especially the first one about the bias of S_{n-1}, which is less than that of S_n. I had not realised this. Thank you.
The second one was my original impression, a mixture of historic inertia and aesthetics in some formulas, all of them linked to normal sampling.
Dear Mohammad,
I agree that that intuitive reason has some weight. But I think that it is mainly valid when applied to the estimator of the variance S_{n-1}^2 rather than to the standard deviation.
After all the explanations, I feel that the reasons to support S_{n-1} as an estimator of the standard deviation are still weak. They are essentially reasons related to bias and intuitive ideas about degrees of freedom. However, other properties of the estimator, such as the mean squared error or being the maximum likelihood estimator (normal case), are ignored.
Thank you all.
Dear Juan José
I agree with you that behind the preference of S_{n-1} over S_n there is a mixture of historic inertia and aesthetics in some formulas, but usually we replace something with a new thing only when we find some advantage in doing so, and I do not yet see one in this matter.
So I think that my second answer (the use of S_{n-1} in Student's t) explains why we prefer S_{n-1} over other unbiased or nearly unbiased estimators, but we could easily replace it by S_n in the same example.
Dear Juan,
The standard deviation is also described as the average (mean) value of the dispersion around the mean of the variable. If we multiply the variance out, we can get the nth squared deviation by subtracting the sum of the other n-1 squared deviations from the total. By that argument we lose 2 degrees of freedom, so one could argue that we should calculate the standard deviation by dividing the sum of squared deviations from the mean by n-2. However, that is not the case. A mean is always obtained by dividing the sum of the values by the number of values (n). Therefore, the standard deviation should by definition be calculated as an average (mean) value of the dispersion around the mean of the variable; but since its calculation involves the mean of the variable, and since one degree of freedom was lost in calculating that mean, to account for that loss the standard deviation is worked out by dividing the sum of squared deviations from the mean by n-1. Since this is the justification from classical statistics, a new justification may be found or the classical one may be undone. It is because the calculation of Pearson's correlation coefficient uses the means of two variables that the significance of the correlation coefficient is tested using Student's t distribution with n − 2 degrees of freedom.
This is taking too long. It was already mentioned that we divide by n-1 when estimating the variance because in that way the estimate is unbiased. That is all there is to it. To prove it, just consider the expected value of the estimator for a set of independent identically distributed random variables with a well defined variance.
If you read the question you will understand my "bias towards complete samples":
"In many probability-statistics textbooks and statistical contributions, the standard deviation of a random variable is proposed to be estimated by the square-root of the unbiased estimator of the variance, i.e. dividing the sum of square-deviations by n-1, being n the size of a random sample. Does anybody know why such an estimator is anchored in the statistical literature? Is there any technical reason supporting the use of this estimator?"
A mathematical proof can be read here: https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=6&cad=rja&ved=0CFQQFjAF&url=http%3A%2F%2Fpascencio.cos.ucf.edu%2Fclasses%2FMethods%2FR%2520demos%2FProof%2520that%2520Sample%2520Variance%2520is%2520Unbiased.doc&ei=f0XkUuK_F4i2yAGg9ICwDw&usg=AFQjCNGcoTz-kzNaP27o1KtYzq6sbsXLSA&sig2=xQfuGBJMHBuBFPzXyG068g
I sincerely hope this doesn't start one of those interminable, counterproductive, and sophomoric discussions on trivial matters. At once I tell you: I refuse to take part in it.
Statistics books say that the reason for this (sample standard deviation) formula is to avoid underestimating the population standard deviation. By subtracting a value from n, the quotient becomes larger. The value could be 0.1, 0.3, 0.7, 0.9, etc. I guess 1 is the simplest, the easiest to compute (in those earlier times).
I don't mind being down-voted when wrong. My answer may have "shocked" someone else's mindset, but if you really read statistics books (which I advise you to do), the reason why n-1 is the divisor for the sample standard deviation, as opposed to N as the divisor of the population standard deviation, is that many trial computations show that if the divisor of the sample standard deviation is n, it tends to underestimate the population standard deviation. By reducing the value of the divisor, you increase the quotient. From 0.01 to 0.9999 to 1, the easiest to use is 1. Whether you like it or not, or are shocked or not, I just hope that I was able to provide a clarifying answer that may help you in the future. Peace.
In the way that Fausto is mincing words, I will reluctantly do a little nit-picking of my own.
The original question pertains to the use of (n-1) in the denominator of the usual estimator of the population variance of any distribution. As Fausto points out, authors are not always careful to state all of their assumptions; however, some are implicit.
The question pertains to the usual estimator of the population variance, implying that the question pertains to situations where this estimator is indeed correct: complete samples. Therefore Fausto's objection to the answers on the grounds that incomplete samples were not considered is baseless and not germane to the original question. Of course there are other estimators of the population variance when data are incomplete.
There are also other, sometimes better, estimators when unequal probability sampling is applied, or when an underlying probability distribution is assumed. But like Fausto's straw-man objection, these issues are not the focus of the original question.
Had the author of the question been interested in incomplete samples, unequal probability sampling, or particular distributions or any other particular situation, he would not have referred to the formula with (n-1) in the denominator as other estimators are appropriate in each of these other specialized situations.
The question is clearly referring to situations of some form of unbiased sampling design for which the (n-1) formula is appropriate. This would include simple random sampling, systematic designs and other unbiased plans and notably almost any underlying probability distribution.
The (n-1) formula provides an unbiased estimate of the population variance under very broad assumptions of unbiased sampling and any probability distribution with constant mean and finite variance. That is all there is to it.
Fausto should retract his petty negative ratings.
Thank you Fausto, I'll be forever indebted to you for correcting my misguided thinking.
This response might be a bit late, but I recently wrote a short article about this topic. In the article you will find an intuitive explanation of when to use N-1 and when to use N. I also provide the mathematical proof (maximum likelihood estimation and bias/variance derivation) to show when to use which estimator: http://www.visiondummy.com/2014/03/divide-variance-n-1/
It is to avoid bias problems. Dividing by (n-1) corrects the residuals or errors between large and small samples in representing the universe.
Yes Brian, the explanation is well presented, but it is not true in general. It is true if the sample of n from N is drawn with replacement. When the sample is drawn without replacement, both estimates are biased.
Dear Brice and Guillermo and all previous contributors,
I would like to thank you for your opinions about my original question.
Brice directed me to an interesting web-site (Khan Academy) where the question
of biased-unbiased estimation of variance is very didactically addressed. The last comment by Guillermo Ramos is also to be taken into account.
However, my question was not about the bias of the variance estimator but about that of the standard deviation. The reason which supports the use of S_{n-1}^2 as an estimator of the variance is that it is unbiased. This property is automatically lost when taking the square root to estimate the standard deviation, due to Jensen's inequality. Summarising: S_{n-1} is a biased estimator of the standard deviation. The initial question was: why still use S_{n-1} as an estimator of the standard deviation?
I think that Guillermo Ramos gave a reason which is not standard but is interesting: when estimating the standard deviation, both S_n and S_{n-1} are biased, but the bias of S_{n-1} is still less than the bias of S_n.
The question which arises from all our comments is why bias is a good criterion for selecting estimators. Mean squared error, logarithmic bias of the variance, maximum likelihood, or expected likelihood could be alternative criteria, but they are systematically ignored.
I would like to close this question here because we have repeated most arguments. Thank you all again.
Thank you Juan Jose for your excellent summary of this interchange.
Dear Brian
By the way: when the sample is drawn with a simple random sampling design without replacement, E(S_{n-1}^2) = N/(N-1) \sigma^2.
Then an unbiased estimate is (N-1)/N S_{n-1}^2.
Therefore, in the Khan Academy example, if the sample is drawn without replacement and N = 14, S_{n-1}^2 tends to overestimate the true variance by a factor of 14/13, i.e. by 7.69%.
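This finite-population claim can be checked by brute force in a few lines of R (the toy population and sample size below are arbitrary, not the Khan Academy data):
pop <- c(3, 7, 8, 12, 13, 20)        # a toy population, N = 6
N <- length(pop); n <- 3
sigma2 <- mean((pop - mean(pop))^2)  # population variance with divisor N
samples <- combn(pop, n)             # every sample drawn without replacement
mean(apply(samples, 2, var))         # average of S_{n-1}^2 over all such samples
N / (N - 1) * sigma2                 # the same value, as stated above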
Although the square root of the unbiased variance estimate is a biased estimate of the population SD, its use persists because it is very simple and because there is generally no practical value in using an unbiased estimator (the degree of bias is negligible in most, maybe all, applications).
Thank you Thom. I agree with you. But observe that dividing by n or by n - 1/2 is also simple. I would say that dividing by exp(Psi(n)), with Psi the Euler digamma function, is quite involved, and there is no clear advantage in doing so.
Dividing by n has an advantage: the estimators of the variance and of the standard deviation are then both maximum likelihood under normality. It is not a practical advantage, but it is a little more consistent theoretically.
I will give two examples :
a) In undergraduate statistics classes, we are taught that to estimate the population variance the divisor of the sum of squared differences of each observation from the mean is n. Meanwhile, when we are working with a sample, the divisor is n-1, simply because we do not know the true value of the population mean and we are using an estimator; because we do not know the true parameter, we pay for that sin with one degree of freedom.
b) In a formal course of mathematical statistics, the topic is taught in more depth and we are shown the desirable properties of a statistic. The derivation of the least squares and maximum likelihood parameters in the case of linear regression is not difficult. In this derivation we realize that, except for the variance, the parameters are the same, and conclude that the ML estimator of the variance is biased. We can still fix this to make the ML estimator unbiased, or, even better, use REML estimation.
Thus, we have two simple explanations: one from the point of view of the concept and the other according to the estimation method. In the latter case we would not worry about the bias; if the sample size is large enough, the difference between the two estimates of the sampling variance is negligible.
I hope this helps.
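As an illustration of case (b), here is a minimal R sketch (on simulated data, so the numbers are only illustrative) contrasting the ML divisor n with the usual divisor n - p for the residual variance in a simple linear regression:
set.seed(4)
n <- 20
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)
fit <- lm(y ~ x)
sum(resid(fit)^2) / n         # ML estimate of the residual variance (biased downwards)
sum(resid(fit)^2) / (n - 2)   # divisor n - p; the square of summary(fit)$sigma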
Dear Fausto
Thank you for the two cases. May we assume that the weight follows a Normal distribution when it can be measured?
Dear Fausto
For your first problem
Consider the following case: we measure the weight of 10 cakes (kg), in two sessions: 0.995, 0.990, 0.895, 0.998, 0.960, and we know that the other 5 cakes all weigh more than 1 kg (but we do not know their exact weights)!
How do you compute the standard deviation of the data?
I got the following estimates of the mean and the standard deviation using an iterative optimization in SAS for the MLE:
mean=1.0058 standard deviation=0.0523
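For readers who want to reproduce something similar without SAS, here is a minimal R sketch of the censored-normal likelihood for the first problem (an assumed reconstruction, not the original SAS program; the five observed weights and the 1 kg threshold are taken from the post above):
obs <- c(0.995, 0.990, 0.895, 0.998, 0.960)   # observed weights (kg)
n_cens <- 5                                   # cakes known only to weigh > 1 kg
negloglik <- function(par) {
  mu <- par[1]; sigma <- exp(par[2])          # log-parametrize sigma > 0
  -(sum(dnorm(obs, mu, sigma, log = TRUE)) +
      n_cens * pnorm(1, mu, sigma, lower.tail = FALSE, log.p = TRUE))
}
fit <- optim(c(mean(obs), log(sd(obs))), negloglik)
c(mean = fit$par[1], sd = exp(fit$par[2]))    # compare with the values quoted above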
For the second problem I do not know how to proceed.
The denominator of the variance and standard deviation is always the number of degrees of freedom (n-1). Only if the degrees of freedom change does the n-1 change, to n-2 or n-3 and so on. Read a book on statistics written after the famous manuscript published in Biometrika about 100 years ago.
Jeffrey, you need to read more closely... n-1 provides an unbiased estimate of the variance but not of the standard deviation. The question is more subtle than I and most others caught. That said, the 1/n formula provides the maximum likelihood estimate when sampling from a normal distribution, but neither version is unbiased for the standard deviation... Why not look for an unbiased estimate of the standard deviation?
Hi all. If you estimate U as the average of N data values K, it means that U = (1/N) * sum(k_i). It also means that each K value is assumed to be an "average" of a sector, within limits we ignore, with equal frequency 1/N, so the expected value and the expected U are the same. So why do texts speak of estimating the SD for N-1 intervals as an "unbiased estimate"? Why do they use this as a pretext to create a new, undefined parameter called "degrees of freedom"? How can we expect sound students to understand these two versions of the same premise? We need epistemologists of statistics to rescue us. There is an interesting site, errorstatistics.com, led by Drs. Deborah Mayo and Spanos, dedicated to discussing points like this one from a philosophical perspective. Emilio
I agree with John W. Kern. The sample variance is an unbiased estimator of the population variance but the sample standard deviation is not an unbiased estimator of the population standard deviation.
When X is normally distributed, an unbiased estimator of the standard deviation is given by (in R language):
C_n = gamma( (n - 1) / 2 ) * sqrt( (n - 1) / 2) / gamma(n / 2)
sigma_est = C_n * sd(x) # unbiased estimator of sigma
# see Ben W. Bolch, "More on unbiased estimation of the standard deviation", The American Statistician, 22(3), p. 27 (1968)
Jorge Ortiz-Pinilla
Dear Professor Juan Jose Egozcue,
The standard deviation is a kind of measurement on a data series which looks at how far each data element is from the average of the series. Hence, one average point (a central point) is introduced and, therefore, we consider (n-1) data points excluding this central point.
Professor Afaq Ahmad: the average of a data set requires dividing by N; this implies you have N intervals. Then you use the X values as limits of N-1 intervals and adopt the premise that each interval has its mean centered, so you need to define each interval again as the average of its left and right limits, and transform the starting N means into N right and left limits of N-1 intervals. What is the support for the "mean centered" premise? I fear this is not coherent; it is only convenient from a descriptive-statistics perspective. Thanks and respectful regards, Emilio
I recall that dividing by (n+1) gives the smallest mean squared error for a normal sample. The unbiased estimator divides by (n-1), while the MLE divides by n, as has been stated above by others.
We have a choice of estimators here. Sometimes, we should favor a biased estimator when minimum variance is desired.
The question of Juan José is about the estimation of the standard deviation, but the answers are about the estimation of the variance.
It is true that for large samples the results are similar for the square root of the sample variance and for estimators of the form the square root of
sum(x - mean(x))^2 / (n - K)
where K is any fixed finite number: it can be 0, 1, 2, ..., 8, -5, -2, etc.
However, in my opinion we should look for an estimator with good properties, starting with unbiasedness. This must hold for a fixed sample size.
The unbiased estimator of the standard deviation is the one I mentioned in my previous answer:
When X is normally distributed, an unbiased estimator of the standard deviation is given by (in R language):
C_n = gamma ((n - 1) / 2) * sqrt ((n - 1) / 2) / gamma (n / 2)
sigma_est = C_n * sd (x) # unbiased estimator of sigma
You can find this solution in the following reference:
Ben W. Bolch, "More on unbiased estimation of the standard deviation", The American Statistician, 22 (3), p. 27 (1968)
If X is not normally distributed, the unbiased estimator of the standard deviation must be recalculated.
Jorge Ortiz-Pinilla
Thank you Jorge Ortiz-Pinilla. Your reference, Ben W. Bolch, "More on unbiased estimation of the standard deviation", The American Statistician, 22(3), p. 27 (1968), is something I was looking for!
Actually, n-1 equals the degrees of freedom. The standard deviation is essentially an average of the deviations from the mean. For instance, if an average or mean is calculated from 5 values, then the first 4 values can actually be any numbers, but the last number has no freedom, because it has to make the total add up to the number that, when divided by 5, gives the average.
The concept is a little twisted, but I hope you will understand.
An interesting answer in a similar forum; see the link:
http://stats.stackexchange.com/questions/3931/intuitive-explanation-for-dividing-in-n-1-when-calculating-sd
Dear Peter Mwangi,
Thank you for your suggestion. I had a look at that forum: interesting!
However, the comments there forget that the question was about estimation of the standard deviation, not estimation of the variance. The point is that dividing by n-1 makes the variance estimator unbiased; however, after taking the square root to estimate the standard deviation, the estimator becomes biased (Jensen's inequality). Therefore, the reasons for keeping the divisor n-1 are, at least, obscure.
There were several comments above, some of them quite sound, but anyway inconclusive.
This video will answer your question in detail.
https://www.youtube.com/watch?v=xslIhnquFoE
Dear Sahil
Your video explains very well the necessity of using n-1 for estimating the population variance without bias, but Juan José knows that very well. He is asking about the use of n-1 in the denominator of S, the estimator of the standard deviation of the population (\sigma).
Dear Juan José,
The main reason is to obtain an unbiased estimator of the population variance, which is also optimal in a distribution-free setting (Ruiz Espejo et al. (2013). Optimal unbiased estimation of some population central moments. Metron 71, 39-62; available in my Contributions on this RG portal).
Another reason is that, in obtaining this estimator (the sample quasi-variance), we have n-1 degrees of freedom to estimate the variability in the simple random sample, since the deviations are taken with respect to the sample mean, which is itself obtained from the n observations of the sample. However, the square root has a bias effect: it underestimates the standard deviation \sigma. But this underestimation is of little importance asymptotically if the fourth population central moment exists and is finite: when n tends to infinity in simple random sampling (with replacement), our estimator converges in probability to \sigma. This is the case for finite populations and many infinite populations, for example Normal populations. See also (article available on RG):
Ruiz Espejo, Mariano (2015). Sobre estimación insesgada óptima del cuarto momento central poblacional. Estadística Española 57 (188), 287-290.
Thank you for your question.
A perfect question with dozens of incoherent answers. You may wish to calculate "deviations" with a new approach:
Article TO DETERMINE SKEWNESS, MEAN AND DEVIATION WITH A NEW APPROAC...
The answer given in the link below makes sense, and one can find the same analytically.
http://www.dspguide.com/ch2/3.htm
For the variance, use (n-1) irrespective of the population distribution; but for the standard deviation the exact expression becomes too complicated, so one can safely use (n-1.5) for n >= 3 and a normal-like distribution, or, with less error, (n - 1.5 - excess kurtosis/4).
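A small R sketch of that rule of thumb (the data below are placeholders; note that the excess kurtosis is crudely estimated from the sample, which is itself unreliable for small n):
x  <- rnorm(30)                            # placeholder data
n  <- length(x)
ss <- sum((x - mean(x))^2)
ek <- n * sum((x - mean(x))^4) / ss^2 - 3  # rough excess-kurtosis estimate
sqrt(ss / (n - 1.5))                       # shortcut for normal-like data
sqrt(ss / (n - 1.5 - ek / 4))              # kurtosis-adjusted divisor from the link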
Dear Juan José:
For simplicity, imagine you have a very small sample whose elements are 1,2 and 3.
The sample mean equals 2.
Now, imagine that I tell you that a sample containing 3 observations, which are integer values, has a mean of 2. Then, if in addition I tell you any 2 of the 3 sample values, you can immediately determine the third. Right?
In practice, when you use the sample mean value to calculate the variance (and standard deviation) you lose a degree of freedom.
Everything goes as if you had a smaller sample.
This is why you do not take (n-1) when you deal with the whole population.
suppose the random sample be x1, x2,...xn drawn from a population of N(n
Dear Satish Bhat (and many others in the answers to this question),
I agree that dividing by n-1 in the estimator of the variance is for attaining unbiasedness when estimating the variance. The problem is that, when estimating the standard deviation by taking the square root, the estimator of the standard deviation is no longer unbiased, due to Jensen's inequality. As the only reason for dividing by n-1 was unbiasedness, there is no reason to keep dividing by n-1 in the estimator of the standard deviation.
I kindly invite you all to review the following paper dated 1968:
Ben W. Bolch, "More on unbiased estimation of the standard deviation", The American Statistician, 22 (3), p. 27 (1968)
Briefly,
If X ~ N(mu, sigma^2) (X follows a normal distribution with mean mu and standard deviation sigma), and if we have a random sample of size n from this distribution, then an unbiased estimator of the standard deviation, sigma, is given by:
S = Cn * sqrt(var(X1, X2, ..., Xn))
where
var(X1, X2, ..., Xn) = sum( (Xi - Xb)^2 ) / (n - 1) (sample variance of X1, X2, ..., Xn)
Xb = sum(Xi) / n (sample mean of X1, X2, ..., Xn)
Cn = gamma ((n - 1) / 2) * sqrt ((n - 1) / 2) / gamma (n / 2)
sqrt is the square root function
and gamma is the known Gamma function which generalizes the factorial function
The following code in R language gives an example:
x = rnorm(20, 50, 5) # a (pseudo) random sample from N(50, 25)
n = length(x)
Cn = gamma ((n - 1) / 2) * sqrt ((n - 1) / 2) / gamma (n / 2)
sigma_est = Cn * sd (x) # unbiased estimator of sigma
Essentially, this is the same answer I gave four years ago.
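A quick simulation check of the correction (a sketch; n, mu and sigma are arbitrary illustration values):
set.seed(5)
n <- 8
Cn <- gamma((n - 1) / 2) * sqrt((n - 1) / 2) / gamma(n / 2)
mean(replicate(100000, Cn * sd(rnorm(n, 50, 5))))   # close to sigma = 5
mean(replicate(100000, sd(rnorm(n, 50, 5))))        # noticeably below 5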