What does the standard deviation (SD) tell us about a non-normal distribution? Is it scientifically sound to use the SD to report non-normally distributed data? If not, what can we do instead?
Daryoush, (updated to answer more of your original question, 28-11-2016)
Standard Deviation can be calculated for any distribution if you have the observations. (As mentioned above, there are other measures, each useful in at least some circumstances.)
[START OF UPDATE] What it tells you, even for a non-normal distribution, is the square root of the average squared distance of the data points from the mean of the distribution - in effect, a typical distance of a data point from the mean.
Note, in particular, that the SD of a distribution is in the same units as the distribution's variable, and can be viewed as a line-segment along the x-axis of the distribution.
You can also calculate the variance of the data (by not taking the square root in the expression for the SD). But this will be in units that are the square of the variable's units - sometimes a bit strange to interpret. (The advantage of the variance is that you can add and subtract variances, allowing one to analyse how the variance is spread among different "degrees of freedom" - but you'd need to look that up e.g. via 2nd link below.) By comparison, because of the square root sign for the SD, one cannot analyse the SD in the same way. [END OF UPDATE]
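As a minimal sketch (the data values here are made up, purely for illustration), both quantities can be computed directly from any set of observations, normal or not:

```python
import math

# Hypothetical observations from some skewed, non-normal distribution
observations = [2.1, 3.4, 2.8, 9.7, 4.0, 3.3, 5.6]

n = len(observations)
mean = sum(observations) / n

# Variance: the average of the squared deviations from the mean
# (population form, divisor n - see the note on n vs n-1 below)
variance = sum((x - mean) ** 2 for x in observations) / n

# SD: the square root of the variance, so it is back in the
# same units as the observations themselves
sd = math.sqrt(variance)

print(f"mean = {mean:.3f}, variance = {variance:.3f}, SD = {sd:.3f}")
```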
But, if you're not already aware of it, there is an essential distinction between the SD of a whole population and that of a limited sample (drawn from a whole, but possibly unobservable in practice, population).
The difference arises because, in the case of sample data, your calculated mean is only a sample mean - an estimate of the population mean that would be used in calculating a population SD.*
For a population SD, the divisor under the square root sign is the number of observations, 'n' (see formulas in 1st link below).
To compensate for the potential error when working with sample data, giving only a sample mean, one reduces the n by one, and so uses (n-1) as the divisor. This compensates for the uncertainty introduced, unavoidably, by having to use a sample mean, and it yields the so-called "unbiased" estimate (strictly, unbiased for the variance rather than the SD itself). (The larger the sample size, the smaller the correction.)
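In most statistical software this is simply a choice of divisor. For example, a sketch using NumPy, where the `ddof` argument sets the divisor to n - ddof (0 for the population form, 1 for the sample form); the data are hypothetical:

```python
import numpy as np

data = np.array([2.1, 3.4, 2.8, 9.7, 4.0, 3.3, 5.6])  # hypothetical sample

# Population SD: divisor n (ddof=0 is NumPy's default)
sd_population = data.std(ddof=0)

# Sample SD: divisor n-1 (ddof=1), compensating for the fact that
# the sample mean is only an estimate of the population mean
sd_sample = data.std(ddof=1)

print(sd_population, sd_sample)  # the sample SD is slightly larger
```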
You can find more on this on the web, e.g. link below.
Hoping this helps - Paul
Footnote, re asterisked point above:
* This is where a relationship to the normal distribution creeps in - even for non-normal distributions - in terms of the distribution of the sample mean itself (as outlined by Alexander above). Compare the following distributions on three levels, e.g. sketched one above the other:
a) the distribution of a population, with a population mean and SD
b) the similar (but, crucially, not identical) distribution of a sample drawn from a population: necessarily less widely distributed along the x-axis, and with a sample mean that's only an approximation to the population mean; and of course its sample SD also, based on this same sample mean
c) the - quite different - distribution of the means of samples drawn from a population - far narrower and tending toward a normal distribution (regardless of the structure of the original population or sample distributions) as the sample sizes increase. For a given sample size, imagine sketching many other sample distributions over (b) above, and marking all their sample means on the x-axis; they will all fall some way above or below the true population mean, and it's the distribution of those points that constitutes (c).
If you consider that a sample mean is about as likely to fall slightly above the true population mean as slightly below it, you can get a sense of how the normal distribution gets into the picture, as an approximation to the distribution of the sample means.
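To make the three levels concrete, here is a small simulation sketch (the exponential population, sample size and number of samples are arbitrary choices for illustration): draw many samples from a clearly skewed population and look at the spread of their means.

```python
import numpy as np

rng = np.random.default_rng(0)

# (a) A clearly non-normal (right-skewed) "population": exponential, mean ~ 1
population = rng.exponential(scale=1.0, size=100_000)

sample_size = 50
n_samples = 2_000

# (b) One sample drawn from the population, and its sample mean
one_sample = rng.choice(population, size=sample_size)
print("one sample mean:", one_sample.mean())

# (c) The distribution of many sample means - far narrower than the
# population, and close to normal despite the skewed population
sample_means = np.array([
    rng.choice(population, size=sample_size).mean()
    for _ in range(n_samples)
])
print("mean of sample means:", sample_means.mean())       # close to the population mean
print("SD of sample means:  ", sample_means.std(ddof=1))  # roughly population SD / sqrt(sample_size)
```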
I'll stop here, before going more into "Student's t-distribution".
I've spent years trying to get the above into the heads of some students, so forgive me for laying out the way I've found to be most effective in getting across to them the distinction between the distribution of a sample and the distribution of the related sample means.
The SD should be reported, because it is a measure of the variability of your sample, and a priori you do not know how your data are distributed; in addition, when your data follow a distribution other than the normal, you should additionally report the median and IQR of your data.
The normal distribution is defined by just its mean and standard deviation, with all its higher cumulants being zero. As long as those higher cumulants aren't too far from zero (and they often aren't, as a result of the central limit theorem), we can approximate other distributions using the normal distribution. Most of the time you see standard deviations being used for non-normal distributions, there is an underlying normal approximation being used.
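As a rough illustration of checking that assumption (with hypothetical gamma-distributed data; `scipy.stats.skew` and `scipy.stats.kurtosis` report the standardized third and fourth cumulants, both zero for a normal distribution):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.gamma(shape=5.0, scale=1.0, size=10_000)  # hypothetical, mildly skewed data

print("skewness:       ", stats.skew(data))      # 0 for a normal distribution
print("excess kurtosis:", stats.kurtosis(data))   # 0 for a normal distribution
# If both are close to zero, summarizing the data by mean and SD
# (i.e. by an implicit normal approximation) is usually reasonable.
```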
An alternative is to use the interquartile range (IQR). For normally distributed data the IQR carries essentially the same information as the SD (it is roughly 1.35 times the SD), but for non-normal data the two can diverge considerably. The IQR better represents the dispersion of the data around the middle value in the presence of skewness, because it is not inflated by a long tail.
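A small sketch of the contrast (using hypothetical lognormal data, not data from the question):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical right-skewed data, e.g. lognormal
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

q1, median, q3 = np.percentile(skewed, [25, 50, 75])
iqr = q3 - q1

print("mean:  ", skewed.mean(), " SD: ", skewed.std(ddof=1))
print("median:", median,        " IQR:", iqr)
# For skewed data the median and IQR describe the typical values and spread better;
# the mean and SD are pulled upward by the long right tail.
```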
According to ResearchGate's "Contact Us"/Community Support:
" The reads do not appear to be an error and they seem to be coming from legitimate sources, mostly via google searches for "standard deviation non-normal distribution". "
If it's this popular, perhaps it's serving a useful purpose?