When adding error bars to a simple bar chart of mean values, what are the statistical rules for which error to report? I suppose the correct statistical test makes this irrelevant, but it would still be good to know what to present in graphs.
Very good advice above, but it leaves the essence of the question untouched. The CI is absolutely preferable to the SE, but both have the same basic meaning: the interval mean ± SE is just a roughly 68%-CI (somewhat less for small samples). The SD, in contrast, has a different meaning. I suppose the question is about which "meaning" should be presented.
The SD is a property of the variable. It gives an impression of the range over which the values scatter (the dispersion of the data). When this is what matters, show the SD.
The SE/CI is a property of the estimate (for instance, the mean). The (frequentist) interpretation is that the stated proportion of such intervals will include the "true" parameter value; only 5% of 95%-CIs will miss it. If you want to show the precision of the estimate, then show the CI.
However, there is still a point to consider: often the estimates themselves, for instance the group means, are not of particular interest. Rather, the differences between these means are the main subject of the investigation. Such differences (effects) are also estimates, and they have their own SEs and CIs. Thus, showing the SEs or CIs of the groups indicates a measure of precision that is not relevant to the research question. The important thing to show here would be the differences/effects with their corresponding CIs. Unfortunately, this is rarely done.
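To make the distinction concrete, here is a minimal sketch (Python with NumPy/SciPy, made-up data and group names) that computes each group's SD, SE and 95%-CI, and then the difference between the group means with its own 95%-CI using a Welch approximation. It is only an illustration of the quantities discussed above, not a recipe for any particular dataset.

```python
# Per-group SD, SE and 95%-CI, plus the difference between two group means
# with its own 95%-CI (Welch approximation). All numbers are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(10.0, 2.0, size=20)   # group A, n = 20
b = rng.normal(12.0, 2.5, size=25)   # group B, n = 25

def describe(x, level=0.95):
    n, m = len(x), x.mean()
    sd = x.std(ddof=1)               # dispersion of the data
    se = sd / np.sqrt(n)             # precision of the mean estimate
    t = stats.t.ppf(0.5 + level / 2, df=n - 1)
    return m, sd, se, (m - t * se, m + t * se)

for name, x in [("A", a), ("B", b)]:
    m, sd, se, ci = describe(x)
    print(f"group {name}: mean={m:.2f}  SD={sd:.2f}  SE={se:.2f}  "
          f"95%-CI={ci[0]:.2f}..{ci[1]:.2f}")

# The effect of interest: the difference of means with its own SE and 95%-CI.
diff = b.mean() - a.mean()
se_diff = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
# Welch-Satterthwaite degrees of freedom
df = se_diff**4 / ((a.var(ddof=1) / len(a))**2 / (len(a) - 1)
                   + (b.var(ddof=1) / len(b))**2 / (len(b) - 1))
t = stats.t.ppf(0.975, df=df)
print(f"difference B-A = {diff:.2f}, "
      f"95%-CI = {diff - t * se_diff:.2f}..{diff + t * se_diff:.2f}")
```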
Dear Darren, in a bar chart for comparing means, it is the difference between groups that the confidence interval really speaks to. Besides, a confidence interval is built from the standard error multiplied by the Student-t quantile for the given degrees of freedom and alpha level (CI = mean ± t(df, alpha) × SE). The difference between the standard error and the standard deviation is just a factor of sqrt(n); in other words, the standard error is obtained by dividing the standard deviation by the square root of the sample size in each group.
So the difference is not of vital importance; however, showing the standard deviation is more common in charts.
The aim of doing this is to show the difference in variance between the groups. Therefore, if some error bars are much longer than others, your data suffer from heterogeneity of variance.
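Since the question is ultimately about what to draw, here is a minimal sketch (Python/matplotlib, made-up data and group names) that draws the same bar chart twice, once with SD and once with SE error bars, so the difference in bar length, and the visual check for unequal variances, is easy to see. Treat it as an illustration under these assumptions, not a recommendation for any particular dataset.

```python
# The same bar chart with SD error bars (dispersion of the data) and with SE
# error bars (precision of the means). Comparing SD bar lengths across groups
# also gives a quick visual check for heterogeneity of variance.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
groups = {"control": rng.normal(10, 1.5, 20),
          "treated": rng.normal(12, 3.0, 20)}

labels = list(groups)
means = [groups[g].mean() for g in labels]
sds   = [groups[g].std(ddof=1) for g in labels]
ses   = [sd / np.sqrt(len(groups[g])) for sd, g in zip(sds, labels)]

fig, axes = plt.subplots(1, 2, sharey=True, figsize=(7, 3))
for ax, err, title in zip(axes, (sds, ses), ("mean ± SD", "mean ± SE")):
    ax.bar(labels, means, yerr=err, capsize=5)
    ax.set_title(title)
axes[0].set_ylabel("response")
fig.suptitle("State in the caption which error is shown")
plt.tight_layout()
plt.show()
```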
If you want to characterize the *population*, you should show the standard deviation, or better, twice the standard deviation: this range covers roughly 95% of the values one can expect in the population.
If you want to characterize the precision of the study, or the certainty/uncertainty of the estimate of the mean in your study, you should use the SEM or a confidence interval (CI). I prefer the 95%-CI because it is directly linked to p-values at the 5% level. The SEM bar is roughly half as long as the 95%-CI and is often "misused" to get the smallest possible error bars. So when I see graphs of mean ± SE in a clinical paper I am always sceptical, and unfortunately I am often right... You can mask very small (and not relevant) study effects by showing mean ± SEM.
If the study effect refers to a difference, you should show the estimate of the difference with its 95%-CI.
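To see how different these intervals look in practice, here is a minimal sketch (Python, simulated data with arbitrary parameters) comparing mean ± 2·SD, mean ± SEM and the 95%-CI for the same sample; the numbers are purely illustrative.

```python
# Contrast of the three intervals mentioned above: mean ± 2·SD (covers roughly
# 95% of individual values), mean ± SEM, and the 95%-CI (whose half-width is
# roughly 2·SEM for moderate n).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(100, 15, size=50)          # e.g. 50 measurements

n, m = len(x), x.mean()
sd = x.std(ddof=1)
sem = sd / np.sqrt(n)
half_ci = stats.t.ppf(0.975, df=n - 1) * sem

print(f"mean ± 2·SD : {m - 2*sd:6.1f} .. {m + 2*sd:6.1f}  (range of individual values)")
print(f"mean ± SEM  : {m - sem:6.1f} .. {m + sem:6.1f}  (narrowest-looking bars)")
print(f"95%-CI      : {m - half_ci:6.1f} .. {m + half_ci:6.1f}  (about mean ± 2·SEM)")
print(f"share of data inside mean ± 2·SD: {np.mean(np.abs(x - m) < 2*sd):.0%}")
```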
Thanks for asking, and very good answers above. Both SD and SE are first principles in statistics, but they are very often confused in publications. As the SD is a measure of the dispersion of the data, it gives an idea of the variability in the sampled population; given a large enough sample size, it reflects the natural situation. In contrast, since the SE depends on the sample size n (SE = SD/sqrt(n)), a larger sample will reduce the SE of the estimate. Therefore, the SE is a measure of the uncertainty of the estimate.
In tables, people usually indicate whether the SE or the SD is being reported (after the ± sign), but very often this is not stated in figure legends. For instance, we can draw ellipses in a PCA biplot using either the SE or the SD, and this should be stated in the caption. The same applies to any other case.
So whether to include the SD or the SE depends on what you want to show: how uncertain the estimates are, or how dispersed the values are in the sampled population. But we should never leave the reader wondering whether we report the SD or the SE.
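The dependence on n is easy to demonstrate. The following minimal sketch (simulated normal data, arbitrary parameters) shows the SD settling near the population value while the SE keeps shrinking as the sample grows.

```python
# As the sample size grows, the SD stabilises around the population value
# (10 by construction) while the SE of the mean keeps shrinking.
import numpy as np

rng = np.random.default_rng(7)
for n in (10, 100, 1000, 10000):
    x = rng.normal(50, 10, size=n)
    sd = x.std(ddof=1)
    se = sd / np.sqrt(n)
    print(f"n={n:>5}  SD={sd:5.2f}  SE={se:5.3f}")
```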