You cannot perform a t-test on distributions like this (non-Gaussian, unequal variance, etc.), so you perform a Mann-Whitney U-test... But if the distributions overlap, i.e. have identical means/medians, they will show as not significantly different despite one being bimodal and the other unimodal! Can you just describe them as being obviously different?
In such a case a graph suffices to convince people, and one shouldn't belabor statistics. But if you really need a test, a Kolmogorov-Smirnov test would do the trick.
This is an interesting question. There will no doubt be countless answers here, but I thought I would give it some thought. A couple of options:
1. Decompose the bimodal distribution into the unimodal components. Then compare each unimodal distribution to the original (red line) distribution. Now that I write this, this seems awkward.
2. Compare folded distributions. This would tell you not the differences in the means (which you know to be the same) but rather how different they are in their variance. Fold each distribution around its mean and then compare the positive distributions. One will be strongly right-skewed and the other --derived from the bimodal distribution-- will appear like a normal distribution. I would compare with a modeling approach that allows specification of a Gamma error structure (positive continuous, so you need to add a very small amount to the zero values).
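A minimal sketch of the folding idea above, with simulated stand-in data (an assumption: these are not the poster's measurements). For simplicity it compares the folded samples with a two-sample K-S test rather than the Gamma-GLM approach suggested, which is a substitution on my part:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative stand-ins for the two groups (not the poster's data):
unimodal = rng.normal(0.0, 1.0, 250)
bimodal = np.concatenate([rng.normal(-1.0, 0.5, 125),
                          rng.normal(1.0, 0.5, 125)])

# Fold each sample around its own mean to get positive "spread" values
folded_uni = np.abs(unimodal - unimodal.mean())  # right-skewed, mass near 0
folded_bi = np.abs(bimodal - bimodal.mean())     # roughly bell-shaped near 1

# Compare the folded samples; a difference here reflects spread, not location
stat, p = stats.ks_2samp(folded_uni, folded_bi)
print(f"K-S on folded samples: D = {stat:.3f}, p = {p:.3g}")
```

Because the two groups have the same centre, the difference shows up only after folding, which is the point of the trick.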
OK, this was my really awkward contribution of the day. No doubt there's a simpler approach than this...this is an interesting question, and I think this deserves some attention, even if it's only to get everyone thinking about distributions of their data.
The question is an artifact of a bad method. It has no answer as articulated. The question should be either: are the means of two samples of different shapes statistically different? [They are if z > critical z, irrespective of shape.] Or, are the probabilities different between the two distributions?
Remember that frequentist statisticians have a little trick for converting any shape into a normal one: take lots of samples, and observe that the distribution of sample means is approximately normal (the central limit theorem).
The last question is answered by a Bayesian analysis.
As J Michael Menke has pointed out, the question needs articulation. When you know the shapes to be obviously different, you would be better off with a Q-Q plot and a K-S test.
This looks like a mixture of Gaussians: you can consider it as three normally distributed groups and use a parametric approach to compare these groups! However, check whether the Gaussian assumptions hold. If the distributions are exactly as in the graph you attached, I suspect they do!
HTH
Michael
Try a statistic based on densities, for instance the AC statistic, or otherwise one based on an Lp measure (the L1 is usually the most powerful). See Martínez-Camblor P & de Uña Álvarez J (2009), Non-parametric k-sample tests: density functions vs. distribution functions, Computational Statistics & Data Analysis, 53(9), 3344-3357.
In statistical tests such as the z- or t-test we are only comparing means relative to the spread (standard deviation). It would be best to convert the bimodal distribution into a normal one. If that's not working out, then try a non-parametric test. I agree with Michael Menke above on this one.
If the question is whether the two distributions are "different", I suggest the comparison be based on a "distance" measure between the two distributions such as, for example, the relative information of one given the other. This relative information is only equal to zero when the distributions are identical. As one of the respondents has already suggested, many distributions have the same mean; hence the mean ought not be used to answer the question of whether distributions are different.
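A hedged sketch of the "relative information" idea: the Kullback-Leibler divergence between histogram estimates of the two densities. The bin grid, smoothing constant, and simulated data are illustrative choices, not anything from the thread:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 1000)                    # unimodal sample
y = np.concatenate([rng.normal(-1.0, 0.5, 500),
                    rng.normal(1.0, 0.5, 500)])   # bimodal sample

# Bin both samples on a common grid
bins = np.linspace(-4, 4, 41)
p, _ = np.histogram(x, bins=bins)
q, _ = np.histogram(y, bins=bins)

# Add-half smoothing avoids zero cells, then normalise to probabilities
p = (p + 0.5) / (p + 0.5).sum()
q = (q + 0.5) / (q + 0.5).sum()

kl = stats.entropy(p, q)   # D_KL(p || q); zero only when p == q
print(f"Estimated KL divergence: {kl:.3f}")
```

Note that KL divergence is asymmetric and is a divergence rather than a true metric; a symmetrised version or another distance could be used the same way.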
I would suggest further thought to the underlying cause of the bimodality.
Is there an unidentified covariate?
Is the dependent variable a "read-out" of a binary distribution?
There are multitudes of phenomena that could lead to such distributions. The problem as stated simply lacks the context one would need to assess the merit of any approach. Any of these could be a better or worse approximation for the hypothesis(es) most appropriate for querying your theoretical model.
That being said, in the absence of any other information, I would tend toward the approach suggested wherein you extract the parameters from the component univariate distributions of the bimodal distribution and compare the three.
...but how to interpret?
With real-world context we might decide the folding method suggested was superior...
Maybe the unimodal distribution IS composed of two modes as well, but they're so close together as to obscure seeing it. If so, perhaps the hypothesis test would be the "interaction," wherein the distances between the derived unimodal components are compared between groups. The apparently unimodal distribution may have a very small distance between its components, while the bimodal one would have a larger distance. Say we were testing air flow around an object, as measured by what happens behind the object: the unimodal distribution may represent the air flow re-converging on the other side, while in the bimodal case the flow does not converge as much. In that case, the individual comparisons to the middle may not differ, but that would underestimate the real-world effect.
This kind of problem can be attacked using mixture distributions, possibly testing for the significance of the components.
See e.g.:
McLachlan, G. J. & Basford, K. E. (1988). Mixture Models: Inference and Applications to Clustering.
Titterington, D. M., Smith, A. F. & Makov, U. E. (1985). Statistical Analysis of Finite Mixture Distributions.
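A minimal sketch of the mixture-distribution route: fit one- and two-component Gaussian mixtures and compare by BIC. scikit-learn's GaussianMixture is an assumed tool choice (the references above develop the formal inference), and the data are simulated for illustration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Illustrative clearly-bimodal sample (not the poster's data)
data = np.concatenate([rng.normal(-1.5, 0.5, 250),
                       rng.normal(1.5, 0.5, 250)]).reshape(-1, 1)

bic = {}
for k in (1, 2):
    gm = GaussianMixture(n_components=k, random_state=0).fit(data)
    bic[k] = gm.bic(data)  # lower BIC = better penalised fit
    print(f"{k} component(s): BIC = {bic[k]:.1f}")
```

A clearly lower BIC for the two-component fit supports describing the sample as bimodal; formal significance testing of the number of components is subtler, as the books above discuss.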
I agree with the idea of presenting a graph of the data. I think the reader should be convinced that the two distributions are different. I would be careful regarding the use of central tendency measures in the case of the bimodal distribution: in order to use something like this to describe each "sub-distribution", you have to make an assumption regarding how to assign observations to one or the other mode. I wonder whether there is another (collected or yet to be collected) variable that would explain the apparent multi-modality suggested by the bimodal distribution. There may also be more than one group within the unimodal group, just not made obvious by the graph.
You must decide what you want to compare. If you use tests that operate on sample means (testing hypotheses about differences in theoretical mean values via the t-test) or on ranks/sample medians (non-parametric testing of the identity of medians via Mann-Whitney), you cannot be satisfied. If you want to compare the overall distributions, use the two-sample Kolmogorov-Smirnov test, as some of our colleagues have recommended above.
That is why we usually use two or more alternative tests in our everyday biomedical statistical practice (when possible). You can argue that this is not a single neat methodological solution, but we usually have no time for a more detailed analysis of the distributions in samples with many variables (in clinical research).
In your example you know very well what kinds of distributions you are comparing: one of your theoretical distributions has two different generating sources, without doubt. Wouldn't it be better to separate them from each other and then test each against the standard unimodal distribution?
If you have the observations, you may also draw the cumulative distribution functions, which will likely show a difference in the middle; then use the K-S test. Many researchers have already provided good suggestions for you to consider, so I need not add much.
Thank you all for your answers. However they raise new questions!
In my actual study I have five volunteers, each with a before and an after population containing n = 250 independent measures. So essentially I have five two-sample distributions, repeated for multiple parameters. Not all of the graphs display a bimodal and a unimodal distribution together: mostly both distributions are unimodal, and I have chosen to use the median as the descriptive and an MWU test (the data are often skewed with unequal variance).
So for those within the 5 which show a bimodal > unimodal shift (before > after), can you switch tests? I.e. "An MWU was performed on these 4, but I used a completely different test for this 1 (the K-S)".
This then begs the question: what is the cut-off for a bimodal distribution? I.e. how pronounced do the two modes have to be before you can call it bimodal?
I have read a lot about this, and my understanding is that (as with normality tests) little can be concluded from such a test, and there is no test that will explicitly say 'bimodal!' or 'unimodal!'
I initially agreed with many of you in that (to quote Tsung Fei Khang) 'one shouldn't belabor statistics' and that presenting my data as a graph is enough. But as I near the end of my PhD I am worried that this argument of basically assuming they're different by eye will not hold in my viva!!
Thanks once again for all your answers!
Libby.
Let me understand: do you have 5 subjects and are asking 250 questions? If so, then I think we are really talking about a very small sample in a very large number of dimensions. My sense is that you probably need more data, and I would also start thinking about dimension reduction.
I am trying to characterise a complex protein. I measure distances and angles between parts of the protein and the mass of parts of the protein. I measure these parameters 250 times to get a solid characterisation. I should really have measured n=500 or n=1000, but 250 was all I got in the time period.
If we look at the graph, two populations are being compared: one normal and the other bimodal. The bimodal population contains multiplicative variance. What form does the bimodal distribution take? Maybe binomial, Poisson, or another suitable distribution. If the distribution is known, the comparison can be completed using generalized linear models, which use a canonical link function to connect the normal distribution function with the non-normal one.
250 should be a sufficient sample size. You first need to decide on your objective: do you want to test (1) whether there is a change in the central location of the distributions, or (2) whether there is a change in the distributions themselves? Basically, do you consider the change important if the median does not change? With a clear objective, you would use the t-test or MWU for (1) and the K-S test for (2). And you would use the same test for all five individuals.
As Tsung said, a Kolmogorov-Smirnov test seems to do the trick if the samples are large enough.
Under R, we have the following example:
1) Generate 100 standard Gaussian observations:
x1 = rnorm(100)
2) Generate 100 observations from a mixture of normal distributions centred on -1 and 1, with standard deviation 0.5 for each component:
x2 = c(rnorm(50, -1, 0.5), rnorm(50, 1, 0.5))
3) t-test result with the "t.test" function:
t = -1.0485, p-value = 0.2957
4) Wilcoxon rank-sum test result with the "wilcox.test" function:
W = 4603, p-value = 0.3326
5) Two-sample Kolmogorov-Smirnov test result with the "ks.test" function:
D = 0.24, p-value = 0.006302
Thanks Dwight, I like that quote! I don't think my examiners are grumpy ole profs (at least, I chose them because they're not!) but generally statistics in the biological sciences is very outdated and rigid. People want "if x is y then do z, if not do zz," and the more I read, the clearer it is that this just isn't the case... ever!
I've found a lot of interesting discussion points in 'Intuitive Biostatistics, Motulsky', with updated versions online; http://cdn.graphpad.com/docs/prism/6/Prism-6-Statistics-Guide.pdf
I think I have the gist of what I need to do now, but I need to make sure I can defend myself in my viva! Trying to do this firstly by writing a detailed 'Statistical Methods' section in my materials/methods of thesis - but as a non-statistician, this is hard!
Thanks everyone for the comments!
A suitable approach could be to normalize the distributions before comparing them.
Good Luck!
Yeah, I agree with the original comment. If you want to do some sort of t-test or equivalent, then you are looking to test a hypothesis or apply some sort of decision theory. Given the graph you have, there is just too much density in common between the two distributions, so you won't get any statistically significant result over the entire distribution.
In this case, I would think about reframing how I am testing my hypothesis, or reparameterizing. You really want to be clear about whether the whole distribution is important, or only certain parts of its support. If you really want a mathematical test, you could look at confidence intervals for particular quantiles of the data. Looking at the chart above, you will find a lot of overlap in the CDFs at the first quintile, but for the second, third, and fourth quintiles the CDF of the bimodal distribution will be significantly less than that of the Gaussian. That would be one way to demonstrate that the data are definitely not unimodal like a Gaussian or t distribution.
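A hedged sketch of the quantile-comparison idea: percentile-bootstrap confidence intervals for the 75th percentile of each sample. The quantile, bootstrap size, and simulated data are illustrative assumptions, not anything from the thread:

```python
import numpy as np

rng = np.random.default_rng(3)
# Illustrative equal-centre samples (not the poster's data)
unimodal = rng.normal(0.0, 1.0, 250)
bimodal = np.concatenate([rng.normal(-2.0, 0.5, 125),
                          rng.normal(2.0, 0.5, 125)])

def boot_ci(sample, q=75, n_boot=2000, rng=rng):
    """95% percentile-bootstrap CI for the q-th percentile of `sample`."""
    est = [np.percentile(rng.choice(sample, size=sample.size, replace=True), q)
           for _ in range(n_boot)]
    return np.percentile(est, [2.5, 97.5])

lo_u, hi_u = boot_ci(unimodal)
lo_b, hi_b = boot_ci(bimodal)
print(f"unimodal 75th pct CI: ({lo_u:.2f}, {hi_u:.2f})")
print(f"bimodal  75th pct CI: ({lo_b:.2f}, {hi_b:.2f})")
```

Non-overlapping intervals at a quantile away from the centre indicate the distributions differ there, even though their medians coincide.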
You can compare the fits using a likelihood-ratio test, or find a better model via goodness of fit using the AIC criterion.
Generalized linear models also involve parameter estimation based on maximum likelihood (ML) or restricted maximum likelihood (REML). In breeding it is more common to use this approach, called Best Linear Unbiased Prediction (BLUP). Generalized linear mixed models assume the genotype is random. In a crossed population, the population is composed of sub-populations: some are homogeneous (unimodal/normal), and some are still heterogeneous (bimodal, trimodal, etc.). That situation is no different from the issue being discussed here. If the two populations being compared are fixed, this approach can still be used, because in generalized linear models of this kind the error term still behaves randomly.
All these answers are good; the graphical characterization should be sufficient if the difference is "obvious" enough (with n = 250, it's unlikely you would see bimodality as an artifact of sampling variation). Kolmogorov-Smirnov should clinch it if a hypothesis test is required to clear an academic hurdle. One thing you haven't touched on is *why* your second sample has a bimodal distribution. Typically one would think this reflects the fact that the sample is from a population with two subpopulations, with differing means. This in turn might be a consequence of some scientific hypothesis--if so, I imagine it might be more meaningful to a biological audience than the result of a statistical test, or a mere empirical observation that the plots were different.
One possibility is to compare the quantiles. For recent results on this approach, see
Wilcox, R. R., Erceg-Hurn, D., Clark, F. & Carlson, M. (2013). Comparing two independent groups via the lower and upper quantiles. Journal of Statistical Computation and Simulation. DOI: 10.1080/00949655.2012.754026
You can use the R function qcomhd, which is in the R package stored on my web page: Dornsife.usc.edu/cf/labs/wilcox/wilcox-faculty-display.cfm
The Mann-Whitney test is based on an estimate of P(X < Y).
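A quick simulation (with illustrative data, not the poster's) shows why an estimate of P(X < Y) cannot separate these two equal-centre shapes: by symmetry it sits near 0.5 whether or not one sample is bimodal.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, 1000)                    # unimodal sample
y = np.concatenate([rng.normal(-1.0, 0.5, 500),
                    rng.normal(1.0, 0.5, 500)])   # bimodal sample

# Proportion of all (x_i, y_j) pairs with x_i < y_j:
# the quantity the Mann-Whitney U statistic estimates
p_xy = np.mean(x[:, None] < y[None, :])
print(f"Estimated P(X < Y) = {p_xy:.3f}")   # close to 0.5 by symmetry
```

With P(X < Y) near 0.5, the U test has essentially nothing to detect, which is exactly the original poster's problem.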
Others have essentially captured this indirectly, but simply put, conventional tests for a shift in mean (t-test, Kruskal-Wallis, etc.) require the assumption that the data are drawn from distributions with the same shape and variance; then a significant test statistic supports an inference of a shift in mean. Bimodality is just one example of a difference in shape or variance that precludes conventional two-sample tests.
Conversely, if both distributions were bimodal with the same shapes and equal variances, then the K-W test for a shift would be perfectly applicable, although those conditions seem remote.
When these tests are applied indiscriminately to situations with unequal variances or different shapes, a rejection simply implies that the distributions differ in shape, variance, or central tendency, or some combination of the three. In this situation the nominal alpha level is generally unknown, and is thought to be inflated more often than not.
Kolmogorov's test works, but EDA would work well too to show that you have two distinct distributions.
In my opinion a non-parametric approach is enough (Mann-Whitney as a particular case of the Kruskal-Wallis test). I have worked in the past with the age of onset of myasthenia (a bimodal distribution) using this simple approach.
You have to understand why there is this bimodal distribution, as David Collins answered before. For example: suppose you have the same mean between the two distributions, but a disease that occurs in a young population and in an old population. If you take the mean, the age affected by the disease will appear to be middle age.
It is important to understand what the aim of a test is. Welch's t-test could be applied to unequal-variance measurements to compare the means of two groups, but that is probably not what you are interested in. Wilcoxon's rank-sum test (= Mann-Whitney U test) can be applied to compare any two distributions for equality, but it detects only certain types of inequalities (and it is quite complex to understand which). I don't expect Wilcoxon's test to be able to detect a pure bimodality-vs-unimodality difference. The Kolmogorov test studies the maximum difference between the cumulative distributions, and that is probably what you are looking for, as the cumulative distributions for these groups are obviously wildly different.
Glenn Jones has aptly stated that the distributions shown are smooth, so we can assume large sample sizes generated these density functions. The bimodal distribution certainly has two peaks that are statistically distinguishable from one another. In contrast, the other distribution is unimodal and may even be Normal. The means may be the same, but the variances will not be, and that difference may be statistically significant. The bimodal plot is surely not Normal, so a t-test of the means is simply not an appropriate test.
It seems to me that there are several areas of confusion at the heart of this question.
First, if these are the true distributions then you don't need a statistical test to compare. They are obviously different. So I can only conclude that these are estimated (or empirical) distributions. If so then they may both be representations of the same underlying distribution. A K-S test would be a good way of determining this.
Secondly, you can use parametric tests on non-gaussian, heteroscedastic data, as long as the sample size is sufficiently large relative to the deviations of the data from the 'ideal'. Asymmetry is probably the biggest threat to many of these tests, even to non-parametric tests like M-W (which also assumes equal distributions by the way).
Thirdly, the tests you refer to (t-test, Mann Whitney) only test a difference of location. They do not address the question of whether the distributions are the same.
The difference here seems obvious. We should however bear in mind that the use of a statistical test is inextricably linked to the context and objective of the study. In some contexts one might be more interested in the cumulative distributions of the samples rather than the raw distributions; if that is the case in your study, then Kolmogorov-Smirnov can be of help. I will not dwell on other pairwise comparison tests because you seem more interested in comparing distributions. If at all you are interested in comparing the raw distributions, the Mann-Whitney U test can serve the purpose, because it tests the hypothesis that the two sampled populations are equivalent in location, and the difference will obviously be significant in your study context.
@John Duffy. Could you clarify your position about the ""silliness" of some of the answers? Do you have something constructive to add to this discussion?
If the data were mine (say an example from zoology), first of all I would present the graphic, as the best way to show the distributions are different, and add Kolmogorov-Smirnov's D or Mann-Whitney's U; in most packages they are under "nonparametric tests". This may save you from nasty questions.
But then I'd want to know whether the second sample can be decomposed into two parts (species, countries, etc.), which should be characterised as "distinct". So most of the statistical concern would be about the second, bimodal sample.
Now, you are working with a protein. By analogy with a zoological study of populations, the second graph would imply that the results come from two species/populations, so really you are presenting here not two, but three different samples.
@ J. Patrick Kelley Since the question is insufficiently specified the answers reflect speculation about what the questioner really wants to know. Stats isn't about recipes for significance tests. Owen Bodger's answer is not silly - I didn't say they all were....
@JohnDuffy. Nor did I claim that you thought all were silly. My apologies if there was some confusion on this matter. First, I couldn't agree more that "Stats isn't about recipes for significance tests." I doubt many here will argue with you. There were, in my opinion, very few silly ideas here. They may have offered only partial solutions, but they contributed several different approaches. To me, this suggests that most people don't view statistics as recipe boxes.
The questioner certainly misstated the primary question (rather than asking how to determine if these two distributions are significantly different...statistically). I assume this was your primary objection, and rightly so. However, she clarified her point in the underlying description, allowing contributors to clearly understand--without speculation--about her intent. This was evinced by the questioner's response ("I have the gist of what I need to do now..."), suggesting that the answers provided her with some fodder for thought. In any discussion, it's a good thing for the questioner to walk away with a set of answers on which to build more ideas.
Just like many others, I have offered a solution to the questioner's problem. Judging from the number of up-votes, this clearly was a bird-brained idea. But that's alright in my opinion. I thought about this interesting problem and at least contributed something (again, bird-brains!) to the institutional memory.
Perhaps you would like to offer a solution of your own to the original poster's question?
At the outset, I would like to apologize if I misunderstood the problem and the associated GREAT discussions. In the following let me try to better understand the problem before giving you the solution.
1. The General Statistical Problem:
Group 1: X1,…….,Xn i.i.d random sample from distribution F
Group 2: Y1,……..,Ym i.i.d random sample from distribution G
Where F, G can both be discrete or continuous. The goal is to compare the two distributions F and G (the two-sample problem) without assuming any parametric model before even looking at the data, i.e. completely nonparametrically. This problem arises in almost any field of science, yet it is unfortunately one of the most neglected topics in the standard textbooks. You can even judge the quality of an introductory statistics textbook by its two-sample chapter: most start with the t-test and end with Wilcoxon/Mann-Whitney, without answering the real question that all scientific investigators want to know.
2. What do we want to learn?
(a) First, we want to know whether H0 is true, that is, whether the two distributions F and G are the same.
(b) Second, if they are not the same, we would like to know HOW the two distributions differ. Is there a mean shift or contrasting variance? Which distributional characteristic differs between the two distributions?
(c) How do we construct a nonlinear statistical measure (in the form of component correlations) to quantify the distributional difference? Effect-size estimation.
3. Key concepts, which can be easily incorporated into introductory statistics courses: Mid-distribution transformation, comparison density, LP score function, LP moment and comoments.
4. Algorithm and Final Comments
(a) Pioneered by Emanuel Parzen (1979,1983) and further developed by Mukhopadhay and Parzen (2011,2012,2013a,2013b).
(b) Source: First five section of http://arxiv.org/abs/1112.3373
(c) Unified algorithm: our method unifies discrete & continuous; parametric and non-parametric; small and big data. Recent development: http://arxiv.org/pdf/1308.0641.pdf
(d) Third culture of statistics: Nonparametric Exploratory Information Theoretic Modeling (for the other two cultures, parametric confirmatory and algorithmic, see Leo Breiman 2001, "Statistical Modeling: The Two Cultures," and the discussion there by Parzen (2001)).
(e) Applicability: a few lines of R code can do all of this. If interested, let me know; I would be happy to share.
Thank you,
Best
Deep
The graph attached to the original question appears quite suggestive to me. One could use a likelihood ratio test comparing a normal to mixture between two normal distributions.
But can we do it similarly without relying too much on the shape?
And secondly, @Elizabeth: are there biological reasons to assume that both components of the mixture have
- equal size
- equal standard deviation
- standard deviation even equal to the SD of the other group, and finally
- have the same overall mean as the comparison group?
If so, a substantial gain in power can be expected compared to the Kolmogorov-Smirnov test.
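A hedged sketch of the likelihood-ratio idea from the preceding post: compare the log-likelihood of a single normal against a two-component normal mixture. scikit-learn's GaussianMixture is an assumed tool choice and the data are simulated. Note that the usual chi-square asymptotics fail at the mixture boundary, so a parametric bootstrap would be needed to calibrate the p-value; only the statistic is computed here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
# Illustrative clearly-bimodal sample (not the poster's data)
data = np.concatenate([rng.normal(-1.5, 0.5, 250),
                       rng.normal(1.5, 0.5, 250)]).reshape(-1, 1)

ll = {}
for k in (1, 2):
    gm = GaussianMixture(n_components=k, random_state=0).fit(data)
    ll[k] = gm.score(data) * data.shape[0]   # total log-likelihood

# Likelihood-ratio statistic for 2 components vs 1
lr = 2 * (ll[2] - ll[1])
print(f"Likelihood-ratio statistic: {lr:.1f}")
```

A large statistic favours the mixture; its null distribution should be obtained by simulating from the fitted single normal and refitting, rather than from a chi-square table.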
From the law of large numbers,
lim_{n -> infinity} P(|f_n(data) - f(reference)| > e) = 0; that is, for large n, the probability that the difference between the measured p.d.f. and a reference one is larger than some small value e goes to zero. Hence it makes sense to compare the two distributions when n is large enough.
Now suppose that your data are generated with large n. In that case, your bimodal distribution can be compared very well with a function that is a mixture of two unimodal functions.
Beyond the Kolmogorov-Smirnov test, I suggest also comparison in terms of first (second, ...) order stochastic dominance.
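A minimal check of first-order stochastic dominance on simulated data (illustrative, not the poster's): compare the empirical CDFs on a grid. If their difference changes sign, neither sample first-order dominates the other, which is exactly the situation for an equal-centre bimodal vs unimodal pair, and is why higher-order dominance or other comparisons become relevant.

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(0.0, 1.0, 1000)                    # unimodal sample
y = np.concatenate([rng.normal(-1.5, 0.5, 500),
                    rng.normal(1.5, 0.5, 500)])   # bimodal sample

# Empirical CDFs evaluated on a common grid
grid = np.linspace(-4, 4, 201)
F1 = np.searchsorted(np.sort(x), grid, side="right") / x.size
F2 = np.searchsorted(np.sort(y), grid, side="right") / y.size
diff = F1 - F2

# First-order dominance would require diff to keep one sign everywhere
crosses = bool(np.any(diff > 0) and np.any(diff < 0))
print("CDF difference changes sign:", crosses)
```

Here the bimodal sample has more mass in both tails, so its CDF sits above the normal's on the left and below it on the right, and the difference must cross zero.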
The basis of the difference between two datasets is the proportion of the probability density they have in common; in this case I see that the areas under the two graphs do not coincide.
For example, in a two-tailed z-test for two samples with known population variance, the threshold of significance at the 0.05 level is z = 1.96. As you can see in the picture, there is about 30% of the area in common between the two distributions at the 0.05 significance level. So, by this reasoning, if the common area were more than 30%, there would be no difference, regardless of the distribution function.
Bimodality is often the result of two populations being mixed together. The chi-square test is usually the best way to determine whether two distributions are alike.
I agree with Jeff Jarret. You know, some questions are sometimes nonsense: with a bimodal sample we should consider the basic method of sampling and whether the sample has taken its assumptions into account, such as uniformity of the population.