What is the best statistical tool for testing a research hypothesis about two population groups whose sample sizes differ greatly? E.g. Sample A has 300 respondents while Sample B has 30 respondents.
What IS your hypothesis about these two groups in the first place? Kruskal-Wallis is a generalization of the Mann-Whitney U test to more than two groups, so it basically tests stochastic superiority. Is this what you want?
The choice of a test depends on the data-generating process (that is, the conditional distribution of the data), the assumptions of the test, and the hypothesis you want to test. Therefore, to give an educated answer to your question, we need more information.
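To make "stochastic superiority" concrete, here is a minimal sketch that estimates the probability that a randomly chosen value from one group exceeds a randomly chosen value from the other, counting ties as one half. The function name `prob_superiority` and the toy data are made up for illustration; multiplying the estimate by the number of pairs gives the Mann-Whitney U statistic for the first group.

```python
from itertools import product


def prob_superiority(a, b):
    """Estimate P(A > B) + 0.5 * P(A == B) over all pairs (x from a, y from b)."""
    score = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x, y in product(a, b)
    )
    return score / (len(a) * len(b))


# Toy data, purely illustrative (not from the question).
group_a = [3, 5, 6, 7, 9]
group_b = [1, 2, 5, 4]

p_hat = prob_superiority(group_a, group_b)
u_a = p_hat * len(group_a) * len(group_b)  # Mann-Whitney U for group_a
print(p_hat, u_a)
```

A value near 0.5 means neither group tends to produce larger values; this is a statement about the ordering of observations, not about means.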
Oluwafemi Samson Balogun, sorry, but this is nonsense in several ways:
1. There is no such thing as parametric or non-parametric data! That distinction applies to the tests.
2. Without knowing the OP's hypothesis, a suggestion does not make much sense.
3. Even IF he wants to compare means, for example, a MW-U test can't do that (without further assumptions), and for a z-test you need the population standard deviation sigma (which is seldom known) instead of the sample standard deviation. If you only have the latter, a t-test might be used IF its assumptions are met; otherwise alternatives may be considered depending on the violations (bootstrapping, Welch test, Yuen t-test, etc.).
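As an illustration of one of the alternatives named in point 3, here is a minimal sketch of the Welch test statistic with its Welch-Satterthwaite degrees of freedom, using only the Python standard library. The simulated samples (sizes 300 and 30, unequal spread) and the names `welch_t`, `sample_a` and `sample_b` are assumptions for the example, not the OP's data. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns a p-value.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Simulated data for illustration only: same mean, very different spread.
rng = random.Random(0)
sample_a = [rng.gauss(50, 5) for _ in range(300)]
sample_b = [rng.gauss(50, 15) for _ in range(30)]

t_stat, df = welch_t(sample_a, sample_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

When the small group has the larger variance, the degrees of freedom move towards the size of the small group, which is how the test accounts for the imbalance.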
So, as stated above, we need more information to make an educated suggestion.
In the context of this conversation, Rainer, I concur with your assertion that the concept of parametric or non-parametric data is invalid, as it is more appropriate to refer to parametric or non-parametric tests. However, have you thoroughly comprehended the content of my post in order to grasp its intended meaning?
Furthermore, it is acknowledged that additional information is necessary with respect to the hypothesis.
Contrary to the assertion made by you, it is incorrect to state that the z-test may only be employed when the population standard deviation (sigma) is known. In fact, the z-test can be appropriately utilized even when the value of sigma is unknown, particularly when the sample size is equal to or exceeds 30, as previously noted by the author.
Yes, I did thoroughly consider the intended meaning of your post. But maybe you can clarify your intention.
My point is: why make suggestions with sparse information at hand? To suggest a z-test, you need to know that the data-generating process produces normally distributed data, with known sigma and homogeneous variances. The claim that a z-test may be used in case of homogeneity may hold, if normality can be assumed. But what if the data are bounded, e.g. a gamma distribution with a very small scale parameter? There are more appropriate solutions for this. With heavy-tailed or very skewed distributions, you cannot rely on the n=30 rule and the central limit theorem (you do not know when it will converge; see the publications on robust estimation by Rand R. Wilcox, for example).
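The point about the n=30 rule can be checked with a small simulation. The sketch below estimates how often a nominal 95% z-interval (sample standard deviation plugged in for sigma) contains the true mean at n = 30, once for normal data and once for strongly skewed lognormal data. The lognormal shape parameter of 1.5, the number of repetitions and all function names are arbitrary choices for illustration.

```python
import math
import random
import statistics

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def z_interval_covers(sample, true_mean):
    """True if the z-interval built from the sample contains true_mean."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - Z_95 * se <= true_mean <= m + Z_95 * se


def coverage(draw, true_mean, n=30, reps=2000, seed=1):
    """Fraction of simulated samples whose interval covers the true mean."""
    rng = random.Random(seed)
    hits = sum(
        z_interval_covers([draw(rng) for _ in range(n)], true_mean)
        for _ in range(reps)
    )
    return hits / reps


SIGMA = 1.5  # assumed lognormal shape parameter, chosen to give strong skew
coverage_normal = coverage(lambda r: r.gauss(0.0, 1.0), 0.0)
coverage_skewed = coverage(
    lambda r: r.lognormvariate(0.0, SIGMA),
    math.exp(SIGMA ** 2 / 2),  # true mean of a lognormal(0, SIGMA)
)
print(f"normal: {coverage_normal:.3f}, lognormal: {coverage_skewed:.3f}")
```

Under these assumptions the interval for the skewed data covers the true mean noticeably less often than for the normal data, which is the practical meaning of "you do not know when it will converge".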
Hello Joseph Ayi Otu, if the purpose of your hypothesis is to test for a statistical difference between the two sample groups, then I’d suggest you run a two-sample t-test.
A two-sample t-test is a statistical method used to compare the means of two independent groups to determine if there is a significant difference between them. It assesses whether the observed differences between the groups' sample means are likely to reflect true differences in the population means.
You can do this in SPSS or any other quantitative data analysis software.
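For readers without SPSS, the classic (pooled-variance) two-sample t statistic can be sketched in a few lines of standard-library Python. The function name `pooled_t` and the toy numbers are made up for illustration. Note that this version assumes equal variances in both groups, which should be checked before relying on it when the group sizes are as unbalanced as 300 versus 30.

```python
import math
import statistics


def pooled_t(a, b):
    """Return the pooled-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    t = (statistics.fmean(a) - statistics.fmean(b)) / se
    return t, na + nb - 2


# Toy data, purely illustrative.
t_value, dof = pooled_t([1, 2, 3, 4], [2, 4, 6, 8])
print(f"t = {t_value:.3f}, df = {dof}")
```

The statistic is then compared with a t distribution with the returned degrees of freedom to obtain a p-value.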