Yes, it is better to have equal numbers in all the groups. As for the minimum sample size, there is no universal benchmark. It depends on a number of factors such as the prevalence of the disease, the alpha error, the beta error, the allowable error, and more. Sample size is calculated from these factors using a suitable formula.
For the statistics, each group should have a minimum of four values, but there is no requirement that the individual groups have equal numbers. Group sizes can vary, but each group should have at least four values to run an ANOVA or similar analysis.
Failure to reject the null hypothesis is not proof that the null hypothesis is true.
Step 1) Prove that your data are normally distributed. Prove that the underlying distribution is normal, not just approximately normal, normal in theory, or normal because the Central Limit Theorem should ensure it.
Step 2) Ensure that there will be no missing values.
If you can satisfy the above two steps, then four values is the typical answer. However, be warned that satisfying step 1 is more work than your original project.
Except in very unusual circumstances, you will have a non-zero failure rate. Part of the field gets flooded and one rep has to be discarded. The power fluctuates and a rep must be discarded. The list goes on and on. So the practical minimum is 4 plus a few additional reps to cover these problems.
The answer also depends a bit on the type of experiment. I could do a dose-response assay with 4 reps and 15 different doses. This might be a good plan if I know nothing about how toxic the material is to the target organism. However, if the analysis is going to be a mean comparison procedure, then four replicates give you little recourse should things not go your way.
If you know the underlying distribution and the variability in the data, and have a minimum effect size, then there are methods for calculating sample size. Software such as GPower or online sample size calculators will do this (search using Google). If you do not have all this information and there are no established sample sizes in the literature, then you should ask yourself a few questions:
1) If this fails, what will I do? With a sample size of 4, a failure will usually mean starting over or abandoning the task. It is also possible that, with a sample size of four, you will be asked to gather more data before the results are accepted, even if they look good to you.
2) Are there mitigating circumstances? Suppose I am working with a rare organism, using highly toxic and expensive materials, with 42 treatments, and at great risk to my personal health. Reviewers of such a manuscript might decide that the knowledge gained from this difficult and expensive trial is worth some statistical issues.
3) Are there ethical issues? Animal cruelty or use of human subjects? Major environmental impacts?
4) If 2 and 3 do not apply, then consider the impact of the results and how controversial they will be if things go your way. If I test DDT against honeybees and find that the insecticide DDT kills honeybees, no one will be surprised, and this is not much of an issue; four replicates is fine. If I test imidacloprid against honeybees and show that there is no effect, no one will believe me, because imidacloprid is an insecticide that prior research has shown to affect bees. In that case, 4 replicates will be grossly insufficient to get anyone to take the results seriously.
I hope it is apparent that there is no simple answer to your question. I suppose this could be a trick question. If by basic statistics you mean the mean and variance, then the answer is two, because that is the smallest sample satisfying the requirement n - 1 > 0 in calculating the variance (assuming a normal distribution). However, just because one can does not mean one should. To get a better answer you need to define the basic statistical method and give some idea of the application.
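For what it is worth, here is that trick case worked out in a quick Python sketch (the two values are just hypothetical numbers of my own):

```python
import statistics

x = [4.2, 5.8]                  # hypothetical pair of measurements, n = 2
m = statistics.mean(x)          # 5.0
s2 = statistics.variance(x)     # sample variance: divides by n - 1 = 1, giving 1.28
print(m, s2)
```

With only two values the variance is defined, but it is a very noisy estimate, which is exactly the point of "just because one can does not mean one should".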
With no other information, I will suggest 10,000 replicates.
I am not of the opinion that a statistical analysis must end with testing some hypotheses. A well-done exploratory analysis of research experiments is often worth much more than the usual testing procedure. So alpha and beta don't seem to be too much of a concern here, IMHO. Neither are the questions of whether or not one is able to reject or accept some hypotheses.
Of much more interest is: How large are the effects? How large should they be? Given some expectation about that and the variance of the data, you can calculate how many samples you would need to achieve sufficient precision in the effect estimates.
An a-priori sample size calculation with a medium effect size (0.5), power 0.8, and alpha 0.05 gives 51 per group for a one-tailed hypothesis and 64 per group for a two-tailed hypothesis (for a two-sample comparison).
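If it helps, those figures can be reproduced with a short Python sketch using statsmodels (assuming a two-sample t-test; the package choice is mine, not something mentioned above):

```python
from math import ceil
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# sample size per group for Cohen's d = 0.5, power = 0.80, alpha = 0.05
n_two_sided = analysis.solve_power(effect_size=0.5, power=0.80, alpha=0.05,
                                    alternative='two-sided')
n_one_sided = analysis.solve_power(effect_size=0.5, power=0.80, alpha=0.05,
                                    alternative='larger')

print(ceil(n_one_sided), ceil(n_two_sided))  # about 51 and 64 per group
```

This matches the figures quoted above.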
Sample size is a very important issue in statistics, but there is a lot of confusion about it.
In a perfectly randomized sample of size n from a large population, the sample should mimic the summary statistics of the larger population regardless of the sample size. So the computed sample mean or sample standard deviation is not systematically affected by the sample size: the sample mean and standard deviation imperfectly mimic the true but unknown population mean (m) and standard deviation (s).
However, what you improve by increasing n is the precision of the sample mean and sample standard deviation (S) as estimates of those population values.
So in order to achieve the desired level of precision, you must consider the sample standard deviation (S). If S is large, you need to increase the sample size to achieve the same precision in your statistics.
The relationship between S, the standard error (SE), and the sample size (n) is:
SE = S / √n
So it is now easy to compute the sample size required for the level of precision that we need.
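As a rough illustration (a minimal Python sketch of my own, assuming approximate normality and a 95% confidence level; the numbers are hypothetical), rearranging SE = S/√n gives the n needed for a chosen margin of error E:

```python
import math

def n_for_margin(s, e, z=1.96):
    """Smallest n such that z * s / sqrt(n) <= e (margin of error at ~95% confidence)."""
    return math.ceil((z * s / e) ** 2)

# hypothetical example: S = 12, desired margin of error E = 3
print(n_for_margin(12, 3))  # 62
```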
I think that the required sample size depends largely on your area and task. In biology, for example, you must also have at least 3 independent biological repetitions. The number of samples may also depend on the variation in your population. As a general rule for biology, you need at least 3 biologically independent repetitions and at least 10 "objects" in each population.
I completely agree with Jochen Wilhelm. You can have a statistically significant difference that does not mean anything because the differences are negligible. Here is a good article that goes into more detail and questions the scientific merit of the p value:
When comparing two populations by mean difference, mean ratio, odds, t-test, etc., we consider confidence intervals (CI), usually at 95%. The CI can be wide or narrow, depending on the sample variability (sample variance / standard deviation). For this reason, if the sample has large variability, the sample mean can be far from the truth (the population mean), and the p value may show considerable volatility on repeated tests if the sample size is not large enough to reach the required level of precision.
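To see that volatility concretely, here is a hypothetical simulation of my own (assuming normal data with a true difference of half a standard deviation): with n = 4 per group the p values from repeated "experiments" are all over the place, while with n = 64 they are much more stable.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

for n in (4, 64):
    # 1000 repeated experiments: group A ~ N(0, 1), group B ~ N(0.5, 1)
    pvals = [stats.ttest_ind(rng.normal(0.0, 1.0, n),
                             rng.normal(0.5, 1.0, n)).pvalue
             for _ in range(1000)]
    print(n, np.percentile(pvals, [25, 50, 75]))
```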
It is very important to understand that the p value is NOT:
a) The probability that the null hypothesis is true
b) The probability that the study was well conducted
c) The probability that the results of the study are important
d) The probability that the HA is true
e) The probability that the study findings are legitimate
The p value is the probability of obtaining a difference in sample means as extreme as, or more extreme than, the one observed, given that the null hypothesis is true.
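A small simulation can make that definition concrete. Here is a hypothetical permutation-style sketch (my own illustration, not something from the thread) that approximates the p value as the fraction of random group relabelings producing a mean difference at least as extreme as the observed one:

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.array([5.1, 4.8, 5.6, 5.0])   # hypothetical group A
b = np.array([5.9, 6.2, 5.7, 6.4])   # hypothetical group B
observed = abs(a.mean() - b.mean())

pooled = np.concatenate([a, b])
n_perm = 10_000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)               # relabel the observations at random (null hypothesis)
    diff = abs(pooled[:len(a)].mean() - pooled[len(a):].mean())
    if diff >= observed:
        count += 1

print(count / n_perm)  # approximate two-sided p value
```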
The source of the sample variability is an important issue, but I guess it is outside the scope of this discussion.
Thanks to Nitin Gupta, Durairaj Ragu Varman, Timothy A Ebert, Jochen Wilhelm, Abdolghani Abdollahimohammad, Taras Pasternak, Christian Q. Scheckhuber and Ramon Fernandez-Pinilla for answering my question. Special thanks to Timothy, whose answer clarified many things for me. Thanks again.