Tricky question, because common sense says the larger the sample the better, or the closer you get to the population size. Right? Only if the sample was taken randomly are there no disadvantages to how big it is. Random sampling makes the sample as representative of the population as possible, so a large sample that is biased towards certain phenomena is a bad thing. In conclusion, there is no such thing as a maximum sample size; the key is to have a representative sample that is at least as large as the minimum.
Almost the only outcome of the statistics, the p-value, can easily be reduced by increasing the sample size. From my observations, the "maximum" is around 200: if your sample size is greater than 200, then thanks to frequentist statistics everything will be "perfect" and almost all p-values come out smaller than 0.05, irrespective of the test you are running or the nature of the data you are testing...
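For what it is worth, a quick simulation separates the kernel of truth here from the overstatement: for any fixed nonzero effect, however tiny, the p-value does trend toward zero as n grows, but under an exactly true null it does not, no matter how large the sample. A minimal sketch in Python (NumPy/SciPy; the 0.05-SD effect is an arbitrary illustration, not anything from this thread):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
tiny_effect = 0.05  # hypothetical effect of 0.05 standard deviations

for n in (50, 200, 1000, 10000, 100000):
    # Two groups differing by a tiny but nonzero amount
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(tiny_effect, 1.0, n)
    p_effect = stats.ttest_ind(a, b).pvalue

    # Two groups with an exactly true null (no difference at all)
    c = rng.normal(0.0, 1.0, n)
    d = rng.normal(0.0, 1.0, n)
    p_null = stats.ttest_ind(c, d).pvalue

    print(f"n={n:>6}  p (tiny effect)={p_effect:.4f}  p (true null)={p_null:.4f}")
```

On any single run the small-n p-values bounce around; the systematic collapse below 0.05 only appears for the nonzero effect once n gets large, and there is nothing special about n = 200.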
Statistics as mathematics: The answer to the question is no. Increasing the sample size improves the fit between the sample and the underlying distribution.
Statistics as a tool: The answer to the question is always yes, for reasons such as:
1) Ethical considerations, where every replicate represents another mouse/rat/pig/monkey/human.
2) Economic considerations, where it costs too much to have a sample size of 50,000,000.
3) Time constraints, where you do not have 20 years to do 50,000,000 replicates to get one manuscript. You were fired after the third year for being non-productive.
4) Rare and endangered species, critical habitat: The population of a rare butterfly is 783. I do a genetic study and kill all 783 individuals. Not good.
5) Risk: I am surveying non-combatants in a war zone. I have collected 30 surveys, but five assistants have been killed. I really need 20,000 surveys, so I need to find more assistants. Not good.
6) Usefulness. A huge sample size will find statistically significant effects that are too small to be useful given the current state of knowledge. I get a p-value of 0.00000001 and an r-squared value of 0.003.
However, there is a huge cost to having a sample size that is too small.
I have trouble with some parts of the reference that Tom posted. I suggest that there are times when the null hypothesis is true, or so close to true that it doesn't matter.
I run an insecticide trial. I test 10 larvae in a control and 10 larvae in a treatment where each larva is exposed to 10^9 molecules of the insecticide. I find a statistically significant difference. I decrease the dose, and it takes more larvae to find a statistically significant difference between the treatment and control. This goes on, and at 10 molecules I find that it takes 10^17 larvae to get a significant difference. The problem is that there are only 10^14 larvae of this species in the entire world. So even if I used every single individual, I would still not have a large enough sample to find a significant difference.
With some risk, you can play this game with smaller sample sizes. Say I run an experiment to test the behavioral differences between males and females. I test 50 insects of each sex. I find no difference, but I estimate it would take 200 individuals of each to detect a significant difference. The largest published sample size to date is 30, and many experiments use 10 or 15. Given current technology, no one will ever run an experiment with a sample size of 200 or more. Do all experimental designs have to include sex as a treatment even when it is already known that the sample size will be insufficient to reliably identify a sex effect?
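The arithmetic behind both of these examples is an ordinary power calculation: the required sample size per group grows roughly as 1/d² as the standardized effect size d shrinks. A hedged sketch using statsmodels (the effect sizes, alpha, and power are illustrative placeholders, not the insecticide or sex-difference numbers):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Required n per group for a two-sided t-test at alpha = 0.05 and power = 0.80,
# as the standardized effect size (Cohen's d) shrinks
for d in (1.0, 0.5, 0.2, 0.05, 0.01):
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.80,
                             alternative='two-sided')
    print(f"effect size d={d:<5}  n per group ~ {n:,.0f}")
```

The 1/d² scaling is why halving the detectable difference roughly quadruples the number of larvae or insects required.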
Dear all, thank you for your close interest and thoughtful responses. Is there anything to say about the relationship between large samples and the OVERIDENTIFICATION problem in statistical modeling? As a simple introductory example, the adjusted R-squared in a regression model corrects the raw R-squared for the sample size and the number of predictors.
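On that example: the textbook adjustment depends on the number of predictors k relative to the sample size n, so it is heavy for small samples with many predictors and nearly negligible for large samples. A minimal sketch (illustrative numbers only, not from this thread):

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1),
    where n is the sample size and k the number of predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Same raw R^2, different sample sizes (illustrative values)
print(adjusted_r2(r2=0.30, n=30,    k=10))  # small n, many predictors -> heavy penalty
print(adjusted_r2(r2=0.30, n=10000, k=10))  # large n -> adjustment is negligible
```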
I did not think correlation improved with increasing sample size. This would imply that everything is correlated with everything else if sample size is infinite.
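A quick simulation supports this: when two variables are generated independently (true correlation zero), the sample correlation does not creep upward with n; it shrinks toward zero. A minimal sketch (simulated data, not from any study mentioned here):

```python
import numpy as np

rng = np.random.default_rng(1)

# x and y are generated independently, so the true correlation is 0
for n in (20, 200, 2000, 200000):
    x = rng.normal(size=n)
    y = rng.normal(size=n)
    r = np.corrcoef(x, y)[0, 1]
    print(f"n={n:>6}  sample r = {r:+.4f}")
```

What does collapse as n grows is the p-value attached to whatever tiny nonzero correlation happens to exist, not the size of the correlation itself.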
Deciding that significant outcomes are so small as to be practically meaningless requires a quantification of "meaningful." I suspect that you have exceeded the sample size necessary to answer the original question by the time you can quantify meaningful. In some cases there are generic rules, such as "a difference of less than 30% is meaningless because it cannot be detected above background variability." I have never seen a good proof that such is the case. However, in theory, if you can demonstrate that an effect below "this level" will have no consequence (economic, social, ecological, mechanical, etc.), then sample sizes above "this number" will be excessive. With enough information you could develop a simulation that would enable calculation of a maximum useful sample size.
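One possible shape for such a simulation, under entirely hypothetical assumptions: pick a threshold below which a difference is considered not meaningful, simulate data whose true difference sits just under that threshold, and find the sample size at which the test starts flagging it as significant most of the time. Sample sizes beyond that crossing point mostly buy statistically significant trivia. (All numbers below are placeholders, not recommendations.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

MEANINGLESS = 0.10  # hypothetical: differences below 0.1 SD are deemed not meaningful
ALPHA = 0.05
REPS = 500          # simulated experiments per sample size

def rejection_rate(n, true_diff):
    """Fraction of simulated experiments that call a difference of size
    true_diff 'significant' with n observations per group."""
    hits = 0
    for _ in range(REPS):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_diff, 1.0, n)
        if stats.ttest_ind(a, b).pvalue < ALPHA:
            hits += 1
    return hits / REPS

# Search for the sample size at which a just-below-threshold effect
# is flagged as significant more than half the time.
for n in (100, 400, 800, 1600, 3200):
    rate = rejection_rate(n, MEANINGLESS * 0.9)
    print(f"n per group = {n:>5}  rejection rate = {rate:.2f}")
```

With these placeholder numbers the crossing point lands somewhere around a thousand observations per group; that is one operational reading of a "maximum useful sample size."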
@Mike, no. It would be a census, so the maximum possible sample size is the population size minus 1. Seriously, you left out one individual just so it was technically a sample???? Statistics gone to the dark side.
A sample is representative of the population. A study of the parameters of the CENSUS of a country uses the largest possible sample size for that population: the MAXIMUM SAMPLE.
Mike, I was too, kind of. Some RG posts give population sizes of a few hundred, which would be possible to census. However, one could deliberately sample one fewer than the population to get a sample. I don't know of anyone who has actually done this, but it would then be fun to analyze the data using a leave-one-out strategy. I thought it was funny, but most will probably just think it strange.
Dr Ebert's 6th point is well worth keeping in mind. When working with our fisheries in California, the real question wasn't "Are these fish populations the same from year to year?", because we know that they are not the same. There is loss due to fishing and natural mortality, and recruitment from natural reproduction. The real question was "Are the differences large enough to require changes in management?" The shame was that I didn't realize that the actual question was the second and not the first until well into my career.
A large sample size does not guarantee representativeness.
A large sample could be affected by self-selection bias.
A notorious example is the 1936 Literary Digest poll, which failed to predict the winner of the US presidential election even though more than 2 million people answered the poll. The inaccurate prediction was due to sample composition bias (respondents were wealthier, unsatisfied with the sitting president, and more motivated to respond to the poll), resulting in an unrepresentative sample.
Only samples selected by strict random probability methods can provide a sufficient guarantee of representativeness, and in fact only random samples can be tested to assess whether they are representative of the target population.
That said, random selection does not automatically mean that the sample is representative straight away. We also need to consider the response process and the resulting non-response bias in the sample.
In addition, representativeness cannot be a property of the sample as a whole.
We can assert representativeness only for variables whose distributions are already known in the population. These variables cover only a limited set of properties (generally age, gender, and education), and are often hardly comparable.