I am researching football merchandise sales in the Zimbabwean market, and it is impractical to determine the actual number of consumers in that market. How does one calculate a sample size under those circumstances?
It depends on what you are trying to measure. Perhaps you could treat stores (outlets?) as sampling units, and consider cluster sampling.
You could look for information on cluster sampling (probably with a census of each randomly selected cluster) on the internet, and in books such as Cochran, W.G. (1977), Sampling Techniques, 3rd ed., John Wiley & Sons.
Of course, I may not have understood your question.
Anyway, as in all cases of estimating sample size, you first want to control bias, reducing it to a hopefully negligible amount, while remembering that some bias is still there and perhaps not as small as you'd hope. Then you see what sample size will give you the standard error or confidence interval you want, in the case of quantitative data. (I'd guess variability in qualitative data to be similarly important.) For anything but simple random sampling (and a model-based method I have, which is not likely applicable here), that could get quite involved.
I suggest that, if practical, you select some clusters at random to census for a pilot study, and use the standard error you observe there to estimate the number of clusters you will need in the full study. (Say an outlet would be a cluster.) A pilot study might also help you work out other issues of logistics and data collection.
Because clusters are unlikely to be of equal size, this can be messier than simple random sampling, even when you census each cluster/outlet. It would be best to consult a good statistics book on this.
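To make the pilot-study idea concrete, here is a minimal sketch in Python. It assumes the quantity of interest is the mean total per outlet, that outlets are drawn by simple random sampling, and that each selected outlet is fully censused. The function name, the pilot figures and the count of 80 outlets are all hypothetical, and the sketch does not deal with the unequal-cluster-size complications mentioned above.

```python
import math
import statistics

def clusters_needed(pilot_totals, target_se, n_clusters_population=None):
    """Rough number of outlets to sample so that the estimated mean
    per-outlet total has about the target standard error.

    Assumes one-stage cluster sampling: outlets chosen by simple random
    sampling, each chosen outlet fully enumerated.
    """
    s2 = statistics.variance(pilot_totals)   # between-outlet sample variance
    n0 = s2 / target_se ** 2                 # ignores finite population
    if n_clusters_population is not None:
        # Finite population correction, usable only if the total
        # number of outlets is known.
        n0 = n0 / (1 + n0 / n_clusters_population)
    return math.ceil(n0)

# Hypothetical pilot: items sold per month at six randomly chosen outlets
pilot = [120, 85, 240, 60, 150, 95]
print(clusters_needed(pilot, target_se=10))
print(clusters_needed(pilot, target_se=10, n_clusters_population=80))
```

A small pilot gives only a noisy estimate of the between-outlet variance, so the result is best treated as a planning figure rather than a firm requirement.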
It depends on what you want to measure and why, as well as what you want to do with the result.
To determine the necessary sample size for hypothesis testing, four factors are usually taken into account. These four factors are: criterion for statistical significance, level of statistical power, statistical analysis strategy, and the size of an effect judged to be meaningful.
The convention in marketing research (my discipline) is to use the 0.05 level of significance. However, if the effect is large, then a stricter (lower) level of significance is attainable. Increasing the sample size does not necessarily decrease the potential for error, because very large samples can make trivial or spurious associations statistically significant.
A second factor that affects the number of respondents needed in hypothesis testing research is statistical power. Statistical power is the probability of detecting an effect that really exists, that is, one minus the probability of a Type II error. A Type II error occurs when the null hypothesis is not rejected although it is in fact false. That is, a relationship may actually exist but go undetected.
Usually statistical power should be no less than .70 (.80 is the more commonly cited convention), although some indicate that it is possible to go as low as .50. A sample size of 200 would be sufficient to detect moderate effects, with a statistical power of about .998 at the .05 level of significance, although the exact figure depends on the test being used.
The third consideration when assessing the sample size is effect size. If a prior study indicates that there is a large difference between population means, then the sample size can be decreased, as only a few subjects will be needed to detect a difference. Conversely, if the difference between populations is small, then a large number of subjects will be needed to establish that the difference is real. If you intend to generalise from one population to another, you will need larger samples. If you only intend to generalise to your target population, then you need a relatively smaller sample (say 30, the usual rule-of-thumb minimum for treating the sampling distribution of the mean as approximately normal).
A fourth factor affecting the size of the sample is the data analysis procedure. For more complex factorial designs you need a total sample size in excess of 200 for sufficient reliability to be attained. Sample sizes for multivariate analysis should be at least ten times the number of variables in the study. However, analysing each group of variables in a step-wise manner means that you can constrain variables, so you do not need as many responses in each component of the analysis.
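As an illustration of how the four factors combine, here is a rough sketch of a sample size calculation for one particular analysis: a two-sided comparison of two group means. It uses the normal approximation, so it slightly understates what a t-test needs. The function names are hypothetical, the effect size is a standardised difference (Cohen's d), and the default power of .80 is a common convention rather than anything specific to this thread.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate respondents per group for a two-sided, two-sample
    comparison of means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # criterion for significance
    z_beta = z.inv_cdf(power)            # desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

def approx_power(effect_size, n, alpha=0.05):
    """Approximate power of the same test with n respondents per group.
    Ignores the negligible chance of rejecting in the wrong direction."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect_size * math.sqrt(n / 2) - z_alpha)

# Small, moderate and large standardised effects
for d in (0.2, 0.5, 0.8):
    n = n_per_group(d)
    print(d, n, round(approx_power(d, n), 3))
```

Changing the analysis (a correlation, a proportion, a factorial design) changes the formula, which is why a dedicated power calculator is the safer choice for the real study.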
My rule of thumb is to ask: what is the weight of the decision? If my client is going to spend or lose millions based on my statistical prediction, I will be as sure as I can be that this is a 'real' figure (read Borsboom on measurement). It will cost money and time to be sure. If this is exploration of theory, then somewhere between 30 and 200 is 'enough' to tell you about a particular population.
However, I argue that some considered extrapolation will enable you to estimate the population of football fans in Zimbabwe and therefore infer a number for those who may be considered potential or actual consumers of merchandise. If this is a market size estimate, then this will provide you with the parameters you need to decide how many people you actually want to research.
If you look at formulas for calculating the standard error associated with a given sample size, you will find that the size of the population actually has relatively little influence, especially once you get into populations that number in the thousands.
I suggest you follow the advice to use an online tool for calculating the power of your sample size and try using a variety of population sizes. I suspect you will get rather consistent estimates of the power despite those differences in population size.
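If you would rather see this than take it on trust, the following is a small sketch, assuming the goal is to estimate a proportion (for example, the share of fans who bought merchandise) to within a chosen margin of error. It uses the standard formula with a finite population correction. The function name and the population sizes tried are purely illustrative.

```python
import math
from statistics import NormalDist

def sample_size_proportion(margin, population=None, p=0.5, confidence=0.95):
    """Sample size needed to estimate a proportion within +/- margin.

    p = 0.5 is the most conservative guess. population=None means the
    population is treated as effectively infinite.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

# Try a range of (hypothetical) population sizes at a 5% margin of error
for N in (1_000, 10_000, 100_000, 1_000_000, None):
    print(N, sample_size_proportion(0.05, N))
```

Once the population is in the tens of thousands the required sample barely moves, so not knowing the exact number of consumers matters much less than it first appears.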