Most common sample size formulas assume that we measure only one variable (e.g. Tohmas thempson), but in a typical questionnaire we have many variables. How should such formulas be applied when the survey has to measure many variables at once?
There are two methods for multivariable allocation and multivariable sampling.
"Efficient Balanced Sampling: The Cube Method" by Deville and Tillé
http://www.jstor.org/stable/20441151 There is R software for this. It does not allow direct variance estimation and uses martingales. At one point I was familiar with their method. Creecy and Klein used some of the software in a controlled rounding attempt for one of their applications, generating synthetic data intended to round to Census published tables: they could generate the synthetic data but could not get it to round. My methods (below) give some insight into the difficulty of the problem, which I was able to overcome even though it is known that no general controlled rounding algorithm in 3+ dimensions can exist.
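For reference, the cube method is available in R, e.g. the samplecube function in the 'sampling' package (the 'BalancedSampling' package is another implementation). A minimal sketch, using a made-up frame and equal inclusion probabilities; all sizes and variable names below are purely illustrative:

```r
# Minimal sketch of balanced (cube-method) sampling with the 'sampling' package.
# The frame, variable names, and sample size are invented for illustration.
library(sampling)

set.seed(1)
N <- 1000                                  # hypothetical population size
frame <- data.frame(
  x1 = rgamma(N, shape = 2, scale = 10),   # hypothetical auxiliary variables
  x2 = runif(N, 0, 50)
)

n   <- 100                                 # desired sample size
pik <- rep(n / N, N)                       # equal inclusion probabilities
# Balance on the inclusion probabilities themselves (fixed n) and on x1, x2:
X <- cbind(pik, frame$x1, frame$x2)

s <- samplecube(X, pik, order = 1, comment = FALSE)  # 0/1 sample indicator
sample_units <- frame[s == 1, ]
sum(s)  # close to n; HT totals of x1 and x2 approximately reproduce the frame totals
```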
Tillé also has an excellent book, Sampling Algorithms, that a number of us have been through. It gives an exceptionally strong presentation of several algorithms.
http://www.census.gov/srd/papers/pdf/rrs2009-08.pdf This allows direct variance estimation. It also gives methods for controlled rounding (among the four components of the method that I cover), extending the Cox, Causey, and Ernst (JASA 1985) approach from two dimensions to three or more. Controlled rounding software is exceptionally difficult to write.
It is useful, and often necessary, to prioritize. I would decide the sample size based on the most important aspect of the survey, keeping in mind what I can afford. If there is still margin left after taking care of the key objective, then I would consider the second priority, see what sample size it needs, and increase the sample if necessary.
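To make that concrete, here is a rough numerical sketch of the prioritization idea; the margins of error, prevalences, population size, and budget below are all hypothetical:

```r
# Rough sketch of prioritizing sample size by survey objective (all numbers hypothetical).
# Required n for estimating a proportion with margin of error e at ~95% confidence:
n_for_prop <- function(p, e, N = Inf, z = 1.96) {
  n0 <- z^2 * p * (1 - p) / e^2
  if (is.finite(N)) n0 <- n0 / (1 + n0 / N)   # finite population correction
  ceiling(n0)
}

budget_n <- 600                                        # what we can afford (assumed)

n_key    <- n_for_prop(p = 0.30, e = 0.04, N = 5000)   # top-priority item
n_second <- n_for_prop(p = 0.10, e = 0.03, N = 5000)   # second-priority item
n_key; n_second

# Cover the key item first; if the budget allows, move toward the larger of
# the two requirements; otherwise accept a wider margin on the second item.
n_final <- min(budget_n, max(n_key, n_second))
n_final
```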
You would be well advised to do sample sizes for a range of scenarios. What variables have the highest and lowest prevalences? What population subgroups have the lowest prevalences? What are the questions that could produce comparisons in which there would be small numbers of participants in a cell of a table?
These will give you some idea of the analytic potential of a particular sample size. In the end, though, your sample will have one size, so it won't be ideal for everything. You can use these scenarios to figure out what questions aren't worth asking or what questions are beyond analysis because of low frequencies expected.
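One quick way to run such scenarios is to look at expected cell counts for a candidate sample size. A small sketch; the sample size, subgroup shares, and prevalences below are made up:

```r
# Expected respondents per subgroup-by-item cell for a candidate sample size
# (all shares below are hypothetical).
n <- 800
subgroup_share <- c(all = 1.00, rural = 0.25, rare_group = 0.05)
prevalence     <- c(common_item = 0.30, rare_item = 0.02)

expected_cells <- outer(subgroup_share * n, prevalence)  # expected cell counts
round(expected_cells)  # cells with only a handful of expected cases are beyond analysis
```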
If you wish to control the sample size for a single variable, then stratification can help. If you want to handle three or more variables at once, then you need methods that can produce efficient stratification for several variables simultaneously and that have theory allowing the sample size to be minimized for several variables. The methods I listed can do that.
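As a small illustration of why a single-variable rule is not enough: classical Neyman allocation, n_h = n * N_h S_h / sum(N_h S_h), computed one variable at a time can give sharply conflicting allocations. The stratum sizes and standard deviations below are invented:

```r
# Neyman allocation computed separately for two variables (hypothetical inputs):
# the two resulting allocations disagree, which is what multivariate methods address.
n   <- 500
N_h <- c(2000, 5000, 3000)          # stratum population sizes (assumed)
S1  <- c(10, 4, 2)                  # stratum SDs of variable 1 (assumed)
S2  <- c(1, 6, 12)                  # stratum SDs of variable 2 (assumed)

neyman <- function(n, N_h, S_h) round(n * N_h * S_h / sum(N_h * S_h))
rbind(var1 = neyman(n, N_h, S1),
      var2 = neyman(n, N_h, S2))    # sharply different allocations per stratum
```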
Anil's approach is not bad regardless of whether you use a probability-of-selection (design-based), prediction (model-based), or model-assisted design-based methodology, and it is good guidance for qualitative studies as well. Bill gives more rigorous options: the first paper uses auxiliary data, and the second is a multi-attribute Neyman allocation, which requires more expertise.
I used model-based sampling and prediction for small sample sizes from many small populations, for official statistics, and each attribute had its own model weights, so there was not such a problem as having a sample selection probability that is good for the estimator for one data item and terrible for another.
Regardless, you need some preliminary idea of the standard deviations. Cochran, W.G. (1977), Sampling Techniques, 3rd ed., John Wiley & Sons, is one of a number of good sampling books, and in your case the recommended way to obtain 'estimates' of these sigmas might be a pilot study.
A pilot study has the advantage of helping you to work out details before you fully commit yourself, and might be helpful here, especially if you have a fairly extensive project.
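For a continuous item, the Cochran-style calculation that such a pilot would feed looks roughly like this; the sigma estimate, margin of error, and population size below are hypothetical:

```r
# Sample size for estimating a mean within margin e, using a pilot estimate of sigma
# (sigma_hat, e, and N are hypothetical).
sigma_hat <- 12.5        # SD estimated from the pilot
e <- 1.5                 # desired margin of error at ~95% confidence
z <- 1.96
N <- 4000                # population size

n0 <- (z * sigma_hat / e)^2
n  <- ceiling(n0 / (1 + n0 / N))   # with finite population correction
c(without_fpc = ceiling(n0), with_fpc = n)
```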
One more reference you might want to see if you have auxiliary data to guide a probability-of-selection-based study:
Holmberg, A. (2007), "Using Unequal Probability Sampling in Business Surveys to Limit Anticipated Variances of Regression Estimators," Proceedings of the Third International Conference on Establishment Surveys (Montreal, Quebec, Canada), American Statistical Association.
Of course you need to consider the kind of data you are collecting. One trap I have seen is that many people run across sample size 'calculators' on the internet which are not clearly marked as applying only to yes/no data; they usually assume the worst-case proportion in lieu of estimating sigma, and they generally ignore the finite population correction (fpc) factor, so they may even recommend a sample size larger than your population! If you don't use something relevant to your data for one question/attribute, you cannot get anywhere with more than one question.
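To see the trap in numbers, here is what a typical calculator formula does versus the same formula with the fpc, for a deliberately small (hypothetical) population:

```r
# The "calculator" trap: the usual online formula applies only to yes/no data,
# assumes p = 0.5, and skips the finite population correction (N is hypothetical).
z <- 1.96; e <- 0.05; p <- 0.5
N <- 300                                   # a small population

n_naive <- ceiling(z^2 * p * (1 - p) / e^2)   # ~385, larger than N!
n0 <- z^2 * p * (1 - p) / e^2
n_fpc <- ceiling(n0 / (1 + (n0 - 1) / N))     # about 169 with the fpc
c(naive = n_naive, with_fpc = n_fpc)
```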
The kind of data you are collecting and your goals are important. You might want to ask another question and include more specifics. (You mentioned "Tohmas thempson." Maybe that was a hint at your application, but I don't understand. Did you mean "Horvitz-Thompson"?)
If you are looking at Likert-scale data, as many on ResearchGate seem to do, then there may be texts specifically on that, but it is not my area and I don't know of any.
I would like to add to the list of suitable methods suggested by William Winkler the one implemented in the "SamplingStrata" R package.
It is a method that makes it possible to optimise the stratification of a given population frame, together with the allocation of units in the resulting strata, given precision constraints set on a number of target variables. It is also possible to set different precision constraints in different domains of interest.
The method and the software are described in the paper:
Giulio Barcaroli (2014). SamplingStrata: An R Package for the Optimization of Stratified Sampling. Journal of Statistical Software, 61(4), 1-24. URL http://www.jstatsoft.org/v61/i04/.
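A heavily hedged sketch of the SamplingStrata workflow as described in that paper; the toy frame, its column layout (X*, Y*, domainvalue), and the argument defaults follow the 2014 JSS article as I recall it and may differ in current package versions, so check the documentation before relying on it:

```r
# Sketch only: function names and data layout follow Barcaroli (2014, JSS);
# the frame below is invented for illustration.
library(SamplingStrata)

set.seed(7)
frame <- data.frame(
  X1 = sample(1:3, 200, replace = TRUE),   # categorical stratification variables
  X2 = sample(1:2, 200, replace = TRUE),
  Y1 = rgamma(200, shape = 2, scale = 10), # target variables
  Y2 = rnorm(200, 50, 10),
  domainvalue = 1
)

strata <- buildStrataDF(frame)             # atomic strata built from the frame

# Precision constraints: target CVs for each target variable, per domain.
errors <- data.frame(DOM = "DOM1", CV1 = 0.05, CV2 = 0.10, domainvalue = 1)

solution <- optimizeStrata(errors = errors, strata = strata)  # genetic algorithm search
str(solution)  # optimised stratification and allocation (see the package docs)
```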
"I used model-based sampling and prediction for small sample sizes from many small populations, for official statistics, and each attribute had its own model weights, so there was not such a problem as having a sample selection probability that is good for the estimator for one data item and terrible for another."
With regard to ratio estimation, this relates to something I recently noted in a poster presentation:
If you use efficient probability-of-selection (design-based) sampling for ratio estimation, then the 'optimal' pi-values (probabilities of selection) would only be based on one y (one question on the survey). Such a set of pi-values may be quite bad for other survey questions. (Note how this relates to the comical, but instructive story: "Basu's Elephant Fable" from the 1970s.) However, if one uses the model-based (prediction) approach, then the predictions are based on the size measure (using x-values or, for multiple regression, a combination of regressors such as the predicted-y values) which is appropriate for each individual question, y. The Holmberg (2007) paper I mentioned above may give you a good compromise set of pi-values, and you may use an adjusted design-based ratio estimator (noted, for example, in Cochran (1977), Sampling Techniques, I think), but the model-based option seems preferable to me. It is simple and easy to interpret, and often very efficient, with overall low uncertainty compared to the design-based approach. (See the appendix to https://www.researchgate.net/publication/317914104_Handout_Bibliography_for_Comparison_of_Model-Based_to_Design-Based_Ratio_Estimators_Poster.)
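A small simulated sketch of that last point: with the model-based (prediction) ratio estimator, each survey item gets its own ratio "weight" from the same known size measure x, so no single set of selection probabilities has to serve every item. Everything below is simulated purely for illustration:

```r
# Model-based (prediction) ratio estimation: estimated total = observed sample
# total + ratio prediction for the non-sampled units, item by item.
set.seed(42)
N <- 500
x  <- rgamma(N, shape = 3, scale = 20)          # known size measure
y1 <- 5 * x   + rnorm(N, 0, 2 * sqrt(x))        # two survey items with different
y2 <- 0.2 * x + rnorm(N, 0, 0.5 * sqrt(x))      # relationships to x

s <- sample(N, 50)                              # the sampled units

ratio_total <- function(y_s, x_s, x_all, s) {
  b <- sum(y_s) / sum(x_s)                      # this item's own ratio "model weight"
  sum(y_s) + b * sum(x_all[-s])                 # observed part + predicted part
}

c(T1_hat = ratio_total(y1[s], x[s], x, s), T1 = sum(y1),
  T2_hat = ratio_total(y2[s], x[s], x, s), T2 = sum(y2))
```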