Which guidelines do we follow in selecting sample size?

Selecting Sample Sizes

When choosing a sample size, we must consider the following issues:

What population parameters we want to estimate
Cost of sampling (importance of information)
How much is already known
Spread (variability) of the population
Practicality: how hard is it to collect data
How precise we want the final estimates to be

Cost of taking samples

The cost of sampling issue helps us determine how precise our estimates should be. As we will see below, when choosing sample sizes we need to select risk values. If the decisions we will make from the sampling activity are very valuable, then we will want low risk values and hence larger sample sizes.

Prior information

If our process has been studied before, we can use that prior information to reduce sample sizes. This can be done by using prior mean and variance estimates and by stratifying the population to reduce variation within groups.

Inherent variability

We take samples to form estimates of some characteristic of the population of interest. The variance of that estimate is proportional to the inherent variability of the population divided by the sample size:

$$\mathrm{Var}(\hat{p}) \approx \frac{\sigma^2}{n}$$

with $\hat{p}$ denoting the estimate of the parameter of interest, $\sigma^2$ the population variance, and $n$ the sample size. This means that if the variability of the population is large, then we must take many samples. Conversely, a small population variance means we don't have to take as many samples.
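As a rough numerical illustration of this relationship, the sketch below computes the standard error of a sample mean, sqrt(σ²/n), for a few sample sizes. It covers only the special case where the estimate is a sample mean of independent observations; the value σ = 20 and the sample sizes are assumptions chosen for illustration.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of a sample mean of n independent draws: sqrt(sigma^2 / n)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)


# Assumed value for illustration only: population standard deviation of 20.
sigma = 20.0
for n in (4, 16, 64):
    # Each fourfold increase in n halves the standard error.
    print(n, standard_error(sigma, n))
```

Because the variance shrinks like 1/n, the standard error shrinks only like 1/sqrt(n): quadrupling the sample size is needed to halve it.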

Practicality

Of course the sample size you select must make sense. This is where the trade-offs usually occur. We want to take enough observations to obtain reasonably precise estimates of the parameters of interest but we also want to do this within a practical resource budget. The important thing is to quantify the risks associated with the chosen sample size.

Sample size determination

In summary, the steps involved in estimating a sample size are:

There must be a statement about what is expected of the sample. We must determine what it is we are trying to estimate, how precise we want the estimate to be, and what we are going to do with the estimate once we have it. This should be easily derived from the goals.

We must find some equation that connects the desired precision of the estimate with the sample size. This is a probability statement. A couple are given below; see your statistician if these are not appropriate for your situation.

This equation may contain unknown properties of the population such as the mean or variance. This is where prior information can help.

If you are stratifying the population in order to reduce variation, sample size determination must be performed for each stratum.

The final sample size should be scrutinized for practicality. If it is unacceptable, the only way to reduce it is to accept less precision in the sample estimate.

Sampling proportions

When we are sampling proportions we start with a probability statement about the desired precision. This is given by:

$$\Pr\left(|\hat{p} - P| \ge \delta\right) = \alpha$$

where

$\hat{p}$ is the estimated proportion
$P$ is the unknown population parameter
$\delta$ is the specified precision of the estimate
$\alpha$ is the probability value (usually low)

This equation simply states that we want the probability that our estimate misses the true proportion by δ or more to be α. Of course we like to set α low, usually 0.1 or less. Using some assumptions about the proportion being approximately normally distributed, we can obtain an estimate of the required sample size as:

$$n = z_\alpha^2 \left(\frac{pq}{\delta^2}\right)$$

where $z_\alpha$ is the value on the standard Normal curve corresponding to α, p is a prior guess at the proportion, and q = 1 − p.
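The formula above can be sketched as a small Python function. This is a minimal sketch, not a definitive implementation: the function names are hypothetical, the inputs (p = 0.65, δ = 0.10) are illustrative values, and the helper that looks up z assumes a two-sided statement, so that α = 0.05 corresponds to z of about 1.96 (often rounded to 2).

```python
import math
from statistics import NormalDist


def two_sided_z(alpha: float) -> float:
    """Magnitude of the standard normal quantile that leaves alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size_proportion(p: float, delta: float, z: float) -> int:
    """n = z^2 * p * q / delta^2, rounded up to a whole number of units."""
    if not 0 < p < 1 or delta <= 0:
        raise ValueError("need 0 < p < 1 and delta > 0")
    q = 1 - p
    raw = z**2 * p * q / delta**2
    # Round first so floating-point noise cannot push an exact answer up by one.
    return math.ceil(round(raw, 9))


# Illustrative inputs: prior yield of 65%, precision of 0.10.
print(sample_size_proportion(p=0.65, delta=0.10, z=2))            # rounded z
print(sample_size_proportion(0.65, 0.10, two_sided_z(alpha=0.05)))  # exact z
```

The sign of z does not matter because it is squared. Using the exact two-sided quantile instead of the rounded value of 2 gives a slightly smaller sample size.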

Example

Let's say we have a new process we want to try. We plan to run the new process and sample the output for yield (good/bad). Our current process has been yielding 65% (p = .65, q = .35). We decide that we want the estimate of the new process yield to be accurate to within δ = .10 at 95% confidence (α = .05, $z_\alpha$ = -2). Using the formula above we get a sample size estimate of n = 91. Thus, if we draw 91 random parts from the output of the new process and estimate the yield, then we are 95% sure the yield estimate is within .10 of the true process yield.

Estimating location: relative error

If we are sampling continuous normally distributed variables, quite often we are concerned about the relative error of our estimates rather than the absolute error. The probability statement connecting the desired precision to the sample size is given by:

$$\Pr\left(\left|\frac{\bar{y} - \mu}{\mu}\right| \ge \delta\right) = \alpha$$

where $\mu$ is the (unknown) population mean and $\bar{y}$ is the sample mean.

Again, using the normality assumptions we obtain the estimated sample size to be:

$$n \approx \frac{z_\alpha^2 \sigma^2}{\delta^2 \mu^2}$$

with $\sigma^2$ denoting the population variance.
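The relative-error formula above can be sketched the same way. The function name is hypothetical, and the inputs (a mean near 500, a standard deviation of 20, a 2% relative error, z = 2) are assumed values for illustration; in practice μ and σ are unknown and must come from prior information.

```python
import math


def sample_size_relative_error(sigma: float, mu: float, delta: float, z: float) -> int:
    """n ~ z^2 * sigma^2 / (delta^2 * mu^2), rounded up to a whole number."""
    if mu == 0 or delta <= 0:
        raise ValueError("need a nonzero mean and delta > 0")
    raw = (z * sigma / (delta * mu)) ** 2
    # Round first so floating-point noise cannot push an exact answer up by one.
    return math.ceil(round(raw, 9))


# Assumed prior guesses: mean about 500, standard deviation about 20.
# delta = 0.02 asks for the estimate to be within 2% of the true mean.
print(sample_size_relative_error(sigma=20, mu=500, delta=0.02, z=2))
```

Note that only the ratio σ/μ (the coefficient of variation) enters the formula, so a prior guess at that ratio is enough to use it.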

Estimating location: absolute error

If, instead of relative error, we wish to use absolute error, the equation for sample size looks a lot like the one for the case of proportions:

$$n \approx z_\alpha^2 \left(\frac{\sigma^2}{\delta^2}\right)$$

where σ is the population standard deviation (but in practice is usually replaced by an engineering guesstimate).

Example

Suppose we want to sample a stable process that deposits a 500 Angstrom film on a semiconductor wafer in order to determine the process mean so that we can set up a control chart on the process. We want to estimate the mean within 10 Angstroms (δ = 10) of the true mean with 95% confidence (α = .05, $z_\alpha$ = -2). Our initial guess regarding the variation in the process is that one standard deviation is about 20 Angstroms. This gives a sample size estimate of n = 16. Thus, if we take at least 16 samples from this process and estimate the mean film thickness, we can be 95% sure that the estimate is within 10 Angstroms of the true mean value.
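The absolute-error calculation above can be checked with a short sketch. The function names are hypothetical. The second function simply inverts the same formula to show what precision a fixed sample size buys, which is the trade-off to look at when the computed n is not practical; the alternative sample size of 8 is an assumed value for illustration.

```python
import math


def sample_size_absolute_error(sigma: float, delta: float, z: float) -> int:
    """n ~ z^2 * sigma^2 / delta^2, rounded up to a whole number."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    raw = (z * sigma / delta) ** 2
    # Round first so floating-point noise cannot push an exact answer up by one.
    return math.ceil(round(raw, 9))


def achievable_precision(sigma: float, n: int, z: float) -> float:
    """Invert the formula: the delta that a given n supports, delta = z * sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return abs(z) * sigma / math.sqrt(n)


# Film-thickness example: sigma guess of 20 Angstroms, delta of 10 Angstroms, z of 2.
print(sample_size_absolute_error(sigma=20, delta=10, z=2))

# Assumed alternative: if only 8 wafers can be measured, the precision is coarser.
print(achievable_precision(sigma=20, n=8, z=2))
```

Halving the sample size from 16 to 8 does not double δ; it widens it by a factor of sqrt(2), from 10 to roughly 14 Angstroms.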

Joseph L Alvarez

There are no concrete rules or guidelines. You must decide what is good enough based on the variability of the sampled population and the needs of the question asked.

Abiodun Olusola Omotayo

I will do on Monday sir

Harry Barton Essel

Clearly state the target group (universe) definition for the research study; in the case of online surveys this includes explicit identification of whether or not Internet non-users are part of the target group definition
Clearly state the method(s) used to obtain a sample of this target group, including whether the method was a probability survey, a non-probability survey, or an attempted census

http://www.tpsgc-pwgsc.gc.ca/rop-por/rapports-reports/comiteenligne-panelonline/page-03-eng.html

Munaf A. Al-Ramahee

I have no idea. Thank you for sharing your question with me.

AL Timimi Zahra

Sample Size Calculator

Sample Size Calculator is presented as a public service of Creative Research Systems survey software.

You can use it to determine how many people you need to interview in order to get results that reflect the target population as precisely as needed.

You can also find the level of precision you have in an existing sample.

Before using the sample size calculator, there are two terms that you need to know: confidence interval and confidence level.

Please refer to this website for more information.

https://www.surveysystem.com/sscalc.htm

Isam Issa Omran

The appropriate sample size depends on the purpose of the study, the nature of the research population, the variables of the study, and the pattern of relationships that the researcher wishes to uncover. The sample size can be inferred from previous studies, if any, especially those that have the same research design. A larger sample can represent the characteristics of the population more fully and thus support a more accurate generalization of the research results.

Aparna Sathya Murthy

For image processing I use consistency metric !!

Robust for n number of samples !!!

Document the why nots !!!

Esraa T. Al-Azawee

Good question.

Following

Hamid A. Al-jameel

Good question-

Senad Bećirović

It depends on your research problem.

Priyanka Mehta

following

Hitesh Gujarati

There are no guidelines. Follow Guide's guideline

Dr K N Sheth

I express my gratitude to Research Gate RG Colleagues for their excellent responses. Let me specially thank Jack Son, Aparna Sathya Murti, Al Timmi Zahara, Harry Barton Essel, Isam Issa Omran and Hitesh Gujarati for their answers

Timothy A Ebert

Depends on the sample size that other researchers are willing to see in the published literature.

Depends on the level of risk you are comfortable with taking on. Results from small sample sizes often cannot be repeated even by the scientist that did the original work. The larger the sample size the more certain it is that other researchers who use a similar sample size will arrive at similar outcomes.
