I was supposed to collect data from 384 respondents, but I only received 230 complete responses. That puts my response rate at about 60%. Is that acceptable?
Yours is a fundamental issue in survey research, and it merits continuing attention! A 60% response rate is strong and meets a generally acceptable standard.
For strategies to improve response rates, see a classic reference: D. Dillman, Mail and Internet Surveys.
The response rate is good. However, prior to conducting a survey, it is best to calculate the number of responses required to achieve a 5% margin of error at 95% confidence, and to aim for more responses than this.
Many online survey sites have a tool to calculate this based on the size of the population being surveyed. You can also use such a tool in reverse, to work out the margin of error at a given confidence level from the number of responses you actually have.
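As a rough sketch of what those calculators do (assuming a proportion estimated under simple random sampling, with the conservative p = 0.5), the margin-of-error calculation from a given number of responses could look like:

```python
import math

def margin_of_error(n, p=0.5, z=1.96, N=None):
    """Approximate margin of error for a proportion estimated from n responses.
    p=0.5 is the conservative (worst-case) choice; z=1.96 corresponds to ~95%
    confidence. If a finite population size N is supplied, apply the finite
    population correction (FPC)."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None and N > n:
        se *= math.sqrt((N - n) / (N - 1))  # finite population correction
    return z * se

# With the 230 complete responses from the question (large population assumed):
print(round(margin_of_error(230) * 100, 1))  # about 6.5 percentage points
```

So 230 responses gives roughly a 6.5% margin of error rather than the planned 5%, under these (strong) assumptions.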
A statistically acceptable response rate is one where the number of valid responses is at least the minimum sample size determined by the researcher (e.g., via Cochran's or Yamane's formula).
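For reference, a minimal sketch of both formulas mentioned above (note that Cochran's formula with a 5% margin of error and 95% confidence yields 385, which is essentially the 384 from the original question):

```python
import math

def cochran_n(e=0.05, p=0.5, z=1.96, N=None):
    """Cochran's formula for the minimum sample size for a proportion.
    e: margin of error, p: expected proportion, z: z-score for the confidence
    level. If N is given, apply Cochran's finite-population adjustment."""
    n0 = (z ** 2) * p * (1 - p) / e ** 2
    if N is not None:
        n0 = n0 / (1 + (n0 - 1) / N)
    return math.ceil(n0)

def yamane_n(N, e=0.05):
    """Yamane's simplified formula: n = N / (1 + N * e^2)."""
    return math.ceil(N / (1 + N * e ** 2))

print(cochran_n())        # infinite population: 385
print(yamane_n(100000))   # illustrative population of 100,000: 399
```

Both assume proportion (yes/no) data and simple random sampling, so they are a starting point, not a universal rule.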
Nonresponse is much more complicated than a single response-rate statistic. The observations obtained are not like observations obtained in a designed sample. Even "Big Data" has problems, because the distribution of the unavailable data makes a huge difference.
Also, the size of the respondents themselves matters. Consider the establishment survey data collected at the official energy statistics agency where I worked: some respondents reported values many times larger than the smallest responses.
Your problem is likely to be whether or not you are missing most or all of certain categories or subpopulations. You could research "response propensity groups." You may want to weight certain responses more than others.
So actually, 60% response might be OK or it might be horribly misleading. It depends on 'what' is missing as much as, or more than, 'how much.'
Do you have any auxiliary data on the entire population? If your survey variable is continuous, the auxiliary data could be some other continuous variable used as predictor data for a model-based approach; with other quantitative data you might work out some similar arrangement. Even if the auxiliary information only lets you stratify or otherwise group your data well, so that you can consider the impact of nonresponse on each subpopulation, that could be very helpful. As I was told regarding a mentor's (Ken Brewer's) mentor (Ken Foreman), as I recall, there is "no substitute" for being knowledgeable about the data with which you work. So consider what other information you have about the population on which you are collecting data. What else do you know about it that might tell you whether you are missing something important?
It depends entirely on both the context and the reason for nonresponse. Declaring a particular number good or bad without more knowledge of the context than you provide would be wrong.
A 60% response rate could put a serious dent in the internal and external validity of your study, so you may have to justify it. Why were you unable to reach the sample size? If the study participants were genuinely hard to reach, this could be justified. A 60% response could also mean that the participants are simply not interested in the study. It is possible that 60% would be accepted, but you could have a hard time with the reviewers. Just prepare a convincing justification with some literature to back it.
Whilst this question was asked more than a year before my response, I think it is a recurring question (and might be so for a while). As such, here are my two cents on the matter:
First off, whether a response rate is good or bad is not a straightforward question. It is highly dependent on many factors, some of which James R Knaub and Daniel Wright have highlighted. The bottom line is that sound justification is necessary whenever the target sample size is not attainable. This should also be factored into the analysis and interpretation of results, especially for inferential statistics, which largely rely on the sufficiency and representativeness of the sample collected.
Secondly, it is worth noting the effect of small samples: sampling bias. The general rule is that the lower the response rate, the higher the risk of sampling bias, which has dire consequences for the generalizability of inferential analysis. In addition, when calculating the minimum required sample (regardless of the method used), one should add about 10% for attrition or nonresponse. So, if the 384 already included such a 10% allowance, the underlying minimum is about 346, and your effective response rate becomes 66.6% rather than 59.9%, further improving your position.
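The arithmetic behind that adjustment, spelled out (treating 10% of the planned 384 as the attrition buffer):

```python
# If the planned 384 already included a 10% allowance for nonresponse,
# the underlying minimum sample is smaller, and the effective rate improves.
planned = 384
responses = 230

raw_rate = responses / planned             # responses vs. the padded target
base_minimum = planned * 0.9               # 10% of 384 was the attrition buffer
effective_rate = responses / base_minimum  # responses vs. the true minimum

print(round(raw_rate * 100, 1))        # 59.9
print(round(effective_rate * 100, 1))  # 66.6
```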
Finally, acceptable response rates differ across disciplines and study designs. For example, in a study in the general population whose aim is to describe some behaviour, a 60% response rate is okay. For a randomized clinical trial testing the comparative efficacy of a drug or procedure, a response rate lower than 90% is generally considered problematic.
Conclusion:
There's no magic number for response rate: the higher, the better. If you can surpass the required sample size with the allocated resources, by all means do it; this improves the credibility of your results. When you cannot achieve the minimum required sample size, explore the population composition vis-à-vis the sample composition, justify the low response rate, and exercise caution in your analysis and interpretation of results.
Peter, I have a problem with "If you can surpass the sample size with the allocated resources, by all means do it. This improves the credibility of your results." This may not help. If, for example, under your probability sampling and estimation or regression prediction approach you needed 200 observations, and you tried to collect 400, and had 100 nonrespondents, the fact that 300 responses is greater than 200 is not relevant. With all of those nonrespondents, you may well have terribly "biased" results. That is, the population you are inferring to is not the one you say it is. The problem is made worse in some situations more than others. For example, you could have a stratified random sample, and with all of those nonresponses, perhaps you may not have any responses for a particular stratum. You cannot just collect data, ignoring nonrespondents until you get a number higher than your methodology and standard deviation(s) called for to reach a desired standard error. In some cases people use substitution as a form of imputation to cover nonresponse, but it has to be done carefully, and may greatly degrade results. It is risky in a model-based approach, but even more risky for probability sampling. In the latter case, it violates the underlying approach.
So, in general, one cannot just keep trying to collect data, ignoring nonresponses until you obtain a certain number of responses or more. You have to have other information when you impute. (If you impute by regression, you need auxiliary/predictor data.)
BTW, the "sample size calculator" that Rajkumar Rajendram provided appears to be for yes/no (proportions) data, assuming simple random sampling, and likely no finite population correction factor. You cannot use it in other cases if those are the assumptions upon which it was built, which is typical for "online calculators." It mentions a maximum sample size when p=q=0.5, because that maximizes the standard deviation, a special phenomenon which applies to proportions, but not to continuous data.
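To illustrate the point about p = q = 0.5: the binomial variance term p(1-p) is maximized exactly at p = 0.5, which is why proportion-based calculators quote their "maximum sample size" there. A quick numeric check:

```python
import math

# Evaluate p*(1-p) over a grid of proportions and find where it peaks.
variances = {p / 10: (p / 10) * (1 - p / 10) for p in range(1, 10)}
best_p = max(variances, key=variances.get)

print(best_p)                           # 0.5
print(math.sqrt(variances[best_p]))     # 0.5, the maximum standard deviation
```

This special behaviour applies to proportions only; for continuous data there is no such built-in worst case, which is exactly why these calculators cannot be reused there.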
A survey response rate of 50% or higher should be considered excellent in most circumstances. A high response rate is likely driven by high levels of motivation to complete the survey, or a strong personal relationship between business and customer. Survey response rates in the 5% to 30% range are far more typical.
James R Knaub, I totally agree with you on this point: do not disregard nonresponse and simply continue collecting data until you achieve or surpass your minimum required sample size. My argument for 'the larger the sample, the better your accuracy', however, is predicated on the rule that 'the best sample size is the entire population'. I am not disregarding the composition of the sample but assuming, among other things, homogeneity of the population and randomness in the sampling technique. I agree that the biases are exacerbated in, say, stratified sampling, especially if one does not account for them in the effort to get a higher response rate.
I hope this clarifies my point: the larger sample size is in relation to the population, while upholding the sample composition and the assumptions of the sampling method employed, rather than chasing a higher response rate at the peril of violating the above considerations.
Well, a problem with "Big Data" is that even if you are trying to obtain as much of a finite population as you can and do collect most of it, the small part you miss can be so different from the rest of the population that your results can be very misleading.
For one thing, nonrespondents may have a very different average (unknown) value for an item, as compared to respondents. That means you cannot just collect whatever you can. Models using other known data as predictors, and other techniques may help, but just collecting whatever you can and leaving it at that could be very problematic.
(One technique is to go to nonrespondents for a second try, and if some respond, then you can compare those responses, under the same stratum or subpopulation, with first time responders, and see how much change you find. Then do this again with the still remaining nonrespondents and see the next change. This may give you an idea about the remaining nonresponders. That may work with a large sample, but I suppose it may just irritate the nonresponders and give you a mess. Also, those who would never respond may be very different. This is why nonresponse can be such a big problem, and one of unknown size that may not be guessed well. - In official establishment surveys, where response is required, this is not such a problem, but in many surveys it can be devastating.)
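A minimal sketch of that wave-comparison idea, with entirely made-up numbers: compare the mean response of first-wave respondents with the mean from those recovered on a follow-up attempt, within the same stratum, to gauge how different the remaining nonrespondents might be.

```python
# Hypothetical illustration: responses by collection wave within one stratum.
wave1 = [12.0, 15.5, 11.2, 14.8, 13.1]  # first-round responses (made-up)
wave2 = [18.4, 17.9, 19.6]              # recovered on the second attempt

mean1 = sum(wave1) / len(wave1)
mean2 = sum(wave2) / len(wave2)
drift = mean2 - mean1  # large drift hints the remaining nonrespondents differ

print(round(mean1, 2), round(mean2, 2), round(drift, 2))
```

A large drift between waves suggests a respondent-only estimate could be badly biased; little drift is mildly reassuring, though those who never respond may still differ.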
The percentage you retrieved can be accepted, but it is advisable to add 10%, 20%, or even 30% to your planned sample in order to prevent falling below your initial minimum sample size.
You usually cannot just substitute one respondent for another who did not respond unless you know it is all completely random, or you are substituting what you know is a similar unit. There could be a reason for the nonresponse that makes it different from your substitution, which would then provide a statistically biased result. This is true with a probability of selection (i.e., random design) approach. For a typical non-probability approach, inference is not really possible anyway, without covariates or a pseudorandom approach, which is quite involved and tenuous. For a model-based (regression/prediction) approach, you might have a better result, but you need good regressor data on the entire population. It is basically a type of imputation, with variance estimation possible.
Just sampling until you get a sample size that you want is not a good approach, unless you are using a model-based approach, and it can still be a substantial problem there.
Please see my responses above, and the response by Daniel Wright.
Whatever the "x," it is not correct to say x% response is always acceptable. (By the way, even at 100%, there can be very low data quality.) And it can depend on what is missing, such that you could find results from a lower response rate more accurate than from a higher one, in the same situation. This is especially true in establishment surveys with highly skewed data.
Dear Kavitha Selvaraja, a nonresponse rate of about 40% is very high. Sometimes the issue is access to samples that represent the population. You have to ensure the respondents have experience with the related questions. Another thing to consider is whether the questions are valid and reliable, and whether the respondents could easily understand your questionnaire. In any case, the response rate of online surveys tends to be lower than that of printed questionnaires.
What was your initial sample size? Did you already factor a 10% nonresponse rate into your initial sample size calculation? A high nonresponse rate calls the representativeness of your sample into question, and you might have violated some sample size assumptions. If so, you may need to capture this under the limitations of your study. Best
Achieving a high response rate is partly luck. Participation in research is a complicated process: sampling is one thing, response rate is another. For example, some respondents have gatekeepers who may allow or deny participation. The type of study also matters: qualitative research focuses on the richness of data rather than on respondent statistics, and the study may quickly reach saturation, which may warrant not pursuing further data collection. Low response is relative. State it in your limitations to assist other researchers with similar interests.