With stratified sampling (you probably mean stratified random sampling), you divide the population into subpopulations (strata) that generally have less variance within them and, ideally, fairly large differences from one stratum to another. This reduces overall variance compared to simple random sampling. Even if your strata are really just different categories, this usually decreases overall variance. If you also want to publish estimates for each category, that is an additional problem, generally requiring more data.
In cluster sampling, you start by breaking your population into groups (clusters), but now these clusters become the sampling unit. You draw some of them, and either census each cluster selected, or do a second stage to the sampling. Multistage sampling is thus a bit more complicated.
Unlike stratified random sampling, cluster sampling is usually less efficient than simple random sampling: for the same precision, it needs a larger overall sample. However, the larger sample size is often offset by data collection considerations. If in-person data collection is needed, cluster sampling reduces the number of locations to visit. Because many locations may not be visited at all, though, the random sampling of clusters had better be good.
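To put a rough number on "less efficient", one common approximation (not part of the answer above) is the design effect for equal-sized clusters, deff = 1 + (m - 1) * rho, where m is the cluster size and rho is the intraclass correlation. The sketch below assumes one-stage cluster sampling with equal cluster sizes; the function names and the example values are hypothetical.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Approximate design effect, 1 + (m - 1) * rho, assuming equal-sized clusters."""
    return 1.0 + (cluster_size - 1.0) * icc


def cluster_sample_size(n_srs: int, cluster_size: float, icc: float) -> int:
    """Units needed under cluster sampling to roughly match an SRS of size n_srs."""
    needed = n_srs * design_effect(cluster_size, icc)
    # Round first to guard against floating-point noise before taking the ceiling.
    return math.ceil(round(needed, 9))


# Hypothetical values: clusters of 5 units, intraclass correlation 0.25.
deff = design_effect(5, 0.25)
n_cluster = cluster_sample_size(100, 5, 0.25)
```

When units within a cluster are alike (rho well above zero), the design effect grows with cluster size, which is why the travel savings have to be weighed against the extra units needed.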
Note: The above refers to random sampling for design-based sampling and estimation purposes. Survey statistics, as with other statistics, can make good use of models, but regression modeling is beyond the scope of this discussion.
Cheers - Jim
PS - So for stratified random sampling, you divide your population into parts, and sample from each part. For cluster (random) sampling, you divide your population into parts, and treat those parts as the units from which you draw a random sample.
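The distinction in this PS can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the population, the group names, and both function names are hypothetical, and each draw is a simple random sample without replacement.

```python
import random


def stratified_sample(strata, n_per_stratum, rng):
    """Sample units from EVERY part (stratum) of the population."""
    return {
        name: rng.sample(units, min(n_per_stratum, len(units)))
        for name, units in strata.items()
    }


def cluster_sample(clusters, n_clusters, rng):
    """Sample whole parts (clusters), then take every unit in each selected part."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population: 4 groups of 5 units each.
groups = {f"g{i}": [f"g{i}_u{j}" for j in range(5)] for i in range(4)}
rng = random.Random(0)

strat = stratified_sample(groups, 2, rng)  # all 4 groups appear, 2 units from each
clus = cluster_sample(groups, 2, rng)      # only 2 groups appear, all 5 units from each
```

The same partition of the population is used in both calls; only what gets randomly selected (units within every part, versus the parts themselves) changes.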
It may not be very helpful to you, so you may just want to think about my first answer. But it just occurred to me, and I find it interesting, that a stratified random sample is like a two-stage cluster sample in which the first stage is a census. Further, a one-stage cluster sample is a two-stage cluster sample in which the second stage is a census of each selected cluster.
I almost agree with James's explanation of the difference between stratified sampling and cluster sampling, but it may leave people a little confused about cluster sampling versus two-stage sampling. Theoretically speaking, stratified sampling and cluster sampling are two special cases of two-stage sampling:
a. Stratified sampling: when all 1st-stage units are selected (n=N);
b. Cluster sampling: when all 2nd-stage units in the selected 1st-stage units are selected (m=M).
For a clear understanding, you need to read the estimation equations for two-stage sampling. In some references I have read, two-stage sampling and cluster sampling are treated as the same or are confused.
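The two special cases can be illustrated with a single two-stage selection routine. This is a minimal sketch, assuming equal-sized groups and simple random sampling without replacement at both stages; the function name and the toy population are hypothetical, and it covers selection only, not the estimation equations.

```python
import random


def two_stage_sample(groups, n, m, rng):
    """Stage 1: SRS of n groups out of N. Stage 2: SRS of m units out of M in each."""
    chosen = rng.sample(sorted(groups), n)
    return {g: rng.sample(groups[g], m) for g in chosen}


# Hypothetical population: N = 4 groups, M = 5 units per group.
groups = {f"g{i}": [f"g{i}_u{j}" for j in range(5)] for i in range(4)}
N, M = len(groups), 5
rng = random.Random(1)

as_stratified = two_stage_sample(groups, n=N, m=2, rng=rng)  # n = N: every group sampled
as_cluster = two_stage_sample(groups, n=2, m=M, rng=rng)     # m = M: selected groups censused
general = two_stage_sample(groups, n=2, m=2, rng=rng)        # n < N and m < M
```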
Interesting, Wei-Sheng. It looks like you repeated my second answer, which actually I expect is not the way you would normally see it explained, but as I said, that works also.
Here is a link to a bibliography I put together for another purpose, but it contains a number of survey statistics textbooks, copies of which I own, which I expect make good references here also. I know that Cochran (1977) does a really good job on stratified random and cluster sampling.
- To be clear, you don't actually have to mention stages of sampling to explain stratification, and you don't have to mention it for one-stage cluster sampling either. You can just say, as I did the first time, that "...for stratified random sampling, you divide your population into parts, and sample from each part. For [one-stage] cluster (random) sampling, you divide your population into parts, and treat those parts as the [primary sampling] units from which you draw a random sample."
Data Handout Bibliography for "Comparison of Model-Based to Desig...
Conceptually, your discussion sheds light on the difference between stratified sampling and cluster sampling. But in real-life applications, I have many doubts about classifying a given sampling framework as stratified or cluster sampling.
For example, I want to study the perceptions of students in the city about the government's present education policy. There are 10 colleges in the city, with a total of 500 students.
How should I develop the sampling framework if I apply stratified sampling or cluster sampling?
If you are trying to compare colleges, then you need data on each, which would be a kind of stratification, but really just a series of simple random samples.
If you just want the optimal overall sample, then that would mean stratified random sampling, unless you had some size measure for unequal probability sampling or modeling. Anyway, stratified random sampling is more efficient than cluster sampling. Cluster sampling could be easier, if you have to travel to administer this survey, but requires a larger overall sample size.
So it sounds like you want stratified random sampling, unless you have logistics problems, generally regarding travel, and would like to reduce the number of trips you make.
It sounds like, in your case, the clusters would probably be the ten colleges. If you only have a universe of N=10 from which to draw for your (first stage) selection, that sounds too coarse to work well. If however they were 10 strata instead, and you drew from each and every one of them, I think results would more likely be much better.
So, hopefully you can use stratified random sampling.
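If the ten colleges are treated as strata, one simple way to split the sample among them is proportional allocation. The sketch below is only an illustration: the enrollment figures (which sum to the stated 500 students) and the total sample size of 100 are made-up assumptions, and the function name is hypothetical. Other allocations, such as Neyman allocation, would need variance information.

```python
def proportional_allocation(stratum_sizes, n_total):
    """Split n_total across strata in proportion to stratum size (largest remainder)."""
    population = sum(stratum_sizes.values())
    exact = {k: n_total * size / population for k, size in stratum_sizes.items()}
    alloc = {k: int(x) for k, x in exact.items()}  # round each share down first
    leftover = n_total - sum(alloc.values())
    # Hand the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


# Hypothetical enrollments for the 10 colleges (total 500 students).
enrollment = dict(zip(
    [f"college_{i}" for i in range(1, 11)],
    [80, 70, 60, 60, 50, 50, 40, 40, 30, 20],
))
allocation = proportional_allocation(enrollment, n_total=100)
```

Within each college, the allocated number of students would then be drawn by simple random sampling from that college's student list.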
Dear Srikanth, Good morning from Thessaloniki, Greece.
The main difference between Stratified Random Sampling (SRS) and Cluster Sampling is this:
In SRS you have to decide on the strata (using a criterion of low variance within each stratum), while in Cluster Sampling the clusters are ready (already defined), waiting for you to take a random sample from their set. They are the sampling units.
Dr Nikolaos FARMAKIS Assoc Professor On Statistics