I am kind of struggling to understand cluster randomization. How does it work? What do I need to get it done? What exactly changes during the statistics? Any help would be greatly appreciated!
In cluster randomization, groups of subjects (as opposed to individual subjects) are randomized. Cluster randomized controlled trials are also known as cluster randomized trials, group-randomized trials, and place-randomized trials.
Let us say we would like to assess the effect of two teaching methods (Method A and Method B) in a city. There may be many schools, and it is not feasible to give each individual student in a class a different teaching method from their classmates. So cluster randomization is used in this case, with the school as the unit of randomization: one school is taught by Method A and another by Method B. The same randomization technique is used to generate the allocation list as in an individually randomized trial, but the unit is the school. In cluster randomization the unit can be a school, a community, or any other group of people.
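A minimal sketch of what "the unit is the school" could look like in practice. The school names, the seed, and the helper `assign_clusters` are all made up for illustration; a real trial would normally use a pre-specified (and often stratified or blocked) allocation procedure.

```python
import random

def assign_clusters(cluster_ids, arms=("A", "B"), seed=None):
    """Randomly allocate whole clusters (hypothetical helper, not from any library)."""
    rng = random.Random(seed)      # seeded so the allocation list is reproducible
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled clusters out to the arms in turn,
    # so arm sizes differ by at most one cluster.
    return {cluster: arms[i % len(arms)] for i, cluster in enumerate(shuffled)}

# Illustrative data: eight schools, three students.
schools = [f"school_{i}" for i in range(1, 9)]
allocation = assign_clusters(schools, seed=2024)

# Students are never randomized themselves; they inherit their school's arm.
students = [("student_1", "school_3"), ("student_2", "school_3"), ("student_3", "school_7")]
student_arm = {student: allocation[school] for student, school in students}
print(allocation)
print(student_arm)
```

Note that two students from the same school always end up in the same arm, which is exactly what distinguishes this from individual randomization.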
EXAMPLE:
1. To evaluate the effectiveness of vitamin A supplements on childhood mortality.
450 villages in Indonesia were randomly assigned to either participate in a vitamin A supplementation scheme or serve as a control. One-year mortality rates were compared in the two groups. Sommer et al. (1986)
2. The purpose of the WHO antenatal trial was to compare the impact of two programmes of antenatal care on the health of mothers and newborns.
Unit of Randomization: Antenatal care clinic
ISSUES:
Randomized consent designs have proved quite controversial because of concern for the ethical implications of randomizing subjects prior to obtaining their consent.
Entire clusters, rather than just individuals, may be lost to follow-up
Interventions are often applied on a group basis
Increasing cluster size provides diminishing returns in statistical power, e.g. increasing cluster size beyond 100 provides little statistical gain. Donner and Klar (2004)
A key property of cluster randomization trials is ... But under unrestricted cluster randomization, the inflation factor 1+ ... work sites, classrooms, communities).
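To illustrate the diminishing returns from larger clusters, here is a rough sketch using the standard design effect for equal cluster sizes, 1 + (m - 1) * ICC, where m is the cluster size and ICC is the intra-cluster correlation. The values k = 20 clusters and ICC = 0.05 are assumptions picked only for illustration, and the function names are my own.

```python
def design_effect(m, icc):
    """Standard variance inflation factor for equal cluster size m."""
    return 1 + (m - 1) * icc

def effective_sample_size(k, m, icc):
    """Number of independent observations that k clusters of size m are 'worth'."""
    return k * m / design_effect(m, icc)

k, icc = 20, 0.05   # assumed values, for illustration only
for m in (10, 50, 100, 200, 1000):
    print(m, round(effective_sample_size(k, m, icc), 1))
```

As m grows, the effective sample size approaches the ceiling k / ICC (here 400) no matter how many individuals each cluster contributes. This is why adding clusters usually buys more power than enlarging the clusters you already have.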
If the individuals from whom you collect data are not independent of each other, but instead form groups ("clusters") of individuals who are (suspected to be) correlated (for example because they are all treated at the same hospital or by the same doctor in a health intervention trial), then the variance of your estimates will be inflated by a certain factor (the "design effect" or "variance inflation factor") due to the correlation within these groups (the "intra-cluster correlation"). Put differently, your effective sample size is smaller than the number of individuals you measured.
Basically you do three things to account for this:
- randomize independent clusters instead of dependent individuals (meaning that all individuals which belong to the same cluster are allocated to the same randomly chosen study arm),
- increase your sample size by the design effect and
- use an appropriate model for analyzing your data (often a hierarchical regression model with clustering as random effect).
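The second step in the list above can be sketched as a small calculation. This assumes equal cluster sizes and the standard design effect 1 + (m - 1) * ICC; the input numbers (200 per arm, clusters of 30, ICC of 0.02) are invented for illustration, and `clusters_per_arm` is a hypothetical helper, not a library function.

```python
import math

def clusters_per_arm(n_individual, m, icc):
    """Clusters needed per arm, given the per-arm sample size an
    individually randomized trial would need (n_individual),
    the cluster size m, and the intra-cluster correlation icc."""
    deff = 1 + (m - 1) * icc            # design effect
    n_inflated = n_individual * deff    # individuals needed per arm after inflation
    return math.ceil(n_inflated / m)    # round up to whole clusters

# Assumed inputs, for illustration only.
print(clusters_per_arm(n_individual=200, m=30, icc=0.02))
```

For the third step, one common choice is a mixed model with a random intercept per cluster, which most statistics packages offer; which model is appropriate depends on the outcome type and the number of clusters.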