Suppose we have a list of patients and want to select a representative sample from them. After calculating the sample size, we need to randomly select patients' files from the full list. What is the best randomization method for that?
The easiest method is simple randomization. If you assign subjects to two groups, A and B, every assignment is made purely at random. Although this is the most basic approach, the groups are likely to end up unequal in size when the total number of samples is small. For this reason, we recommend using this method when the total number of samples is more than 100.
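To illustrate the point about unequal group sizes, here is a minimal Python sketch of simple randomization. The function name, the seed parameter, and the sample size of 20 are my own choices for illustration, not something from the text above.

```python
import random

def simple_randomization(n_subjects, groups=("A", "B"), seed=None):
    """Assign each subject to a group independently, like a fair coin toss."""
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    return [rng.choice(groups) for _ in range(n_subjects)]

# With only 20 subjects the two counts are often unequal.
assignments = simple_randomization(20, seed=1)
print(assignments.count("A"), assignments.count("B"))
```

Running it a few times with different seeds shows how far the two counts can drift apart when the total is small.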
Block Randomization
We can create blocks so that sample numbers are assigned equally to each group, and then randomize the blocks.
If we put two subjects in one block (that is, a block size of two), there are two possible sequences, AB and BA. When we randomize them, each group receives the same number of samples. If the block size is four, there are six possible sequences: AABB, ABAB, ABBA, BAAB, BABA, and BBAA, and we randomize among them.
However, there is a disadvantage in that the executor can predict the next assignment. If the block size is two, we know that B comes after A within a block, and if the block size is four, we can predict every fourth assignment. This conflicts with the principle of randomization. To solve this problem, the allocator must hide the block size from the executor and use randomly mixed block sizes, for example two, four, and six.
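The possible sequences for a given block size can be checked by enumeration. The short Python sketch below does that; the helper name balanced_blocks is hypothetical.

```python
from itertools import permutations

def balanced_blocks(block_size, groups=("A", "B")):
    """List every distinct ordering of a block with equal numbers per group."""
    per_group = block_size // len(groups)
    letters = "".join(g * per_group for g in groups)
    # A set removes the duplicates that arise from repeated letters.
    return sorted({"".join(p) for p in permutations(letters)})

print(balanced_blocks(2))  # ['AB', 'BA']
print(balanced_blocks(4))  # ['AABB', 'ABAB', 'ABBA', 'BAAB', 'BABA', 'BBAA']
```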
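Randomly mixed block sizes could be sketched in Python as follows. This is only an illustration under my own assumptions: the function name and seed are hypothetical, and the list is simply cut off at the requested length, so the last block may be incomplete and leave a small imbalance.

```python
import random

def block_randomization(n_subjects, block_sizes=(2, 4, 6), seed=None):
    """Build an A/B allocation list from balanced blocks of random size."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_subjects:
        size = rng.choice(block_sizes)   # block size is hidden from the executor
        block = ["A", "B"] * (size // 2)  # equal numbers of A and B
        rng.shuffle(block)                # random order within the block
        sequence.extend(block)
    return sequence[:n_subjects]          # truncation may split the last block

allocation = block_randomization(30, seed=7)
print("".join(allocation))
print(allocation.count("A"), allocation.count("B"))
```

Because every complete block is balanced, the two groups can differ by at most half of the largest block size, here three subjects.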
Stratified Randomization
Randomization is important because it is almost the only way to balance all the other variables between the groups, apart from the factor (A or B) in which we are interested. However, some very important confounding variables can still end up unequally distributed between the two groups. This becomes more likely when the number of samples is small, and in that case we can stratify on those variables and balance the two groups within each stratum.
For example, if smoking status is very important, what will you do? First, choose one of the two randomization methods described above. Then prepare two separate random sequences, one for smokers and one for non-smokers. Smokers are assigned from the smokers' sequence, and non-smokers are assigned from the non-smokers' sequence. As a result, both smokers and non-smokers are divided evenly between the two groups.
So we can use 'simple randomization with/without stratification' or 'block randomization with/without stratification.' However, if there are many stratification variables, it becomes difficult to keep both groups balanced with the same numbers. Usually, two or fewer stratification variables are recommended.
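As a rough sketch of 'block randomization with stratification', the Python below keeps a separate block sequence for each stratum. The function name, the fixed block size of four, and the smoker labels are assumptions made for the example.

```python
import random

def stratified_block_randomization(strata, block_size=4, seed=None):
    """Assign A/B in enrollment order, with a separate block sequence per stratum.

    strata: one label per subject, e.g. "smoker" or "non-smoker".
    """
    rng = random.Random(seed)
    pending = {}  # stratum -> assignments left in that stratum's current block
    result = []
    for stratum in strata:
        if not pending.get(stratum):          # start a new block when needed
            block = ["A", "B"] * (block_size // 2)
            rng.shuffle(block)
            pending[stratum] = block
        result.append(pending[stratum].pop())
    return result

subjects = ["smoker", "non-smoker", "smoker", "smoker",
            "non-smoker", "non-smoker", "smoker", "non-smoker"]
for stratum, group in zip(subjects, stratified_block_randomization(subjects, seed=3)):
    print(stratum, group)
```

With four smokers and four non-smokers and a block size of four, each stratum ends up with exactly two subjects in A and two in B.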
EXAMPLES OF RANDOMIZATION
Although there are websites and common programs for randomization, let us use an Excel file. Download the attached file from http://cafe.naver.com/easy2know/6427. It is in a 'Read-only' state, but its functions are not limited; it is 'Read-only' only to prevent accidental modification.
Because of the way Excel works, any change makes it generate new random numbers. If we enter any number in place of '2' in the orange-colored cell and press the Enter key, it creates new random sequences. The sequences are the result of simple randomization. The numbers in the right column show the sample numbers. By default they go up to 1,000, but if needed you can extend them with the AutoFill function in Excel.
Actually, there will be no groups. It is just a matter of selecting a group of patients from a list. The selection should be randomized, so that each subject's file has the same chance of being selected. My thought was to generate a random sample using Excel; I know how to do that.
But I wanted to make sure that my steps and process are right, and that my sample won't be biased or have any selection error.
For the simple selection process that you need to have in this study, incorporating a random number to select patient records is the best approach.
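If Excel is not a requirement, the same selection can be sketched in a few lines of Python. The patient IDs, the sample size of 50, and the function name below are made up for illustration.

```python
import random

def simple_random_sample(patient_ids, sample_size, seed=None):
    """Draw sample_size records without replacement; every record is equally likely."""
    rng = random.Random(seed)  # record the seed so the draw can be reproduced
    return rng.sample(patient_ids, sample_size)

# Hypothetical sampling frame of 1,000 patient IDs.
patient_ids = list(range(1001, 2001))
selected = simple_random_sample(patient_ids, 50, seed=2024)
print(sorted(selected)[:10])
```

Keeping the seed and the full list on file lets someone else repeat the draw and confirm that no records were hand-picked.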
You could, for example, use the last four digits of the patients' ID numbers and select every 20th one (e.g., patient numbers 1020, 1040, 1060, etc.). But you should be sure that the patient ID numbers don't cluster on some other factor, such as all patients with numbers under 5000 being from only one district or one limited age group. You want to ensure that the pool of patients you are selecting from represents the traits you want, and then ensure that each patient has a theoretically equal chance of being sampled. You could also use day of birth.
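Selecting every 20th record might look like the Python sketch below. The random starting position is my own addition (the suggestion above does not mention it); it is a common way to give each record the same chance of being picked. The names and ID range are hypothetical.

```python
import random

def systematic_sample(patient_ids, interval=20, seed=None):
    """Take every interval-th record, starting from a random position."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # assumption: random start in 0..interval-1
    return patient_ids[start::interval]

# Hypothetical list of 1,000 IDs, already sorted by ID number.
patient_ids = list(range(1000, 2000))
print(systematic_sample(patient_ids, interval=20, seed=5)[:5])
```

This only avoids bias if the ordering of the list is unrelated to the traits being studied, which is the clustering concern raised above.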
And, as a professor once told me, "randomization isn't always your friend." Especially if your sample size is small, you could end up with an unbalanced sample just by chance, even if it was chosen randomly: you could have too many females, too few older people, and so on. To avoid this, it is useful to randomize on more than one variable. For example, you might select half your patients randomly from the male patients and half randomly from the female patients. That way you ensure that half are male and half are female, but within each gender they are randomly selected, based on their day of birth for example.
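The half-male, half-female idea could be sketched in Python like this. It draws at random within each group rather than by day of birth, and the record layout (a dict with "id" and "sex" keys) and function name are assumptions for the example.

```python
import random

def stratified_sample(patients, per_stratum, key="sex", seed=None):
    """Draw the same number of records at random from each stratum."""
    rng = random.Random(seed)
    groups = {}
    for patient in patients:
        groups.setdefault(patient[key], []).append(patient)
    sample = []
    for label in sorted(groups):  # fixed order keeps the draw reproducible
        sample.extend(rng.sample(groups[label], per_stratum))
    return sample

# Hypothetical list: 60 female and 40 male patient records.
patients = [{"id": i, "sex": "F" if i < 60 else "M"} for i in range(100)]
chosen = stratified_sample(patients, per_stratum=10, seed=11)
print(sum(p["sex"] == "F" for p in chosen), sum(p["sex"] == "M" for p in chosen))
```

Note that rng.sample raises an error if a stratum has fewer records than per_stratum, so the group sizes should be checked first.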