In the field of statistics, randomization refers to the act of randomly assigning subjects in a study to different treatment groups.
For example, suppose researchers recruit 100 subjects to participate in a study in which they hope to understand whether or not two different pills have different effects on blood pressure.
They may decide to use a random number generator to assign each subject to either Pill #1 or Pill #2.
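The assignment step above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name assign_treatments, the subject IDs 1 through 100, and the seed value are all assumptions made for the example. Each subject is assigned independently, so the two groups will usually be close in size but not exactly equal.

```python
import random
from collections import Counter

def assign_treatments(subject_ids, treatments=("Pill #1", "Pill #2"), seed=None):
    """Independently assign each subject to one treatment at random."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    return {sid: rng.choice(treatments) for sid in subject_ids}

# Hypothetical subject IDs 1..100 standing in for the recruited subjects
subjects = list(range(1, 101))
assignment = assign_treatments(subjects, seed=42)

# Group sizes are random, so they are typically near 50/50 rather than exactly equal
print(Counter(assignment.values()))
```

If exactly equal group sizes are required, one alternative is to shuffle the list of subjects and split it in half instead of assigning each subject independently.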
Benefits of Randomization
The point of randomization is to control for lurking variables – variables that are not directly included in an analysis, yet impact the analysis in some way.
For example, if researchers are studying the effects of two different pills on blood pressure then the following lurking variables could affect the analysis:
- Smoking habits
- Diet
- Exercise
By randomly assigning subjects to treatment groups, we make it likely that lurking variables are distributed similarly across the groups, so that on average they affect each group equally.
This means a difference in blood pressure between the groups is more plausibly attributed to the type of pill than to a lurking variable.
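A small simulation can make the balancing effect concrete. The sketch below is illustrative only: the 30% smoking rate, the number of repetitions, and the function name smoker_rate_gap are assumptions chosen for the example, not values from a real study. It repeatedly randomizes 100 simulated subjects and records the difference in smoker proportion between the two groups.

```python
import random

def smoker_rate_gap(n_subjects=100, smoker_prob=0.3, rng=None):
    """Randomize once and return (smoker rate in group 1) - (smoker rate in group 2)."""
    if rng is None:
        rng = random.Random()
    # Lurking variable: whether each simulated subject smokes (assumed 30% rate)
    smokers = [rng.random() < smoker_prob for _ in range(n_subjects)]
    # Random assignment to group 1 or group 2, independent of smoking status
    groups = [rng.choice((1, 2)) for _ in range(n_subjects)]
    g1 = [s for s, g in zip(smokers, groups) if g == 1]
    g2 = [s for s, g in zip(smokers, groups) if g == 2]
    if not g1 or not g2:
        return 0.0  # extremely unlikely edge case: one group is empty
    return sum(g1) / len(g1) - sum(g2) / len(g2)

rng = random.Random(0)
gaps = [smoker_rate_gap(rng=rng) for _ in range(2000)]
mean_gap = sum(gaps) / len(gaps)
print(f"Average gap in smoker rate across 2000 randomizations: {mean_gap:.4f}")
```

Any single randomization can leave some imbalance between the groups, but the gap averages out to roughly zero over many repetitions, which is the sense in which randomization balances lurking variables.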
Block Randomization
An extension of randomization is known as block randomization. This is the process of first separating subjects into blocks, then using randomization to assign subjects within blocks to different treatments.
For example, if researchers want to know whether two different pills affect blood pressure differently, they may first assign each subject to one of two blocks based on gender: Male or Female.
Then, within each block they can use randomization to randomly assign subjects to use either Pill #1 or Pill #2.
The benefit of this approach is that researchers can directly control for any effect that gender may have on blood pressure, since males and females may respond to each pill differently.
By using gender as a block, we remove it as a potential source of variation between the treatment groups. If blood pressure differs between the two pills, we know that gender is not the underlying cause of the difference.
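The block randomization procedure described above can be sketched as follows. The subject records, the function name block_randomize, and the 50/50 gender split are hypothetical and used only for illustration. Within each block this sketch shuffles the subjects and alternates treatments, which is one possible way to randomize within a block and gives each pill the same number of subjects per gender.

```python
import random
from collections import Counter, defaultdict

def block_randomize(subjects, block_key, treatments=("Pill #1", "Pill #2"), seed=None):
    """Group subjects into blocks, then randomly assign treatments within each block."""
    rng = random.Random(seed)

    # Step 1: separate subjects into blocks (here, by gender)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[subject[block_key]].append(subject)

    # Step 2: within each block, shuffle and alternate treatments
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)
        for position, subject in enumerate(members):
            assignment[subject["id"]] = treatments[position % len(treatments)]
    return assignment

# Hypothetical data: 50 male and 50 female subjects
subjects = [{"id": i, "gender": "Male" if i <= 50 else "Female"} for i in range(1, 101)]
assignment = block_randomize(subjects, "gender", seed=7)

# Count subjects for each (gender, pill) combination
counts = Counter((s["gender"], assignment[s["id"]]) for s in subjects)
print(counts)
```

Because each gender is split evenly between the two pills, a difference in blood pressure between the pills cannot be explained by one group containing more males or females than the other.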