A multimodal distribution is a probability distribution with two or more modes.
If you create a histogram to visualize a multimodal distribution, you’ll notice that it has more than one peak.
If a distribution has exactly two peaks, it’s considered a bimodal distribution, which is a specific type of multimodal distribution.
This is in contrast to a unimodal distribution, which has only one peak.
Although unimodal distributions like the normal distribution are used most often to explain topics in statistics, multimodal distributions appear fairly often in practice, so it’s useful to know how to recognize and analyze them.
Examples of Multimodal Distributions
Here are a few examples of multimodal distributions.
Example 1: Distribution of Exam Scores
Suppose a professor gives an exam to his class. Some of the students studied, while others did not. When the professor creates a histogram of the exam scores, it follows a multimodal distribution with one peak around low scores for students who didn’t study and another peak around high scores for students who did study.
Example 2: Height of Different Plant Species
Suppose a scientist goes around a field and measures the height of different plants. Without realizing it, she measures the height of three different species – one that is quite tall, another that is of medium height, and another that is quite short.
When she creates a histogram to visualize the distribution of heights, she finds that it is multimodal – each peak represents the most common height of the three different species.
Example 3: Distribution of Customers
A restaurant owner tracks how many customers visit each hour. When he creates a histogram to visualize the distribution of customers, he finds that the distribution is multimodal: there is one peak during lunch hours and another during dinner hours.
What Causes Multimodal Distributions?
Multimodal distributions typically have one of two underlying causes:
1. Multiple groups are lumped together.
Multimodal distributions can occur when you collect data for multiple groups without realizing it.
For example, if a scientist unknowingly measures the height of three different plant species located in the same field, the distribution of all the plants will appear multimodal when placed on the same histogram.
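The effect of pooling groups can be sketched with a small simulation. The species names, mean heights, standard deviations, and sample sizes below are all assumptions made up for illustration; they are not taken from real measurements.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical species: (mean height in cm, standard deviation in cm).
species = {"short": (25, 3), "medium": (55, 3), "tall": (85, 3)}

# Pool 300 simulated plants per species into one unlabeled sample.
heights = [
    random.gauss(mean, sd)
    for mean, sd in species.values()
    for _ in range(300)
]

# Count heights in 10 cm bins covering 0-100 cm (out-of-range values are clamped).
bin_width = 10
counts = [0] * 10
for h in heights:
    index = min(max(int(h // bin_width), 0), len(counts) - 1)
    counts[index] += 1

# A bin is a peak if it holds more plants than both neighboring bins
# (a missing neighbor at either end is treated as an empty bin).
padded = [0] + counts + [0]
peaks = [
    i - 1
    for i in range(1, len(padded) - 1)
    if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
]

print("peak bins start at (cm):", [i * bin_width for i in peaks])
```

With groups this well separated, the pooled sample should show one peak bin per species, even though no single plant is labeled with its species. Groups that overlap more heavily can blur into fewer visible peaks.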
2. There exists an underlying phenomenon.
Multimodal distributions can also occur because of some underlying phenomenon.
For example, the number of customers who visit a restaurant each hour follows a multimodal distribution since people tend to eat out during two distinct times: lunch and dinner. This underlying human behavior causes the multimodal distribution.
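As a rough sketch of the restaurant case, the snippet below looks for local peaks in hourly customer counts. The counts are hypothetical numbers invented for this illustration, and the "busier than both neighboring hours" rule is just one simple way to define a peak.

```python
# Hypothetical customer counts per opening hour (24-hour clock), for illustration only.
customers_by_hour = {
    10: 8, 11: 21, 12: 47, 13: 39, 14: 15, 15: 9,
    16: 12, 17: 28, 18: 52, 19: 58, 20: 34, 21: 14,
}

hours = sorted(customers_by_hour)

# An hour is a local peak if it is busier than the hours on either side.
peak_hours = [
    hour
    for prev, hour, nxt in zip(hours, hours[1:], hours[2:])
    if customers_by_hour[hour] > customers_by_hour[prev]
    and customers_by_hour[hour] > customers_by_hour[nxt]
]

print(peak_hours)  # [12, 19]
```

Here the two peaks fall at a lunch hour and a dinner hour, which comes from customer behavior rather than from mixing separate groups of data.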
How to Analyze Multimodal Distributions
We often describe distributions using the mean or median since these give us an idea of where the “center” of the distribution is located.
Unfortunately, the mean and median are often not useful for describing a bimodal distribution. For example, the mean exam score for students in the example above is 81.
However, very few students actually scored close to 81. In this case, the mean is misleading. Most students actually scored around 74 or around 88.
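To make this concrete, here is a minimal sketch using a small set of hypothetical scores. The individual values are invented so that the clusters sit near 74 and 88 with an overall mean of 81; they are not the actual exam data.

```python
from statistics import mean

# Hypothetical exam scores, chosen so the two clusters sit near 74 and 88.
scores = [72, 73, 74, 74, 74, 75, 76, 86, 87, 88, 88, 88, 89, 90]

overall_mean = mean(scores)

# Collect the scores that fall within 3 points of the overall mean.
near_mean = [s for s in scores if abs(s - overall_mean) <= 3]

print(f"overall mean: {overall_mean:.1f}")            # overall mean: 81.0
print(f"scores within 3 points: {len(near_mean)}")    # scores within 3 points: 0
```

In this made-up sample, no student scored within 3 points of the mean, so the mean describes a score that almost nobody earned.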
A better way to analyze and interpret bimodal distributions is to simply break the data into two separate groups, then analyze the location of the center and the spread for each group individually.
For example, we may break up the exam scores into “low scores” and “high scores” and then find the mean and standard deviation for each group.
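A minimal sketch of this split is shown below. The scores are hypothetical, and the cutoff of 81 is an assumption picked by eye as the gap between the two clusters; real data may need a more careful choice of split point.

```python
from statistics import mean, stdev

# Hypothetical exam scores with clusters near 74 and 88.
scores = [72, 73, 74, 74, 74, 75, 76, 86, 87, 88, 88, 88, 89, 90]

cutoff = 81  # assumed split point between the two clusters

low = [s for s in scores if s < cutoff]
high = [s for s in scores if s >= cutoff]

# Summarize each group separately with its mean and sample standard deviation.
for label, group in (("low scores", low), ("high scores", high)):
    print(f"{label}: n={len(group)}, mean={mean(group):.1f}, sd={stdev(group):.2f}")

# For comparison, the spread of the pooled data is much larger.
print(f"all scores: sd={stdev(scores):.2f}")
```

For these values, each group has a standard deviation of about 1.29, while the pooled standard deviation is about 7.37. The pooled figure mostly reflects the distance between the two groups, not the spread within either one.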
When calculating summary statistics like the mean, median, or standard deviation for a given distribution, be sure to visualize the distribution first to determine whether it is unimodal or multimodal.
If a distribution is multimodal, it can be misleading to describe it using a single mean, median, or standard deviation.
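If no plotting tool is at hand, a quick text histogram is often enough for this visual check. The sketch below assumes integer scores and a bin width of 4, both chosen only for illustration; the helper name `bin_counts` is hypothetical.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Return {bin_start: count} for equal-width bins of integer values."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical exam scores with clusters near 74 and 88.
scores = [72, 73, 74, 74, 74, 75, 76, 86, 87, 88, 88, 88, 89, 90]

bin_width = 4
histogram = bin_counts(scores, bin_width)

# Print one row per bin, including empty bins, so gaps between peaks are visible.
for start in range(min(histogram), max(histogram) + 1, bin_width):
    bar = "#" * histogram.get(start, 0)
    print(f"{start:3d}-{start + bin_width - 1:3d} | {bar}")
```

Two tall bars separated by short or empty rows suggest that a single mean or standard deviation would be a poor summary. The apparent number of peaks can change with the bin width, so it is worth trying more than one.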