The range and standard deviation are two ways to measure the spread of values in a dataset.
The range represents the difference between the minimum value and the maximum value in a dataset.
The standard deviation measures the typical deviation of individual values from the mean value. It is calculated as:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
where:
- $\sum$: A symbol that means “sum”
- $x_i$: The value of the ith observation in the sample
- $\bar{x}$: The mean of the sample
- $n$: The sample size
For example, suppose we have the following dataset:
Dataset: 1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32
The range is calculated as: 32 - 1 = 31.
We can use a calculator to find that the standard deviation is 9.25.
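Both values can also be reproduced in a few lines of Python. The following is a minimal sketch that uses only the standard library. The helper names `value_range` and `sample_std` are illustrative and do not come from any particular package. It applies the sample standard deviation formula directly and cross-checks the result against `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

data = [1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32]


def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


def sample_std(values):
    """Sample standard deviation using the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations of each value from the mean
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


print(value_range(data))                  # 31
print(round(sample_std(data), 2))         # 9.25
print(round(statistics.stdev(data), 2))   # 9.25
```

Writing the formula out by hand is only for illustration. In practice, `statistics.stdev` gives the same result with less code.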
Range vs. Standard Deviation: Similarities & Differences
The range and standard deviation share the following similarity:
- Both metrics measure the spread of values in a dataset.
However, the range and standard deviation differ in what they measure:
- The range tells us the difference between the largest and smallest value in the entire dataset.
- The standard deviation tells us the typical deviation of individual values from the mean value in the dataset.
Range vs. Standard Deviation: When to Use Each
We should use the range when we’re interested in understanding the difference between the largest and smallest values in a dataset.
For example, suppose a professor administers an exam to 100 students. She can use the range to understand the difference between the highest score and the lowest score received by all of the students in the class.
Conversely, we should use the standard deviation when we’re interested in understanding how far the typical value in a dataset deviates from the mean value.
For example, if a professor administers an exam to 100 students, she can use the standard deviation to quantify how far the typical exam score deviates from the mean exam score.
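It’s worth noting that we don’t have to choose between the range and the standard deviation to describe the spread of values in a dataset. We can use both metrics, since each describes a different aspect of the spread: the range captures the distance between the extremes, while the standard deviation captures the typical deviation from the mean.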
The Drawbacks of the Range & Standard Deviation
The range and the standard deviation share one drawback: both are sensitive to outliers.
To illustrate this, consider the following dataset:
Dataset: 1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32
We can calculate the following values for the range and the standard deviation of this dataset:
- Range: 31
- Standard Deviation: 9.25
However, suppose the dataset had one extreme outlier:
Dataset: 1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32, 378
We could use a calculator to find the following metrics for this dataset:
- Range: 377
- Standard Deviation: 85.02
Notice how both the range and the standard deviation change dramatically as a result of one outlier.
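The same comparison can be sketched in Python with the standard library. The helper `spread_summary` is a hypothetical name introduced here for illustration. It returns the range and the sample standard deviation (rounded to two decimal places) for any list of values.

```python
import statistics

original = [1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32]
with_outlier = original + [378]  # same dataset plus one extreme value


def spread_summary(values):
    """Return (range, sample standard deviation rounded to 2 places)."""
    value_range = max(values) - min(values)
    std_dev = round(statistics.stdev(values), 2)
    return value_range, std_dev


print(spread_summary(original))      # (31, 9.25)
print(spread_summary(with_outlier))  # (377, 85.02)
```

Running both datasets through the same function makes it easy to see that a single added value is responsible for the entire change in both metrics.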
Although the range and standard deviation are useful for gauging how spread out the values in a dataset are, you should first check whether the dataset contains outliers that are influencing these metrics. Otherwise, both the range and the standard deviation can be misleading.