The interquartile range, often denoted “IQR”, measures the spread of the middle 50% of a dataset. It is calculated as the third quartile (the 75th percentile) minus the first quartile (the 25th percentile) of the dataset: IQR = Q3 - Q1.
Fortunately, it’s easy to calculate the interquartile range of a dataset in Python using the numpy.percentile() function.
This tutorial shows several examples of how to use this function in practice.
Example 1: Interquartile Range of One Array
The following code shows how to calculate the interquartile range of values in a single array:
import numpy as np

#define array of data
data = np.array([14, 19, 20, 22, 24, 26, 27, 30, 30, 31, 36, 38, 44, 47])

#calculate interquartile range
q3, q1 = np.percentile(data, [75, 25])
iqr = q3 - q1

#display interquartile range
iqr

12.25
The interquartile range of this dataset turns out to be 12.25. This is the spread of the middle 50% of values in this dataset.
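To see where 12.25 comes from, the calculation can be reproduced by hand. The sketch below assumes the default behavior of numpy.percentile(), which interpolates linearly between the two closest data points. The helper name percentile_linear is hypothetical and used only for illustration.

```python
import math

def percentile_linear(values, p):
    """Hypothetical helper: p-th percentile via linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile among indices 0..n-1
    pos = (len(ordered) - 1) * p / 100
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    # Interpolate between the two neighboring data points
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

data = [14, 19, 20, 22, 24, 26, 27, 30, 30, 31, 36, 38, 44, 47]

q1 = percentile_linear(data, 25)  # position 3.25, between 22 and 24
q3 = percentile_linear(data, 75)  # position 9.75, between 31 and 36
iqr = q3 - q1

print(q1, q3, iqr)  # 22.5 34.75 12.25
```

With 14 values, the 25th percentile sits at position 3.25 (between 22 and 24) and the 75th at position 9.75 (between 31 and 36), which gives 34.75 - 22.5 = 12.25.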
Example 2: Interquartile Range of a Data Frame Column
The following code shows how to calculate the interquartile range of a single column in a data frame:
import numpy as np
import pandas as pd

#create data frame
df = pd.DataFrame({'rating': [90, 85, 82, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 8, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#calculate interquartile range of values in the 'points' column
q75, q25 = np.percentile(df['points'], [75, 25])
iqr = q75 - q25

#display interquartile range
iqr

5.75
The interquartile range of values in the points column turns out to be 5.75.
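As an alternative that is not part of the original example, the same value can be obtained with the pandas Series.quantile() method, which also interpolates linearly by default. This is a minimal sketch using only the points column.

```python
import pandas as pd

# Same 'points' values as in the data frame above
df = pd.DataFrame({'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19]})

# quantile() takes fractions between 0 and 1 rather than percentages
q25 = df['points'].quantile(0.25)
q75 = df['points'].quantile(0.75)
iqr = q75 - q25

print(iqr)
```

Note that quantile() expects 0.25 and 0.75, whereas numpy.percentile() expects 25 and 75.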
Example 3: Interquartile Range of Multiple Data Frame Columns
The following code shows how to calculate the interquartile range of multiple columns in a data frame at once:
import numpy as np
import pandas as pd

#create data frame
df = pd.DataFrame({'rating': [90, 85, 82, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 8, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#define function to calculate interquartile range
def find_iqr(x):
    return np.subtract(*np.percentile(x, [75, 25]))

#calculate IQR for 'rating' and 'points' columns
df[['rating', 'points']].apply(find_iqr)

rating    6.75
points    5.75
dtype: float64

#calculate IQR for all columns
df.apply(find_iqr)

rating      6.75
points      5.75
assists     2.50
rebounds    3.75
dtype: float64
Note: We use the pandas.DataFrame.apply() function to calculate the IQR for multiple columns in the data frame above.
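For comparison, here is a sketch of an alternative that skips the custom function: pandas.DataFrame.quantile() computes a quantile for every column at once, so subtracting two calls gives the IQR per column. This is not the approach used in the example above, and the variable name iqr_by_column is chosen here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'rating': [90, 85, 82, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 8, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# Each quantile() call returns a Series indexed by column name,
# so the subtraction is aligned column by column
iqr_by_column = df.quantile(0.75) - df.quantile(0.25)

print(iqr_by_column)
```

The result should match the output of df.apply(find_iqr), since both approaches use linear interpolation by default.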