The Hamming distance between two vectors of equal length is the number of positions at which the corresponding elements differ.
For example, suppose we have the following two vectors:
x = [1, 2, 3, 4]
y = [1, 2, 5, 7]
The Hamming distance between the two vectors would be 2, since this is the total number of corresponding elements that have different values.
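The definition can be checked directly with a few lines of plain Python. The following is a minimal sketch, not part of any library; the function name hamming_distance is a hypothetical helper chosen for illustration.

```python
def hamming_distance(x, y):
    """Count the positions at which x and y hold different values."""
    # The Hamming distance is only defined for sequences of equal length
    if len(x) != len(y):
        raise ValueError("Hamming distance requires sequences of equal length")
    # Compare elements pairwise and count the mismatches
    return sum(a != b for a, b in zip(x, y))


x = [1, 2, 3, 4]
y = [1, 2, 5, 7]

print(hamming_distance(x, y))  # 2
```

The two vectors differ only in the third and fourth positions, so the count is 2.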
To calculate the Hamming distance between two arrays in Python, we can use the hamming() function from the scipy.spatial.distance module, which uses the following syntax:
scipy.spatial.distance.hamming(array1, array2)
Note that this function returns the proportion (a value between 0 and 1) of corresponding elements that differ between the two arrays, not the count.
Thus, to obtain the Hamming distance we can simply multiply by the length of one of the arrays:
scipy.spatial.distance.hamming(array1, array2) * len(array1)
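To make the relationship between the proportion and the count concrete, here is a minimal pure-Python sketch that mimics the unweighted calculation. The helper name hamming_proportion is hypothetical and is not part of SciPy. The use of round() is a design choice of this sketch: it guards against floating-point error when the proportion is scaled back to a whole number.

```python
def hamming_proportion(x, y):
    """Return the fraction of positions at which x and y differ."""
    if len(x) != len(y):
        raise ValueError("arrays must have the same length")
    if len(x) == 0:
        raise ValueError("arrays must not be empty")
    mismatches = sum(a != b for a, b in zip(x, y))
    return mismatches / len(x)


x = [1, 2, 3, 4]
y = [1, 2, 5, 7]

proportion = hamming_proportion(x, y)

# Scale the proportion back to a count; round() protects against
# small floating-point errors in the multiplication
count = round(proportion * len(x))

print(proportion)  # 0.5
print(count)       # 2
```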
This tutorial provides several examples of how to use this function in practice.
Example 1: Hamming Distance Between Binary Arrays
The following code shows how to calculate the Hamming distance between two arrays that each contain only two possible values:
from scipy.spatial.distance import hamming

#define arrays
x = [0, 1, 1, 1, 0, 1]
y = [0, 0, 1, 1, 0, 0]

#calculate Hamming distance between the two arrays
hamming(x, y) * len(x)

2.0
The Hamming distance between the two arrays is 2.
Example 2: Hamming Distance Between Numerical Arrays
The following code shows how to calculate the Hamming distance between two arrays that each contain several numerical values:
from scipy.spatial.distance import hamming

#define arrays
x = [7, 12, 14, 19, 22]
y = [7, 12, 16, 26, 27]

#calculate Hamming distance between the two arrays
hamming(x, y) * len(x)

3.0
The Hamming distance between the two arrays is 3.
Example 3: Hamming Distance Between String Arrays
The following code shows how to calculate the Hamming distance between two arrays that each contain several character values:
from scipy.spatial.distance import hamming

#define arrays
x = ['a', 'b', 'c', 'd']
y = ['a', 'b', 'c', 'r']

#calculate Hamming distance between the two arrays
hamming(x, y) * len(x)

1.0
The Hamming distance between the two arrays is 1.
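The same idea carries over to two strings of equal length, since a Python string can be treated as a sequence of characters. The following is a minimal sketch that does not use SciPy; the function name string_hamming_distance and the example words are assumptions made for illustration.

```python
def string_hamming_distance(s1, s2):
    """Count the character positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    # zip pairs up the characters position by position
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


# The words differ in the third, fourth, and fifth characters
print(string_hamming_distance("karolin", "kathrin"))  # 3
```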
Additional Resources
How to Calculate Euclidean Distance in Python
How to Calculate Mahalanobis Distance in Python
How to Calculate Jaccard Similarity in Python