The Hamming distance between two vectors is the number of positions at which their corresponding elements differ.
For example, suppose we have the following two vectors:
x = [1, 2, 3, 4]
y = [1, 2, 5, 7]
The Hamming distance between the two vectors is 2, because the elements in the third and fourth positions differ (3 vs. 5 and 4 vs. 7) while the first two positions match.
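The definition can be checked outside of Excel with a minimal Python sketch. The function name `hamming_distance` is chosen here for illustration only, and the sketch assumes both vectors have the same length.

```python
def hamming_distance(x, y):
    """Count the positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # Each comparison yields True (counted as 1) when the pair differs
    return sum(a != b for a, b in zip(x, y))


x = [1, 2, 3, 4]
y = [1, 2, 5, 7]
print(hamming_distance(x, y))  # 2
```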
To calculate the Hamming distance between two columns in Excel, we can use the following syntax:
=COUNT(RANGE1)-SUMPRODUCT(--(RANGE1 = RANGE2))
Here’s what the formula does in a nutshell:
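- COUNT finds the total number of observations in the first column. Note that COUNT only counts cells that contain numbers.
- RANGE1 = RANGE2 compares each pair of corresponding observations in the two columns and returns TRUE or FALSE for each pair.
- -- (a double negative) converts the TRUE and FALSE values to 1 and 0, respectively.
- SUMPRODUCT finds the sum of all 1's, which is the number of matching pairs. Subtracting this from the count leaves the number of pairs that differ.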
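The steps above can be mirrored one by one in a short Python sketch, which shows that "count minus matches" equals the number of differing pairs. The helper name `excel_style_hamming` and the sample lists are illustrative assumptions, not part of the Excel formula, and the sketch assumes two equal-length lists of numbers.

```python
def excel_style_hamming(range1, range2):
    """Mirror =COUNT(RANGE1)-SUMPRODUCT(--(RANGE1 = RANGE2)) step by step."""
    # COUNT(RANGE1): number of observations (assumes every cell is numeric)
    total = len(range1)
    # RANGE1 = RANGE2: one True/False per pair of corresponding cells
    matches = [a == b for a, b in zip(range1, range2)]
    # --(...): coerce True/False to 1/0
    ones_and_zeros = [int(m) for m in matches]
    # SUMPRODUCT(...): add up the 1s, i.e. the number of matching pairs
    matching_pairs = sum(ones_and_zeros)
    # Observations that do not match are the ones that differ
    return total - matching_pairs


print(excel_style_hamming([1, 2, 3, 4], [1, 2, 5, 7]))  # 4 - 2 = 2
```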
This tutorial provides several examples of how to use this calculation in practice.
Example 1: Hamming Distance Between Binary Vectors
The same formula can be used to calculate the Hamming distance between two columns in Excel that each contain only two possible values.
The Hamming distance between the two columns is 3.
Example 2: Hamming Distance Between Numerical Vectors
The same formula can be used to calculate the Hamming distance between two columns in Excel that each contain several numerical values.
The Hamming distance between the two columns is 7.