A probability distribution tells us the probability that a random variable takes on certain values.
For example, the following probability distribution tells us the probability that a certain soccer team scores a certain number of goals in a given game:
Goals (x)    Probability P(x)
0            0.18
1            0.34
2            0.35
3            0.11
4            0.02
To find the expected value of a probability distribution, we can use the following formula:
μ = Σx * P(x)
where:
- x: A value that the random variable can take
- P(x): The probability of that value
For example, the expected number of goals for the soccer team would be calculated as:
μ = 0*0.18 + 1*0.34 + 2*0.35 + 3*0.11 + 4*0.02 = 1.45 goals.
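The hand calculation above can be reproduced term by term in plain Python. This is only a minimal sketch; the names goals, probs, terms, and mu are illustrative and not part of the tutorial's own code.

```python
import math

# Goals scored and their probabilities, taken from the hand calculation above
goals = [0, 1, 2, 3, 4]
probs = [0.18, 0.34, 0.35, 0.11, 0.02]

# Each term is x * P(x); the expected value is the sum of these terms
terms = [x * p for x, p in zip(goals, probs)]
mu = math.fsum(terms)

print([round(t, 2) for t in terms])
print(round(mu, 2))  # 1.45
```

Listing the individual terms makes it easy to see which outcomes contribute most to the expected value (here, 1 and 2 goals).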
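To calculate the expected value of a probability distribution in Python, we can define a simple function:
import numpy as np

def expected_value(values, weights):
    #convert inputs to NumPy arrays so they can be multiplied element-wise
    values = np.asarray(values)
    weights = np.asarray(weights)
    #divide by the sum of the weights so the result is a weighted mean
    return (values * weights).sum() / weights.sum()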
The following example shows how to use this function in practice.
Example: Calculating Expected Value in Python
The following code shows how to calculate the expected value of a probability distribution using the expected_value() function we defined earlier:
#define values
values = [0, 1, 2, 3, 4]
#define probabilities
probs = [.18, .34, .35, .11, .02]
#calculate expected value
expected_value(values, probs)
1.450000
The expected value is 1.45. This matches the value that we calculated earlier by hand.
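As a cross-check, NumPy's built-in np.average() accepts a weights argument and computes the same weighted mean, sum(values * weights) / sum(weights). The sketch below repeats the expected_value() definition so that it runs on its own and then compares the two results; the names ev_function and ev_average are illustrative.

```python
import numpy as np

def expected_value(values, weights):
    values = np.asarray(values)
    weights = np.asarray(weights)
    return (values * weights).sum() / weights.sum()

values = [0, 1, 2, 3, 4]
probs = [.18, .34, .35, .11, .02]

# Expected value from the function defined in this tutorial
ev_function = expected_value(values, probs)

# Same quantity from NumPy's weighted average
ev_average = np.average(values, weights=probs)

# Compare with a tolerance rather than ==, since these are floating-point results
print(np.isclose(ev_function, ev_average))
```

Comparing with np.isclose() instead of == avoids false mismatches caused by floating-point rounding.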
Note that this function will raise an error if the values array and the probabilities array have different lengths.
For example:
#define values
values = [0, 1, 2, 3, 4]
#define probabilities
probs = [.18, .34, .35, .11, .02, .05, .11]
#attempt to calculate expected value
expected_value(values, probs)
ValueError: operands could not be broadcast together with shapes (5,) (7,)
We receive an error because the length of the first array is 5 while the length of the second array is 7.
In order for this expected value function to work, the length of both arrays must be equal.
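A length mismatch is not the only input problem worth catching. Because expected_value() divides by weights.sum(), it silently rescales probabilities that do not sum to 1. The following is a minimal sketch of a stricter variant; the name checked_expected_value and its tol parameter are hypothetical and not part of the function defined earlier.

```python
import numpy as np

def checked_expected_value(values, probs, tol=1e-8):
    """Hypothetical stricter variant: validate inputs, then return sum(x * P(x))."""
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)

    # Both arrays must be one-dimensional and have the same length
    if values.ndim != 1 or values.shape != probs.shape:
        raise ValueError(f"shape mismatch: values {values.shape}, probs {probs.shape}")

    # Probabilities must be non-negative and sum to 1 within an absolute tolerance
    if (probs < 0).any():
        raise ValueError("probabilities must be non-negative")
    if not np.isclose(probs.sum(), 1.0, rtol=0.0, atol=tol):
        raise ValueError(f"probabilities sum to {probs.sum():.4f}, expected 1")

    # No division by the sum is needed once the probabilities are known to sum to 1
    return float(np.dot(values, probs))

print(round(checked_expected_value([0, 1, 2, 3, 4], [.18, .34, .35, .11, .02]), 2))
```

Whether to reject or rescale probabilities that do not sum to 1 is a design choice: rescaling is convenient for raw counts, while rejecting them catches data-entry mistakes early.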
Additional Resources
The following tutorials explain how to calculate other metrics in Python:
- How to Calculate a Trimmed Mean in Python
- How to Calculate Geometric Mean in Python
- How to Calculate the Standard Error of the Mean in Python