When we’d like to test whether or not a single variable is normally distributed, we can create a Q-Q plot to visualize the distribution, or we can perform a formal statistical test such as an Anderson-Darling Test or a Jarque-Bera Test.
However, when we’d like to test whether or not several variables are jointly normally distributed as a group, we must perform a multivariate normality test.
This tutorial explains how to perform the following multivariate normality tests for a given dataset in R:
- Mardia’s Test
- Energy Test
- Multivariate Kurtosis and Skew Tests
Related: If we’d like to identify outliers in a multivariate setting, we can use the Mahalanobis distance.
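As a brief illustration of the Mahalanobis distance mentioned above, here is a minimal sketch in base R. The dataset (example_data) and the 0.999 chi-square cutoff are assumptions chosen for illustration, not values taken from this tutorial.

```r
set.seed(0)
example_data <- matrix(rnorm(150), ncol = 3)  # assumed example: 50 x 3 normal draws

# Squared Mahalanobis distance of each row from the sample mean
d2 <- mahalanobis(example_data, colMeans(example_data), cov(example_data))

# Under multivariate normality, d2 is approximately chi-square distributed
# with p degrees of freedom, where p is the number of variables
p <- ncol(example_data)
cutoff <- qchisq(0.999, df = p)  # illustrative threshold, not a fixed rule

# Row indices of observations flagged as potential outliers
outliers <- which(d2 > cutoff)
outliers
```

Rows whose squared distance exceeds the cutoff are candidates for a closer look rather than automatic removal.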
Example: Mardia’s Test in R
Mardia’s Test determines whether or not a group of variables follows a multivariate normal distribution. The null and alternative hypotheses for the test are as follows:
H0 (null): The variables follow a multivariate normal distribution.
Ha (alternative): The variables do not follow a multivariate normal distribution.
The following code shows how to perform this test in R using the QuantPsyc package:
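    library(QuantPsyc)

    #create dataset
    #assumed dataset: 50 observations of three independent standard normal variables
    set.seed(0)
    data <- data.frame(x1 = rnorm(50),
                       x2 = rnorm(50),
                       x3 = rnorm(50))

    #perform Multivariate normality test
    mult.norm(data)$mult.test

              Beta-hat      kappa     p-val
    Skewness  1.630474 13.5872843 0.1926626
    Kurtosis 13.895364 -0.7130395 0.4758213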
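The mult.norm() function tests for multivariate normality using both the multivariate skewness and the multivariate kurtosis of the dataset. In the output, Beta-hat is the estimated coefficient, kappa is the corresponding test statistic, and p-val is the p-value. Since neither p-value is less than .05, we fail to reject the null hypothesis of the test. We don’t have sufficient evidence to say that the three variables in our dataset do not follow a multivariate normal distribution.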
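To make the skewness and kurtosis statistics more concrete, the sketch below computes them in base R from their textbook definitions. The helper name mardia_by_hand and the dataset example_data are hypothetical, and the sketch assumes the maximum-likelihood covariance estimate (divisor n). Package implementations may differ in details such as the covariance divisor, so the numbers need not match mult.norm() exactly.

```r
# Mardia's skewness and kurtosis statistics from their textbook definitions.
# mardia_by_hand is a hypothetical helper, not part of any package.
mardia_by_hand <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)

  centered <- scale(x, center = TRUE, scale = FALSE)
  S <- crossprod(centered) / n                # ML covariance estimate (divisor n)
  D <- centered %*% solve(S) %*% t(centered)  # n x n matrix of Mahalanobis cross-products

  b1 <- sum(D^3) / n^2       # multivariate skewness coefficient
  b2 <- sum(diag(D)^2) / n   # multivariate kurtosis coefficient

  # Skewness: n * b1 / 6 is compared to a chi-square distribution
  skew_stat <- n * b1 / 6
  skew_df   <- p * (p + 1) * (p + 2) / 6

  # Kurtosis: standardized b2 is compared to a standard normal distribution
  kurt_stat <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)

  data.frame(
    statistic = c(b1, b2),
    test_stat = c(skew_stat, kurt_stat),
    p_value   = c(pchisq(skew_stat, df = skew_df, lower.tail = FALSE),
                  2 * pnorm(abs(kurt_stat), lower.tail = FALSE)),
    row.names = c("Skewness", "Kurtosis")
  )
}

set.seed(0)
example_data <- matrix(rnorm(150), ncol = 3)  # assumed example: 50 x 3 normal draws
result <- mardia_by_hand(example_data)
print(result)
```

The skewness test is one-sided because the coefficient cannot be negative, while the kurtosis test is two-sided because the kurtosis can be either too high or too low relative to a multivariate normal distribution.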
Example: Energy Test in R
An Energy Test is another statistical test that determines whether or not a group of variables follows a multivariate normal distribution. The null and alternative hypotheses for the test are as follows:
H0 (null): The variables follow a multivariate normal distribution.
Ha (alternative): The variables do not follow a multivariate normal distribution.
The following code shows how to perform this test in R using the energy package:
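    library(energy)

    #create dataset
    #assumed dataset: 50 observations of three independent standard normal variables
    set.seed(0)
    data <- data.frame(x1 = rnorm(50),
                       x2 = rnorm(50),
                       x3 = rnorm(50))

    #perform Multivariate normality test
    mvnorm.etest(data, R=100)

            Energy test of multivariate normality: estimated parameters

    data:  x, sample size 50, dimension 3, replicates 100
    E-statistic = 0.90923, p-value = 0.31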
The p-value of the test is 0.31. Since this is not less than .05, we fail to reject the null hypothesis of the test. We don’t have sufficient evidence to say that the three variables in our dataset do not follow a multivariate normal distribution.
Note: The argument R=100 specifies that 100 bootstrapped replicates are used when performing the test. The p-value is estimated from these replicates, so increasing R produces a more precise p-value at the cost of a longer run time.
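The sketch below shows one way to run the energy test with more replicates and to pull individual values out of the result. It assumes the energy package is installed. The two datasets (normal_data and skewed_data) and the choice of R = 1000 are illustrative assumptions, not part of the example above.

```r
library(energy)

set.seed(0)
normal_data <- matrix(rnorm(150), ncol = 3)  # assumed example: 50 x 3 normal draws
skewed_data <- matrix(rexp(150), ncol = 3)   # assumed example: 50 x 3 exponential draws

# Run the test on each dataset with 1000 bootstrap replicates
normal_test <- mvnorm.etest(normal_data, R = 1000)
skewed_test <- mvnorm.etest(skewed_data, R = 1000)

# Each result is a standard "htest" object, so values can be extracted by name
normal_test$statistic
normal_test$p.value
skewed_test$p.value
```

Comparing the two p-values gives a feel for how the test responds to clearly non-normal data such as the exponential draws.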