In probability theory and statistics, a multivariate normal distribution, also sometimes called a multivariate Gaussian distribution (in honor of Carl Friedrich Gauss, who was not the first to write about the normal distribution), is a specific probability distribution that generalizes the one-dimensional normal distribution to random vectors.
General case
A random vector $X = (X_1, \dots, X_N)^T$ follows a multivariate normal distribution if it satisfies the following equivalent conditions:
- there is a random vector $Z = (Z_1, \dots, Z_M)^T$, whose components are independent standard normal random variables, a vector $\mu = (\mu_1, \dots, \mu_N)^T$ and an $N \times M$ matrix A such that X = AZ + μ.
The following is not quite equivalent to the conditions above, since it does not allow the covariance matrix to be singular:
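- there is a vector $\mu = (\mu_1, \dots, \mu_N)^T$ and a symmetric, positive definite $N \times N$ matrix Σ such that X has density
$$f_X(x) = \frac{1}{(2\pi)^{N/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$$
where N is the dimension of X and $|\Sigma|$ is the determinant of Σ.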
Note how the equation above reduces to that of the univariate normal distribution if Σ is a scalar (i.e., a real number).
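As a worked instance of this reduction, assume the density has the usual form $f_X(x) = (2\pi)^{-N/2}\,|\Sigma|^{-1/2} \exp\left(-\tfrac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$. Taking N = 1 and writing $\Sigma = \sigma^2$ (the symbol σ is introduced here only for illustration), the determinant is $\sigma^2$ and the inverse is $1/\sigma^2$, so
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),$$
which is the univariate normal density with mean μ and variance $\sigma^2$.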
The vector μ in these conditions is the expected value of X and the matrix $\Sigma = AA^T$ is the covariance matrix of the components $X_i$.
It is important to realize that the covariance matrix must be allowed to be singular.
That case arises frequently in statistics; for example, in the distribution of the vector of residuals in ordinary linear regression problems.
Note also that the $X_i$ are in general not independent; they can be seen as the result of applying the linear transformation A to a collection of independent Gaussian variables Z.
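The singular case mentioned for regression residuals can be made concrete with a small numerical sketch. It assumes the standard linear model with independent errors of unit variance, in which case the residual vector equals (I − H) times the error vector, where H is the hat matrix; in the notation above this corresponds to A = I − H and μ = 0. The design matrix and all variable names below are illustrative and do not come from the text.

```python
import numpy as np

# Hypothetical design matrix: 6 observations, an intercept and one regressor.
n_obs = 6
design = np.column_stack([np.ones(n_obs), np.arange(n_obs, dtype=float)])

# Hat matrix H: orthogonal projection onto the column space of the design.
hat = design @ np.linalg.solve(design.T @ design, design.T)

# Residuals are e = A z with A = I - H and z the (standard normal) errors,
# so their covariance is A A^T, which equals I - H because A is a projection.
A = np.eye(n_obs) - hat
resid_cov = A @ A.T

# The covariance is 6 x 6 but its rank is only 6 - 2 = 4, so it is singular.
rank = np.linalg.matrix_rank(resid_cov, tol=1e-10)
smallest_eig = np.linalg.eigvalsh(resid_cov).min()
print("dimension:", n_obs, "rank:", rank)
```

Because the rank is smaller than the dimension, this covariance matrix has no inverse and the residual vector has no density of the form given above, even though it is still multivariate normal in the sense of X = AZ + μ.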
Bivariate case
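In the 2-dimensional nonsingular case, the probability density function of the vector (X, Y) is
$$f(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_X)^2}{\sigma_X^2} + \frac{(y-\mu_Y)^2}{\sigma_Y^2} - \frac{2\rho(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y}\right]\right)$$
where ρ is the correlation between X and Y, $\mu_X$ and $\mu_Y$ are their means, and $\sigma_X$ and $\sigma_Y$ are their standard deviations.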
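One way to check the bivariate formula is to compare it numerically with the general density evaluated at the covariance matrix whose diagonal entries are the two variances and whose off-diagonal entries are ρ times the product of the standard deviations. The sketch below assumes the usual form of both densities; the function names and parameter values are made up for illustration.

```python
import numpy as np

def bivariate_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Bivariate normal density written out in terms of the correlation rho."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    quad = (zx**2 + zy**2 - 2.0 * rho * zx * zy) / (1.0 - rho**2)
    norm = 2.0 * np.pi * sigma_x * sigma_y * np.sqrt(1.0 - rho**2)
    return np.exp(-0.5 * quad) / norm

def general_pdf(point, mean, cov):
    """General multivariate normal density for a nonsingular covariance."""
    point = np.asarray(point, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    diff = point - mean
    # Solve cov @ v = diff instead of forming the inverse explicitly.
    quad = diff @ np.linalg.solve(cov, diff)
    norm = np.sqrt((2.0 * np.pi) ** mean.size * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

# Illustrative parameter values (arbitrary).
mu_x, mu_y, sigma_x, sigma_y, rho = 1.0, -2.0, 1.5, 0.5, 0.3
cov = np.array([[sigma_x**2, rho * sigma_x * sigma_y],
                [rho * sigma_x * sigma_y, sigma_y**2]])

p_biv = bivariate_pdf(0.2, -1.7, mu_x, mu_y, sigma_x, sigma_y, rho)
p_gen = general_pdf([0.2, -1.7], [mu_x, mu_y], cov)
print(np.isclose(p_biv, p_gen))
```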
Linear transformation
If Y = BX is a linear transformation of X, where B is a $K \times N$ matrix and N is the dimension of X, then Y has a multivariate normal distribution with expected value Bμ and covariance matrix $B\Sigma B^T$ (i.e., $Y \sim N(B\mu, B\Sigma B^T)$).
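Corollary: any subset of the $X_i$ has a marginal distribution that is also multivariate normal.
To see this, consider the following example: to extract the subset $(X_1, X_2, X_4)^T$, use
$$B = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}$$
which extracts the desired elements directly.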
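The selection-matrix argument can be sketched numerically. The example below assumes a 5-dimensional X with arbitrary illustrative parameters, builds the matrix B that picks out the first, second and fourth components, and checks that Bμ and BΣB^T agree with simply indexing into μ and Σ.

```python
import numpy as np

# Illustrative parameters for a 5-dimensional X (values are arbitrary).
mu = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
root = np.arange(25, dtype=float).reshape(5, 5) / 10.0 + np.eye(5)
sigma = root @ root.T  # symmetric positive semi-definite by construction

# Selection matrix B picking (X1, X2, X4); indices are zero-based in code.
keep = [0, 1, 3]
B = np.eye(5)[keep]

# Parameters of the marginal distribution via the linear transformation rule.
marg_mean = B @ mu
marg_cov = B @ sigma @ B.T

# The same parameters obtained by plain indexing into mu and sigma.
same_mean = np.allclose(marg_mean, mu[keep])
same_cov = np.allclose(marg_cov, sigma[np.ix_(keep, keep)])
print(same_mean, same_cov)
```

In practice the indexing form is the cheaper one; the matrix form is shown only because it mirrors the argument in the text.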
Generating values drawn from the distribution
To generate values from a multivariate normal distribution given μ and A such that X = AZ + μ as detailed above, generate a vector Z of independent standard normal values, using for example the Box-Muller transform, and apply the foregoing equation.
Given only the covariance matrix Σ, one can obtain a suitable A, that is, one satisfying $AA^T = \Sigma$, using Cholesky decomposition.
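The procedure can be sketched as follows, assuming a positive definite covariance matrix so that NumPy's `numpy.linalg.cholesky` applies. The standard normal values come from NumPy's random generator rather than a hand-written Box-Muller transform, and the function name `sample_mvn` and the parameter values are illustrative.

```python
import numpy as np

def sample_mvn(mu, sigma, size, rng):
    """Draw `size` samples of X = A Z + mu, with A the Cholesky factor of sigma."""
    mu = np.asarray(mu, dtype=float)
    A = np.linalg.cholesky(sigma)             # lower triangular, A @ A.T == sigma
    Z = rng.standard_normal((size, mu.size))  # each row is a vector of iid N(0, 1)
    return Z @ A.T + mu                       # each row equals A z + mu

# Illustrative mean vector and positive definite covariance matrix.
mu = np.array([1.0, -1.0, 0.5])
sigma = np.array([[2.0, 0.6, 0.2],
                  [0.6, 1.0, 0.3],
                  [0.2, 0.3, 0.5]])

rng = np.random.default_rng(0)
samples = sample_mvn(mu, sigma, 200_000, rng)

print(samples.mean(axis=0))           # should be close to mu
print(np.cov(samples, rowvar=False))  # should be close to sigma
```

Samples are stored as rows, which is why the code multiplies by the transpose of A on the right. A singular covariance matrix would need a different factorization, which this sketch does not cover.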
Conditional distributions
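If X, μ and Σ are partitioned as follows
$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \quad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} \quad \text{with sizes} \quad \begin{bmatrix} q \times 1 \\ (N-q) \times 1 \end{bmatrix}$$
$$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \quad \text{with sizes} \quad \begin{bmatrix} q \times q & q \times (N-q) \\ (N-q) \times q & (N-q) \times (N-q) \end{bmatrix}$$
then the distribution of $X_1$ conditional on $X_2 = a$ is multivariate normal, $(X_1 \mid X_2 = a) \sim N(\bar{\mu}, \bar{\Sigma})$, where
$$\bar{\mu} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (a - \mu_2)$$
and covariance matrix
$$\bar{\Sigma} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}.$$
This matrix is the Schur complement of $\Sigma_{22}$ in Σ.
Note that knowing the value of $X_2$ to be a alters the variance; perhaps more surprisingly, the mean is shifted by $\Sigma_{12} \Sigma_{22}^{-1} (a - \mu_2)$; compare this with the situation of not knowing the value of a, in which case $X_1$ would have distribution $N(\mu_1, \Sigma_{11})$.
The matrix $\Sigma_{12} \Sigma_{22}^{-1}$ is known as the matrix of regression coefficients.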
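The conditional mean and covariance can be computed directly from the partitioned blocks. The sketch below assumes the block of Σ belonging to the conditioning variables is invertible and that the first q components of X form the part being conditioned; the function name `condition_mvn` and the example numbers are illustrative.

```python
import numpy as np

def condition_mvn(mu, sigma, q, a):
    """Mean and covariance of X1 | X2 = a, where X1 is the first q components."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    a = np.asarray(a, dtype=float)
    mu1, mu2 = mu[:q], mu[q:]
    s11, s12 = sigma[:q, :q], sigma[:q, q:]
    s21, s22 = sigma[q:, :q], sigma[q:, q:]
    # Regression coefficients Sigma12 Sigma22^{-1}, obtained with a linear
    # solve on the transposed system instead of an explicit inverse.
    coef = np.linalg.solve(s22.T, s12.T).T
    cond_mean = mu1 + coef @ (a - mu2)
    cond_cov = s11 - coef @ s21  # Schur complement of Sigma22 in Sigma
    return cond_mean, cond_cov

# Bivariate example: variances 1 and 2, covariance 0.5, observe X2 = 2.
mu = [0.0, 0.0]
sigma = [[1.0, 0.5],
         [0.5, 2.0]]
cond_mean, cond_cov = condition_mvn(mu, sigma, q=1, a=[2.0])
print(cond_mean, cond_cov)  # conditional mean 0.5, conditional variance 0.875
```

In this example the regression coefficient is 0.5 / 2 = 0.25, so observing the second component at 2 shifts the mean of the first by 0.5 and reduces its variance from 1 to 0.875.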
Estimation of parameters
The derivation of the maximum-likelihood estimator of the covariance matrix of a multivariate normal distribution is perhaps surprisingly subtle and elegant. See estimation of covariance matrices.