biology daily - the biology and biochemistry encyclopedia

Multivariate normal distribution

In probability theory and statistics, a multivariate normal distribution, also sometimes called a multivariate Gaussian distribution (in honor of Carl Friedrich Gauss, who was not the first to write about the normal distribution), is a specific probability distribution that generalizes the univariate normal distribution to random vectors.

General case

A random vector X = [X_1, \cdots, X_N] follows a multivariate normal distribution if it satisfies either of the following equivalent conditions:

  • there is a random vector Z = [Z_1, \cdots, Z_M], whose components are independent standard normal random variables, a vector \mu = [\mu_1, \cdots, \mu_N] and an N \times M matrix A such that X = AZ + μ.
  • there is a vector μ and a symmetric, positive semi-definite matrix Γ such that the characteristic function of X is
\phi_X(u) = \exp \left(  i \mu^T u - \frac{1}{2} u^T \Gamma u \right) .
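The equivalence of the two conditions can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using illustrative values for A, μ and u that do not come from the article; it compares the closed-form characteristic function, with Γ = AA^T, against a Monte Carlo estimate of E[exp(i u^T X)] for X = AZ + μ.

```python
import numpy as np

# Illustrative values (assumptions, not from the article); here N = M = 2.
A = np.array([[1.0, 0.0],
              [0.5, 2.0]])
mu = np.array([0.3, -1.0])
gamma = A @ A.T                   # matrix appearing in the characteristic function
u = np.array([0.4, -0.2])

# Closed form: exp(i mu^T u - u^T Gamma u / 2).
phi_exact = np.exp(1j * (mu @ u) - 0.5 * (u @ gamma @ u))

# Monte Carlo estimate of E[exp(i u^T X)] with X = AZ + mu.
rng = np.random.default_rng(7)
Z = rng.standard_normal((2, 400_000))   # columns are independent draws of Z
X = A @ Z + mu[:, None]
phi_mc = np.mean(np.exp(1j * (u @ X)))
```

The two complex numbers should agree up to sampling error, which shrinks roughly like one over the square root of the number of draws.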

The following condition is not quite equivalent to those above, since it does not allow for a singular covariance matrix:

  • there is a vector \mu = [\mu_1, \cdots, \mu_N] and a symmetric, positive definite matrix Σ such that X has density
f_X(x_1, \cdots, x_N) = \frac  {1}  {(2\pi)^{N/2} \left|\Sigma\right|^{1/2}} \exp \left(  -\frac{1}{2}  ( x - \mu)^T \Sigma^{-1} (x - \mu) \right)

where \left| \Sigma \right| is the determinant of Σ. Note how the equation above reduces to that of the univariate normal distribution if Σ is a scalar (i.e., a real number).
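As a rough illustration, the density can be evaluated directly from this formula. The sketch below assumes NumPy; the function name `mvn_pdf` and the test values are hypothetical. It also checks the stated reduction to the univariate normal density when N = 1.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x; sigma must be symmetric positive definite."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    n = mu.size
    diff = x - mu
    # Solve sigma * y = diff instead of forming the inverse explicitly.
    quad = diff @ np.linalg.solve(sigma, diff)
    norm_const = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)

# With N = 1 the formula reduces to the univariate normal density.
x, m, s = 0.7, 0.2, 1.5  # s is the standard deviation, so Sigma = s**2
univariate = np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
multivariate = mvn_pdf([x], [m], [[s ** 2]])
```

Using a linear solve rather than an explicit inverse is a numerical design choice, not something the formula requires.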

The vector μ in these conditions is the expected value of X, and the matrix Σ = AA^T (written Γ in the characteristic function above) is the covariance matrix of the components X_i.

It is important to realize that the covariance matrix must be allowed to be singular. That case arises frequently in statistics; for example, in the distribution of the vector of residuals in ordinary linear regression problems. Note also that the X_i are in general not independent; they can be seen as the result of applying the linear transformation A to a collection of independent Gaussian variables Z.
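A small singular case may make this concrete. In the sketch below (NumPy assumed; the matrix A and vector μ are hypothetical), two components are driven by a single standard normal variable, so AA^T has rank 1 and no density of the form given above exists, yet X = AZ + μ is still multivariate normal under the first two conditions.

```python
import numpy as np

# Hypothetical example: N = 2 components driven by a single (M = 1) standard normal.
A = np.array([[1.0], [2.0]])
mu = np.array([0.0, 1.0])

sigma = A @ A.T                      # covariance matrix AA^T = [[1, 2], [2, 4]]
rank = np.linalg.matrix_rank(sigma)  # 1 < 2, so sigma is singular
det = np.linalg.det(sigma)           # numerically zero, so sigma has no inverse

rng = np.random.default_rng(0)
Z = rng.standard_normal((1, 1000))
X = A @ Z + mu[:, None]              # each column is one draw of X = AZ + mu

# Every draw lies on the line x2 - 1 = 2 * x1, so X_1 and X_2 are fully dependent.
on_line = np.allclose(X[1] - 1.0, 2.0 * X[0])
```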

Bivariate case

In the 2-dimensional nonsingular case with zero means, the probability density function is

f(x,y) = \frac{1}{2 \pi \sigma_x \sigma_y \sqrt{1-\rho^2}} \exp \left(  -\frac{1}{2 (1-\rho^2)}  \left(   \frac{x^2}{\sigma_x^2} +   \frac{y^2}{\sigma_y^2} -   \frac{2 \rho x y}{ (\sigma_x \sigma_y)}  \right) \right)

where ρ is the correlation between X and Y, and σ_x and σ_y are their standard deviations.
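The bivariate formula is the general density with zero mean and covariance matrix [[σ_x^2, ρσ_xσ_y], [ρσ_xσ_y, σ_y^2]]. The following sketch, assuming NumPy and using illustrative parameter values and hypothetical function names, evaluates both forms at one point so they can be compared.

```python
import numpy as np

def bivariate_pdf(x, y, sx, sy, rho):
    """Zero-mean bivariate normal density in the explicit two-variable form."""
    q = x**2 / sx**2 + y**2 / sy**2 - 2.0 * rho * x * y / (sx * sy)
    return np.exp(-q / (2.0 * (1.0 - rho**2))) / (
        2.0 * np.pi * sx * sy * np.sqrt(1.0 - rho**2)
    )

def general_pdf(v, sigma):
    """Zero-mean N-dimensional density from the general case."""
    n = len(v)
    quad = v @ np.linalg.solve(sigma, v)
    norm_const = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * quad) / norm_const

sx, sy, rho = 1.5, 0.8, 0.6   # illustrative values, not from the article
sigma = np.array([[sx**2, rho * sx * sy],
                  [rho * sx * sy, sy**2]])
v = np.array([0.4, -0.3])

p_bivariate = bivariate_pdf(v[0], v[1], sx, sy, rho)
p_general = general_pdf(v, sigma)
```

The two values should agree to floating-point precision, since the determinant of this Σ is σ_x^2 σ_y^2 (1 − ρ^2).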

Linear transformation

If Y = BX is a linear transformation of X, where B is an m \times N matrix, then Y has a multivariate normal distribution with expected value Bμ and covariance matrix BΣB^T, i.e., Y ~ N \left(B \mu, B \Sigma B^T\right).

Corollary: any subset of the X_i has a marginal distribution that is also multivariate normal. To see this, consider the following example: to extract the subset (X_1, X_2, X_4)^T, use

B = \begin{bmatrix}  1 & 0 & 0 & 0 & 0 & \ldots & 0 \\  0 & 1 & 0 & 0 & 0 & \ldots & 0 \\  0 & 0 & 0 & 1 & 0 & \ldots & 0 \end{bmatrix}

which extracts the desired elements directly.
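The marginal parameters can be computed either through the selection matrix or by indexing directly. The sketch below assumes NumPy and a hypothetical five-dimensional μ and Σ; it builds B for (X_1, X_2, X_4)^T and compares Bμ and BΣB^T with the corresponding entries of μ and Σ.

```python
import numpy as np

mu = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# Build an arbitrary symmetric positive definite covariance matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
sigma = M @ M.T + 5.0 * np.eye(5)

# Selection matrix B that extracts (X_1, X_2, X_4)^T; indices are 0-based in code.
keep = [0, 1, 3]
B = np.zeros((len(keep), mu.size))
B[np.arange(len(keep)), keep] = 1.0

mean_Y = B @ mu            # B mu
cov_Y = B @ sigma @ B.T    # B Sigma B^T

# Same result by picking out the corresponding entries directly.
mean_direct = mu[keep]
cov_direct = sigma[np.ix_(keep, keep)]
```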

Generating values drawn from the distribution

To generate values from a multivariate normal distribution given μ and A such that X = AZ + μ as detailed above, simply generate a suitable vector of independent standard normal values Z using for example the Box-Muller transform, and apply the foregoing equation.

Given only a positive definite covariance matrix Σ, one can obtain a suitable A satisfying AA^T = Σ using the Cholesky decomposition.
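The procedure can be sketched as follows. This example assumes NumPy and draws the standard normal values with NumPy's own generator rather than the Box-Muller transform; the function name `sample_mvn` and the parameter values are hypothetical.

```python
import numpy as np

def sample_mvn(mu, sigma, n_samples, rng):
    """Draw n_samples rows from N(mu, sigma); sigma must be positive definite."""
    mu = np.asarray(mu, dtype=float)
    A = np.linalg.cholesky(sigma)          # lower triangular, A @ A.T == sigma
    Z = rng.standard_normal((mu.size, n_samples))
    return (A @ Z).T + mu                  # each row is one draw of AZ + mu

mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
rng = np.random.default_rng(42)
samples = sample_mvn(mu, sigma, 200_000, rng)

# Empirical moments should be close to mu and sigma for a large sample.
sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples, rowvar=False)
```

`np.linalg.cholesky` raises an error when the matrix is not positive definite, so a singular covariance matrix would need a different factorization.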

Conditional distributions

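If X is partitioned into X_1 (the first q components) and X_2 (the remaining N − q components), and μ and Σ are partitioned accordingly as follows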

\mu = \begin{bmatrix}  \mu_1 \\  \mu_2 \end{bmatrix} \quad \text{with sizes} \quad \begin{bmatrix} q \times 1 \\ (N-q) \times 1 \end{bmatrix}
\Sigma = \begin{bmatrix}  \Sigma_{11} & \Sigma_{12} \\  \Sigma_{21} & \Sigma_{22} \end{bmatrix} \quad \text{with sizes} \quad \begin{bmatrix} q \times q & q \times (N-q) \\ (N-q) \times q & (N-q) \times (N-q) \end{bmatrix}

then the distribution of X_1 conditional on X_2 = a is multivariate normal, X_1 | X_2 = a ~ N(\bar{\mu}, \overline{\Sigma}), with mean

\bar{\mu} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} \left(  a - \mu_2 \right)

and covariance matrix

\overline{\Sigma} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}.

This matrix is the Schur complement of \Sigma_{22} in \Sigma.

Note that knowing that X_2 = a alters the covariance matrix; perhaps more surprisingly, the mean is shifted by \Sigma_{12} \Sigma_{22}^{-1} \left(a - \mu_2 \right); compare this with the situation of not knowing the value of X_2, in which case X_1 would have distribution N_q \left(\mu_1, \Sigma_{11} \right).

The matrix \Sigma_{12} \Sigma_{22}^{-1} is known as the matrix of regression coefficients.
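The conditional mean and covariance can be computed directly from the partitioned blocks. The following is a minimal sketch, assuming NumPy and a nonsingular Σ_22; the function name `conditional_mvn` and the small numerical example are hypothetical.

```python
import numpy as np

def conditional_mvn(mu, sigma, q, a):
    """Mean and covariance of X_1 | X_2 = a, where X_1 is the first q components."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    a = np.asarray(a, dtype=float)
    mu1, mu2 = mu[:q], mu[q:]
    s11, s12 = sigma[:q, :q], sigma[:q, q:]
    s21, s22 = sigma[q:, :q], sigma[q:, q:]
    # Regression coefficients Sigma_12 Sigma_22^{-1}, computed via a linear solve.
    coef = np.linalg.solve(s22.T, s12.T).T
    cond_mean = mu1 + coef @ (a - mu2)
    cond_cov = s11 - coef @ s21            # Schur complement of Sigma_22
    return cond_mean, cond_cov

mu = np.array([0.0, 0.0])
sigma = np.array([[2.0, 1.0],
                  [1.0, 1.0]])
cond_mean, cond_cov = conditional_mvn(mu, sigma, q=1, a=[1.0])
# cond_mean is [1.0] and cond_cov is [[1.0]]
```

In this example, observing X_2 = 1 moves the mean of X_1 from 0 to 1 and reduces its variance from 2 to 1; the conditional covariance does not depend on the observed value a.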

Estimation of parameters

The derivation of the maximum-likelihood estimator of the covariance matrix of a multivariate normal distribution is perhaps surprisingly subtle and elegant. See estimation of covariance matrices.
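The derivation is not reproduced here, but the resulting estimators have a standard closed form: the sample mean, and the average of the outer products of the centered observations, dividing by n rather than n − 1. The sketch below assumes NumPy; the function name `mle_mvn` and the three data points are hypothetical.

```python
import numpy as np

def mle_mvn(samples):
    """Maximum-likelihood estimates of mu and Sigma from the rows of samples."""
    samples = np.asarray(samples, dtype=float)
    n = samples.shape[0]
    mu_hat = samples.mean(axis=0)
    centered = samples - mu_hat
    sigma_hat = centered.T @ centered / n   # divides by n, not n - 1
    return mu_hat, sigma_hat

data = np.array([[1.0, 2.0],
                 [3.0, 0.0],
                 [5.0, 4.0]])
mu_hat, sigma_hat = mle_mvn(data)
# mu_hat is [3, 2]; sigma_hat is [[8/3, 4/3], [4/3, 8/3]]
```

The same covariance estimate is returned by `np.cov(data, rowvar=False, bias=True)`.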



07-14-2008 23:18:10
The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License.
BiologyDaily.com 2005.