Continuous Multivariate distributions

Multivariate Gaussian, a.k.a. Multivariate Normal, distribution

  • Story. This is a generalization of the univariate Gaussian.
  • Example. Finch beaks are measured for beak depth and beak length. The depths and lengths are jointly Gaussian distributed. In this case, the Gaussian is bivariate, with \(\boldsymbol{\mu} = (\mu_\mathrm{d}, \mu_\mathrm{l})\) and

    \(\begin{align}
    \mathsf{\Sigma} = \begin{pmatrix}
    \sigma_\mathrm{d}^2 & \sigma_\mathrm{dl} \\
    \sigma_\mathrm{dl} & \sigma_\mathrm{l}^2
    \end{pmatrix}.
    \end{align}\)

  • Parameters. There is one vector-valued parameter, the mean \(\boldsymbol{\mu}\) of length \(K\), and one matrix-valued parameter, the \(K \times K\) covariance matrix \(\mathsf{\Sigma}\). The covariance matrix is symmetric and strictly positive definite.
  • Support. The K-variate Gaussian distribution is supported on \(\mathbb{R}^K\).
  • Probability density function.

    \(\begin{align}
    f(\mathbf{y};\boldsymbol{\mu}, \mathsf{\Sigma}) = \frac{1}{\sqrt{(2\pi)^K \det \mathsf{\Sigma}}}\,\mathrm{exp}\left[-\frac{1}{2}(\mathbf{y} - \boldsymbol{\mu})^T \cdot \mathsf{\Sigma}^{-1}\cdot (\mathbf{y} - \boldsymbol{\mu})\right].
    \end{align}\)
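
    The density above can be evaluated directly and checked against SciPy. The following is a minimal sketch, not a definitive implementation: the helper name mvn_pdf and the numerical values of mu, Sigma, and y are made up for illustration.

```python
import numpy as np
import scipy.stats as st


def mvn_pdf(y, mu, Sigma):
    """Evaluate the K-variate Gaussian PDF at a single point y.

    Hypothetical helper written for illustration only.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    K = len(mu)
    diff = y - mu

    # Quadratic form (y - mu)^T Sigma^{-1} (y - mu), computed with a
    # linear solve instead of an explicit matrix inverse
    quad = diff @ np.linalg.solve(Sigma, diff)

    # Normalization constant sqrt((2 pi)^K det(Sigma))
    norm = np.sqrt((2 * np.pi) ** K * np.linalg.det(Sigma))

    return np.exp(-quad / 2) / norm


# Illustrative (made-up) parameter values
mu = np.array([9.0, 11.0])
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
y = np.array([9.5, 10.0])

pdf_by_hand = mvn_pdf(y, mu, Sigma)
pdf_scipy = st.multivariate_normal(mu, Sigma).pdf(y)
```

    The two values should agree to floating-point precision. For large K, working with the log density and a Cholesky factor is usually more stable than computing the determinant directly.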


  • Usage. The usage below assumes that mu is a length K array, Sigma is a K×K symmetric positive definite matrix, and L is a K×K lower-triangular matrix with strictly positive values on the diagonal that is the Cholesky factor of Sigma.

    Package          Syntax
    NumPy            np.random.multivariate_normal(mu, Sigma)
    SciPy            scipy.stats.multivariate_normal(mu, Sigma)
    Stan             multi_normal(mu, Sigma)
    NumPy Cholesky   np.random.multivariate_normal(mu, np.dot(L, L.T))
    SciPy Cholesky   scipy.stats.multivariate_normal(mu, np.dot(L, L.T))
    Stan Cholesky    multi_normal_cholesky(mu, L)
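
    As a rough illustration of the Cholesky parametrization, the sketch below draws samples in two ways: directly from mu and Sigma, and by transforming standard Normal draws with L. It uses NumPy's Generator interface (np.random.default_rng) rather than the legacy np.random function; the seed and the values of mu and L are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(3252)

# Illustrative (made-up) parameters: L is lower triangular with a
# strictly positive diagonal, so Sigma = L L^T is positive definite
mu = np.array([9.0, 11.0])
L = np.array([[1.0, 0.0], [0.5, 1.2]])
Sigma = L @ L.T

n_samples = 100_000

# Direct draws from the mean and covariance matrix
samples = rng.multivariate_normal(mu, Sigma, size=n_samples)

# Equivalent draws via the Cholesky factor: if z is standard Normal,
# then mu + L z has mean mu and covariance L L^T
z = rng.standard_normal(size=(n_samples, len(mu)))
samples_chol = mu + z @ L.T

# Empirical moments of the Cholesky-based draws
sample_mean = samples_chol.mean(axis=0)
sample_cov = np.cov(samples_chol, rowvar=False)
```

    With this many samples, sample_mean and sample_cov should be close to mu and Sigma, respectively, for both sets of draws.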


  • Related distributions.
    • The Multivariate Gaussian is a generalization of the univariate Gaussian.
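  • Notes.
    • The covariance matrix may also be written as \(\mathsf{\Sigma} = \mathsf{S} \cdot \mathsf{C} \cdot \mathsf{S}\), where \(\mathsf{S}\) is the diagonal matrix of standard deviations, with diagonal entries

      \(\sigma_i = \sqrt{\Sigma_{ii}}\),

      and entry \(i\), \(j\) in the correlation matrix \(\mathsf{C}\) is

      \(C_{ij} = \sigma_{ij} / (\sigma_i \sigma_j)\).

      Furthermore, because \(\mathsf{\Sigma}\) is symmetric and strictly positive definite, it has a unique Cholesky factor \(\mathsf{L}\), a lower-triangular matrix with strictly positive diagonal entries that satisfies \(\mathsf{\Sigma} = \mathsf{L} \cdot \mathsf{L}^T\). In practice, you will almost always use the Cholesky representation of the Multivariate Gaussian distribution in Stan.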
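
    The two decompositions in these notes can be checked numerically. The snippet below is a small sketch with a made-up covariance matrix: it builds S and C from Sigma, computes the Cholesky factor with np.linalg.cholesky (which returns the lower-triangular factor), and reconstructs Sigma both ways.

```python
import numpy as np

# Illustrative (made-up) covariance matrix
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])

# Standard deviations are the square roots of the diagonal of Sigma
sigma = np.sqrt(np.diag(Sigma))

# S is the diagonal matrix of standard deviations
S = np.diag(sigma)

# Correlation matrix: C_ij = sigma_ij / (sigma_i * sigma_j)
C = Sigma / np.outer(sigma, sigma)

# Lower-triangular Cholesky factor, Sigma = L L^T
L = np.linalg.cholesky(Sigma)

# Reconstruct Sigma from each representation
Sigma_from_SCS = S @ C @ S
Sigma_from_chol = L @ L.T
```

    Both Sigma_from_SCS and Sigma_from_chol should match Sigma to floating-point precision, and C should have ones on its diagonal.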