A basic skill in any statistics module is being able to program and plot the probability distributions commonly encountered in data science. This tutorial revisits the distributions introduced in this section, now expressed using the scipy.stats module.
Continuous Multivariate distributions
Lewandowski-Kurowicka-Joe (LKJ) distribution
- Story. Probability distribution for positive definite correlation matrices, or equivalently for their Cholesky factors (which is what we use in practice).
- Parameter. There is one positive scalar parameter, η, which tunes the strength of the correlations. If η = 1, the density is uniform over all correlation matrices. If η > 1, matrices with a stronger diagonal (and therefore smaller correlations) are favored. If 0 < η < 1, the diagonal is weak and larger correlations are favored.
- Support. The LKJ distribution is supported over the set of K×K Cholesky factors of correlation matrices, that is, real symmetric positive definite matrices with unit diagonal, where K is the number of variates.
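- Probability density function. The density over correlation matrices Ω is proportional to det(Ω)^(η−1), where η is the scalar parameter (eta in Stan). The normalization constant is cumbersome, so the full PDF is usually not written out.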
- Usage
Package | Syntax |
---|---|
NumPy | not available |
SciPy | not available |
Stan | lkj_corr_cholesky(eta) |
- Notes.
- The most common use case is as a prior for a covariance matrix. Note that the LKJ distribution gives Cholesky factors for correlation matrices, not covariance matrices. To get the covariance Cholesky factor from the correlation Cholesky factor, pre-multiply the correlation Cholesky factor by a diagonal matrix whose entries are the standard deviations of the individual variates. Here is an example for a multivariate Gaussian in Stan.

        // Assumed data block (not part of the original snippet):
        // N observations of K variates
        data {
          int<lower=1> K;
          int<lower=1> N;
          array[N] vector[K] y;
        }

        parameters {
          // Vector of means
          vector[K] mu;

          // Cholesky factor for the correlation matrix
          cholesky_factor_corr[K] L_Omega;

          // Sqrt of variances (standard deviations) for each variate
          vector<lower=0>[K] L_std;
        }

        model {
          // Cholesky factor for covariance matrix (local variable)
          matrix[K, K] L_Sigma = diag_pre_multiply(L_std, L_Omega);

          // Prior on Cholesky decomposition of correlation matrix
          L_Omega ~ lkj_corr_cholesky(1);

          // Prior on standard deviations for each variate
          L_std ~ normal(0, 2.5);

          // Likelihood
          y ~ multi_normal_cholesky(mu, L_Sigma);
        }
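
Since the Usage table lists no NumPy or SciPy implementation, the following is a minimal NumPy sketch that mirrors the Stan example in Python. It is not a library API: `sample_lkj_cholesky`, `covariance_cholesky`, and `mean_abs_correlation` are hypothetical helper names, and the sampler assumes the C-vine construction of Lewandowski, Kurowicka, and Joe, in which partial correlations are drawn from symmetric Beta distributions rescaled to (-1, 1). The half-normal draw for `L_std` and the choice K = 3 are likewise illustrative assumptions.

```python
import numpy as np


def sample_lkj_cholesky(K, eta, rng):
    """Draw one K x K Cholesky factor of a correlation matrix.

    Sketch of a C-vine construction (an assumption, not a NumPy/SciPy API).
    """
    L = np.zeros((K, K))
    remaining = np.ones(K)  # squared row length not yet assigned
    alpha = eta + 0.5 * (K - 1)
    for i in range(K - 1):
        alpha -= 0.5
        # Partial correlations of variate i with variates i+1, ..., K-1
        z = 2.0 * rng.beta(alpha, alpha, size=K - 1 - i) - 1.0
        L[i, i] = np.sqrt(remaining[i])
        L[i + 1:, i] = z * np.sqrt(remaining[i + 1:])
        remaining[i + 1:] *= 1.0 - z**2
    L[K - 1, K - 1] = np.sqrt(remaining[K - 1])
    return L


def covariance_cholesky(L_std, L_Omega):
    """Scale row i of L_Omega by L_std[i], like diag_pre_multiply in Stan."""
    return L_std[:, None] * L_Omega


rng = np.random.default_rng(0)
K = 3  # illustrative number of variates

# Draw from the priors used in the Stan model
L_Omega = sample_lkj_cholesky(K, eta=1.0, rng=rng)
L_std = np.abs(rng.normal(0.0, 2.5, size=K))  # half-normal standard deviations
L_Sigma = covariance_cholesky(L_std, L_Omega)

Omega = L_Omega @ L_Omega.T  # correlation matrix
Sigma = L_Sigma @ L_Sigma.T  # covariance matrix

# Multivariate Gaussian draws: each row is mu + L_Sigma @ z
mu = np.zeros(K)
y = mu + rng.standard_normal((1000, K)) @ L_Sigma.T


def mean_abs_correlation(eta, n_draws=500):
    """Average absolute off-diagonal correlation over n_draws samples."""
    lower = np.tril_indices(K, -1)
    total = 0.0
    for _ in range(n_draws):
        L = sample_lkj_cholesky(K, eta, rng)
        total += np.abs((L @ L.T)[lower]).mean()
    return total / n_draws


# Larger eta should pull the correlations toward zero
for eta in (0.2, 1.0, 20.0):
    print(eta, mean_abs_correlation(eta))
```

The diagonal of `Sigma` equals `L_std**2`, which is a quick way to check that the scaling uses standard deviations rather than variances.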