In any module that deals with statistics, a basic skill is being able to program and plot the probability distributions commonly encountered in data science. This tutorial revisits the distributions introduced in this section, now expressed using the scipy.stats module.
Continuous distributions
Log-Normal distribution
- Story. If $\ln y$ is Gaussian distributed, $y$ is Log-Normally distributed.
- Example. A measure of fold change in gene expression can be Log-Normally distributed.
- Parameters. As for a Gaussian, there are two parameters, the mean, $\mu$, and the variance, $\sigma^2$. Both describe the underlying Gaussian, that is, the distribution of $\ln y$.
- Support. The Log-Normal distribution is supported on the set of positive real numbers.
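- Probability density function.
  $$f(y;\mu,\sigma) = \frac{1}{y\sqrt{2\pi\sigma^2}}\,\exp\left[-\frac{(\ln y - \mu)^2}{2\sigma^2}\right]$$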
- Usage
Package | Syntax |
---|---|
NumPy | np.random.lognormal(mu, sigma) |
SciPy | scipy.stats.lognorm(sigma, loc=0, scale=np.exp(mu)) |
Stan | lognormal(mu, sigma) |
- Notes.
  - Be careful not to get confused. The Log-Normal distribution describes the distribution of $y$ given that $\ln y$ is Gaussian distributed. It does not describe the distribution of $\ln y$.
  - The way location, scale, and shape parameters work in SciPy for the Log-Normal distribution is confusing. If you want to specify a Log-Normal distribution as we have defined it using scipy.stats, use a shape parameter equal to $\sigma$, a location parameter of zero, and a scale parameter given by $\mathrm{e}^{\mu}$. For example, to compute the PDF, you would use scipy.stats.lognorm.pdf(x, sigma, loc=0, scale=np.exp(mu)).
  - The definition of the Log-Normal in the numpy.random module matches what we have defined above and what is defined in Stan.
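The SciPy parametrization described in the notes can be checked numerically. The following is a minimal sketch, not part of the original tutorial: the values `mu = 0.1` and `sigma = 0.2` are illustrative assumptions, and `lognormal_pdf` is a hypothetical helper that writes out the standard Log-Normal density by hand so it can be compared with `scipy.stats.lognorm`.

```python
import numpy as np
import scipy.stats as st

# Illustrative parameter values (assumptions made for this sketch)
mu, sigma = 0.1, 0.2


def lognormal_pdf(y, mu, sigma):
    """Log-Normal density written out directly (hypothetical helper)."""
    return np.exp(-(np.log(y) - mu) ** 2 / (2 * sigma ** 2)) / (
        y * sigma * np.sqrt(2 * np.pi)
    )


# Evaluate on strictly positive values, since the support is y > 0
y = np.linspace(0.1, 4.0, 200)

pdf_by_hand = lognormal_pdf(y, mu, sigma)

# SciPy: shape = sigma, loc = 0, scale = exp(mu)
pdf_scipy = st.lognorm.pdf(y, sigma, loc=0, scale=np.exp(mu))

# A "frozen" distribution takes only the parameters, not the evaluation points
frozen = st.lognorm(sigma, loc=0, scale=np.exp(mu))
pdf_frozen = frozen.pdf(y)

# Largest pointwise disagreement between the two ways of computing the PDF
max_abs_diff = np.max(np.abs(pdf_by_hand - pdf_scipy))
print(max_abs_diff)
```

Note that the evaluation points go to the `.pdf` method, while the frozen object `scipy.stats.lognorm(sigma, loc=0, scale=np.exp(mu))` is built from the parameters alone.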
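import numpy as np
import scipy.stats as st
import bokeh.io

# distribution_plot_app and notebook_url are assumed to be defined earlier in the tutorial.
params = [dict(name='µ', start=0.01, end=0.5, value=0.1, step=0.01),
          dict(name='σ', start=0.1, end=1.0, value=0.2, step=0.01)]
app = distribution_plot_app(x_min=0,
                            x_max=4,
                            scipy_dist=st.lognorm,
                            params=params,
                            # Map (mu, sigma) to SciPy's (shape, loc, scale)
                            transform=lambda mu, sigma: (sigma, 0, np.exp(mu)),
                            x_axis_label='y',
                            title='Log-Normal')
bokeh.io.show(app, notebook_url=notebook_url)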
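As a non-interactive complement to the app, the relationship between the NumPy sampler and the SciPy parametrization can also be checked by drawing samples. This is a rough sketch under stated assumptions: the parameter values, the seed, and the sample size are arbitrary choices made here, and NumPy's `Generator.lognormal` is used in place of the legacy `np.random.lognormal` call, which takes the same `mean` and `sigma` arguments.

```python
import numpy as np
import scipy.stats as st

# Illustrative values and seed (assumptions made for this sketch)
mu, sigma = 0.1, 0.2
rng = np.random.default_rng(seed=3252)

# Draw Log-Normal samples; mean and sigma refer to the underlying Gaussian
samples = rng.lognormal(mean=mu, sigma=sigma, size=100_000)

# Taking the log should recover a Gaussian with mean mu and std sigma
log_samples = np.log(samples)
log_mean = log_samples.mean()
log_std = log_samples.std()

# The median of the SciPy distribution is its scale parameter, exp(mu)
dist = st.lognorm(sigma, loc=0, scale=np.exp(mu))
scipy_median = dist.median()
sample_median = np.median(samples)

print(f"mean of ln y:  {log_mean:.3f} (mu = {mu})")
print(f"std of ln y:   {log_std:.3f} (sigma = {sigma})")
print(f"sample median: {sample_median:.3f}, SciPy median: {scipy_median:.3f}")
```

The sample statistics of ln y should land close to mu and sigma, not exactly on them, because they are estimated from a finite number of draws.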