In any module that deals with statistics, one basic skill you need is the ability to program and plot the probability distributions commonly encountered in data science. This tutorial revisits the distributions introduced in this section, now expressed using the scipy.stats module.
Continuous distributions
Gamma distribution
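- Story. The amount of time we have to wait for $\alpha$ arrivals of a Poisson process. More concretely, if we have inter-arrival times $X_1$, $X_2$, …, $X_\alpha$ that are independent and exponentially distributed with the same rate, their sum $X_1 + X_2 + \cdots + X_\alpha$ is Gamma distributed.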
- Example. Any multistep process where each step happens at the same rate. This is common in molecular rearrangements.
- Parameters. The number of arrivals, $\alpha$, and the rate of arrivals, $\beta$.
- Support. The Gamma distribution is supported on the set of positive real numbers.
- Probability density function. $f(y; \alpha, \beta) = \frac{1}{\Gamma(\alpha)} \frac{(\beta y)^{\alpha}}{y} e^{-\beta y}$
- Related distributions. The Exponential distribution is the special case of the Gamma distribution with $\alpha = 1$.
- Usage

| Package | Syntax |
| --- | --- |
| NumPy | np.random.gamma(alpha, 1/beta) |
| SciPy | scipy.stats.gamma(alpha, loc=0, scale=1/beta) |
| Stan | gamma(alpha, beta) |
- Notes.
- The Gamma distribution is useful as a prior for positive parameters. It imparts a heavier tail than the Half-Normal distribution (but not too heavy; it keeps parameters from growing too large), and allows the parameter value to come close to zero.
- SciPy has a location parameter, which should be set to zero, with $1/\beta$ being the scale parameter.
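The story and the rate-versus-scale convention can be checked numerically. The following is a minimal sketch, assuming an integer α so that the sum-of-exponentials story applies directly; the seed, sample size, and variable names are illustrative choices rather than part of the tutorial.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(seed=3252)  # arbitrary seed, for reproducibility only

alpha = 3      # number of arrivals (integer here so the story applies)
beta = 2.0     # rate of arrivals
n_samples = 100_000

# Each row holds alpha exponential inter-arrival times with rate beta.
# NumPy's exponential sampler takes the scale, which is 1/beta.
waits = rng.exponential(scale=1 / beta, size=(n_samples, alpha))
total_wait = waits.sum(axis=1)

# Frozen Gamma distribution in SciPy's shape/loc/scale parametrization.
gamma_dist = st.gamma(alpha, loc=0, scale=1 / beta)

# Direct Gamma draws from NumPy, which also expects shape and scale.
numpy_draws = rng.gamma(shape=alpha, scale=1 / beta, size=n_samples)

print("mean of summed waits:", total_wait.mean())
print("mean of NumPy draws: ", numpy_draws.mean())
print("theoretical mean:    ", gamma_dist.mean())   # alpha / beta
print("variance of summed waits:", total_wait.var())
print("theoretical variance:    ", gamma_dist.var())  # alpha / beta**2

# Kolmogorov-Smirnov statistic: a small value means the simulated sums
# lie close to the Gamma CDF.
ks_stat = st.kstest(total_wait, gamma_dist.cdf).statistic
print("KS statistic:", ks_stat)
```

If the rate were passed directly as the scale, the theoretical mean would become α·β instead of α/β, which is an easy way to spot a parametrization mix-up.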
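# distribution_plot_app and notebook_url are assumed to be defined earlier in the tutorial.
import bokeh.io
import scipy.stats as st

# Slider ranges for the shape (α) and rate (β) parameters.
params = [dict(name='α', start=1, end=5, value=2, step=0.01),
          dict(name='β', start=0.1, end=5, value=2, step=0.01)]
app = distribution_plot_app(x_min=0,
                            x_max=50,
                            scipy_dist=st.gamma,
                            params=params,
                            # Map (α, β) to SciPy's (shape, loc, scale); β is a rate, so scale = 1/β.
                            transform=lambda a, b: (a, 0, 1/b),
                            x_axis_label='y',
                            title='Gamma')
bokeh.io.show(app, notebook_url=notebook_url)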
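Outside a notebook, the density can also be evaluated without the interactive app. The sketch below writes out the standard Gamma density with shape α and rate β by hand and compares it with scipy.stats.gamma; the helper name gamma_pdf, the grid of y values, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np
import scipy.stats as st
from scipy.special import gammaln


def gamma_pdf(y, alpha, beta):
    """Gamma PDF with shape alpha and rate beta, for y > 0 (illustrative helper)."""
    y = np.asarray(y, dtype=float)
    # log f = alpha*log(beta*y) - log(y) - beta*y - log(Gamma(alpha));
    # working on the log scale avoids overflow for large alpha.
    log_pdf = alpha * np.log(beta * y) - np.log(y) - beta * y - gammaln(alpha)
    return np.exp(log_pdf)


alpha, beta = 2.0, 2.0
y = np.linspace(0.01, 10, 200)

by_hand = gamma_pdf(y, alpha, beta)

# SciPy expects a scale, so the rate beta enters as 1/beta.
scipy_rate = st.gamma.pdf(y, alpha, loc=0, scale=1 / beta)

# Passing the rate directly as the scale gives a different density.
scipy_wrong = st.gamma.pdf(y, alpha, loc=0, scale=beta)

print("matches with scale=1/beta:", np.allclose(by_hand, scipy_rate))
print("matches with scale=beta:  ", np.allclose(by_hand, scipy_wrong))
```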