Probability Distributions and their Stories
Discrete distributions
Negative Binomial distribution
- Story. We perform a series of Bernoulli trials. The number of failures, y, before we get α successes is Negative Binomially distributed. An equivalent story is that the sum of α independent and identically Geometrically distributed variables is Negative Binomially distributed.
- Example. Bursty gene expression can give mRNA count distributions that are Negative Binomially distributed. Here, "success" is that a burst in gene expression stops. So, the parameter 1/β is the mean length of a burst in expression. The parameter α is related to the frequency of the bursts. If multiple bursts are possible within the lifetime of mRNA, then α > 1. Then, the number of "failures" is the number of mRNA transcripts that are made in the characteristic lifetime of mRNA.
- Parameters. There are two parameters: α, the desired number of successes, and β, which sets the probability of success of each Bernoulli trial, β/(1+β). Equivalently, 1/β is the mean number of failures before each success.
- Support. The Negative Binomial distribution is supported on the set of nonnegative integers.
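- Probability mass function.
$$f(y;\alpha,\beta) = \binom{y+\alpha-1}{\alpha-1} \left(\frac{\beta}{1+\beta}\right)^{\alpha} \left(\frac{1}{1+\beta}\right)^{y}.$$
Here, we use a combinatorial notation;
$$\binom{y+\alpha-1}{\alpha-1} = \frac{(y+\alpha-1)!}{(\alpha-1)!\,y!}.$$
Generally speaking, α need not be an integer, so we may write the PMF as
$$f(y;\alpha,\beta) = \frac{\Gamma(y+\alpha)}{\Gamma(\alpha)\,y!} \left(\frac{\beta}{1+\beta}\right)^{\alpha} \left(\frac{1}{1+\beta}\right)^{y}.$$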
- Usage
Package | Syntax |
---|---|
NumPy | np.random.negative_binomial(alpha, beta/(1+beta)) |
SciPy | scipy.stats.nbinom(alpha, beta/(1+beta)) |
Stan | neg_binomial(alpha, beta) |
Stan with (μ, ϕ) parametrization | neg_binomial_2(mu, phi) |
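
As a minimal sketch of the NumPy and SciPy calls in the table, the snippet below draws samples and compares SciPy's PMF with the Gamma-function form of the PMF. The values of alpha and beta, the seed, and the helper name nbinom_pmf are illustrative assumptions, and the sketch uses NumPy's Generator interface rather than the legacy np.random.negative_binomial function listed in the table.

```python
import numpy as np
import scipy.stats
from scipy.special import gammaln

alpha, beta = 5.0, 1.5      # illustrative values, not taken from the text
p = beta / (1 + beta)       # success probability expected by NumPy and SciPy

# Draw samples with NumPy's Generator interface
rng = np.random.default_rng(seed=3252)
samples = rng.negative_binomial(alpha, p, size=100_000)

# Frozen SciPy distribution with the same parameters
dist = scipy.stats.nbinom(alpha, p)


def nbinom_pmf(y, alpha, beta):
    """PMF in the (alpha, beta) parametrization, valid for non-integer alpha."""
    log_pmf = (
        gammaln(y + alpha) - gammaln(alpha) - gammaln(y + 1)
        + alpha * np.log(beta / (1 + beta))
        - y * np.log(1 + beta)
    )
    return np.exp(log_pmf)


y = np.arange(0, 30)
pmf_matches = np.allclose(dist.pmf(y), nbinom_pmf(y, alpha, beta))
mean_exact = dist.mean()        # equals alpha / beta
mean_sampled = samples.mean()   # close to alpha / beta for many samples
```

Working with the logarithm of the PMF through gammaln avoids overflow in the Gamma functions when y or alpha is large.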
- Related distributions.
- The Geometric distribution is a special case of the Negative Binomial distribution in which α = 1 and the probability of success is θ = β/(1+β).
- The continuous analog of the Negative Binomial distribution is the Gamma distribution.
- In a certain limit, which is more easily taken using the (μ, ϕ) parametrization below, the Negative Binomial distribution becomes a Poisson distribution.
- Notes.
- The Negative Binomial distribution may be parametrized such that the probability mass function is
$$f(y;\mu,\phi) = \frac{\Gamma(y+\phi)}{\Gamma(\phi)\,y!} \left(\frac{\phi}{\mu+\phi}\right)^{\phi} \left(\frac{\mu}{\mu+\phi}\right)^{y}.$$
These parameters are related to the parametrization above by ϕ = α and μ = α/β. In the limit of ϕ → ∞, which can be taken for the PMF, the Negative Binomial distribution becomes Poisson with parameter μ. This also gives meaning to the parameters μ and ϕ: μ is the mean of the Negative Binomial, and ϕ controls extra width of the distribution beyond Poisson. The smaller ϕ is, the broader the distribution.
- In Stan, the Negative Binomial distribution using the (μ, ϕ) parametrization is called neg_binomial_2.
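- SciPy and NumPy use yet another parametrization. The PMF for SciPy is
$$f(y;n,p) = \frac{\Gamma(y+n)}{\Gamma(n)\,y!}\, p^{n} (1-p)^{y}.$$
The parameter p is the probability of success of a Bernoulli trial. The parameters are related to the others we have defined by n = α = ϕ and p = β/(1+β) = ϕ/(μ+ϕ).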
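
The relations among the three parametrizations, and the Poisson limit, can be checked numerically. The following is a sketch under stated assumptions: the conversion helper names and the parameter values are hypothetical and chosen only for illustration, and the Poisson comparison holds μ fixed while ϕ grows.

```python
import numpy as np
import scipy.stats


def alpha_beta_to_mu_phi(alpha, beta):
    """(alpha, beta) to (mu, phi): mu = alpha / beta, phi = alpha."""
    return alpha / beta, alpha


def mu_phi_to_n_p(mu, phi):
    """(mu, phi) to SciPy's (n, p): n = phi, p = phi / (mu + phi)."""
    return phi, phi / (mu + phi)


def alpha_beta_to_n_p(alpha, beta):
    """(alpha, beta) to SciPy's (n, p): n = alpha, p = beta / (1 + beta)."""
    return alpha, beta / (1 + beta)


alpha, beta = 3.0, 0.5   # illustrative values
mu, phi = alpha_beta_to_mu_phi(alpha, beta)

# Both routes to (n, p) should agree
n_p_direct = alpha_beta_to_n_p(alpha, beta)
n_p_via_mu_phi = mu_phi_to_n_p(mu, phi)

# Poisson limit: hold mu fixed and let phi grow
y = np.arange(0, 40)
poisson_pmf = scipy.stats.poisson.pmf(y, mu)
max_abs_diff = {}
for phi_val in [1.0, 10.0, 1e3, 1e6]:
    n, p = mu_phi_to_n_p(mu, phi_val)
    nb_pmf = scipy.stats.nbinom.pmf(y, n, p)
    max_abs_diff[phi_val] = np.max(np.abs(nb_pmf - poisson_pmf))
```

The entries of max_abs_diff shrink as ϕ increases, which is the numerical counterpart of the statement that smaller ϕ gives a broader distribution than Poisson.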
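import bokeh.io
import scipy.stats as st  # provides st.nbinom

params = [dict(name='α', start=1, end=20, value=5, step=1),
          dict(name='β', start=0, end=5, value=1, step=0.01)]
app = distribution_plot_app(x_min=0,  # helper that is not defined in this section
                            x_max=50,
                            scipy_dist=st.nbinom,
                            params=params,
                            transform=lambda alpha, beta: (alpha, beta/(1+beta)),  # (α, β) to SciPy's (n, p)
                            x_axis_label='y',
                            title='Negative Binomial')
bokeh.io.show(app, notebook_url=notebook_url)  # notebook_url is not defined in this section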
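
The story for this distribution can also be checked by direct simulation. The sketch below runs Bernoulli trials until α successes occur, records the number of failures, and compares the empirical frequencies with SciPy's PMF. The function name failures_before_successes, the parameter values, the seed, and the sample count are all illustrative assumptions.

```python
import numpy as np
import scipy.stats


def failures_before_successes(alpha, theta, rng):
    """Run Bernoulli trials until `alpha` successes; return the failure count."""
    successes = 0
    failures = 0
    while successes < alpha:
        if rng.random() < theta:
            successes += 1
        else:
            failures += 1
    return failures


rng = np.random.default_rng(seed=1)
alpha, beta = 4, 2.0          # illustrative values; alpha must be an integer here
theta = beta / (1 + beta)     # probability of success of each trial
n_samples = 20_000

counts = np.array(
    [failures_before_successes(alpha, theta, rng) for _ in range(n_samples)]
)

# Compare empirical frequencies with the Negative Binomial PMF
y = np.arange(0, 8)
empirical = np.array([(counts == k).mean() for k in y])
exact = scipy.stats.nbinom.pmf(y, alpha, theta)
max_abs_error = np.max(np.abs(empirical - exact))
```

The direct simulation requires an integer α, whereas the Gamma-function form of the PMF does not.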