In any module that deals with statistics, a basic skill is being able to program and plot the probability distributions commonly encountered in data science. This tutorial revisits the distributions introduced in this section, now expressed using the scipy.stats module.
Discrete distributions
Binomial distribution
- Story. We perform \(N\) Bernoulli trials, each with probability \(θ\) of success. The number of successes, \(n\), is Binomially distributed.
- Example. Distribution of plasmids between daughter cells in cell division. Each of the \(N\) plasmids has a chance \(θ\) of being in daughter cell 1 ("success"). The number of plasmids, \(n\), in daughter cell 1 is Binomially distributed.
- Parameters. There are two parameters: the probability \(θ\) of success for each Bernoulli trial, and the number of trials, \(N\).
- Support. The Binomial distribution is supported on the set of nonnegative integers \(n \le N\), that is, \(n \in \{0, 1, \ldots, N\}\).
- Probability mass function.
\(f(n;N,\theta) = \binom{N}{n}\, \theta^n (1-\theta)^{N-n}\).
- Usage

| Package | Syntax |
| --- | --- |
| NumPy | np.random.binomial(N, theta) |
| SciPy | scipy.stats.binom(N, theta) |
| Stan | binomial(N, theta) |

- Related distributions.
  - The Bernoulli distribution is a special case of the Binomial distribution where \(N=1\).
  - In the limit of \(N→∞\) and \(θ→0\) such that the quantity \(Nθ\) is fixed, the Binomial distribution becomes a Poisson distribution with parameter \(Nθ\).
  - The Binomial distribution is a limit of the Hypergeometric distribution. Considering the Hypergeometric distribution and taking the limit of \(a+b→∞\) such that \(a/(a+b)\) is fixed, we get a Binomial distribution with parameters \(N=N\) and \(θ=a/(a+b)\).
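
The Poisson limit listed above can be checked numerically. The following is a minimal sketch, not part of the original tutorial: the value `lam = 3.0`, the comparison grid `n`, and the helper name `max_pmf_gap` are illustrative assumptions. It holds \(Nθ\) fixed while \(N\) grows and reports the largest pointwise gap between the two probability mass functions.

```python
import numpy as np
import scipy.stats as st

lam = 3.0               # fixed product N * theta (illustrative value, an assumption)
n = np.arange(0, 21)    # counts at which the two PMFs are compared


def max_pmf_gap(N, lam=lam):
    """Largest absolute difference between the Binomial(N, lam/N)
    and Poisson(lam) PMFs on the grid n (hypothetical helper)."""
    theta = lam / N
    return np.max(np.abs(st.binom.pmf(n, N, theta) - st.poisson.pmf(n, lam)))


# Increase N while theta = lam / N shrinks, so that N * theta stays fixed
gaps = {N: max_pmf_gap(N) for N in (10, 100, 1000, 10000)}
for N, gap in gaps.items():
    print(f"N = {N:>5d}, theta = {lam / N:.4f}, max |binom - poisson| = {gap:.2e}")
```

The printed gap should shrink as \(N\) increases, which is the behavior the limit describes.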
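# Assumes distribution_plot_app and notebook_url are defined earlier in the tutorial.
import bokeh.io
import scipy.stats as st

# Slider settings for the two parameters of the Binomial distribution
params = [dict(name='N', start=1, end=20, value=5, step=1),
          dict(name='θ', start=0, end=1, value=0.5, step=0.01)]

# Build and display the interactive Binomial plot
app = distribution_plot_app(x_min=0,
                            x_max=20,
                            scipy_dist=st.binom,
                            params=params,
                            x_axis_label='n',
                            title='Binomial')
bokeh.io.show(app, notebook_url=notebook_url)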
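
For a static check that does not rely on the `distribution_plot_app` helper, the sketch below evaluates the probability mass function with `scipy.stats` and compares it with the formula given earlier. It uses the slider defaults (\(N=5\), \(θ=0.5\)). The sample size, the seed, and the use of NumPy's `default_rng` generator in place of `np.random.binomial` are assumptions made for illustration.

```python
import numpy as np
import scipy.stats as st
from scipy.special import comb

N, theta = 5, 0.5          # same defaults as the sliders above
n = np.arange(0, N + 1)    # support of the distribution: 0, 1, ..., N

# PMF as computed by scipy.stats
pmf_scipy = st.binom.pmf(n, N, theta)

# PMF written out from the formula: C(N, n) * theta^n * (1 - theta)^(N - n)
pmf_formula = comb(N, n) * theta**n * (1 - theta)**(N - n)

# Draw samples with NumPy's Generator API and tabulate relative frequencies
rng = np.random.default_rng(seed=0)
samples = rng.binomial(N, theta, size=100_000)
empirical = np.bincount(samples, minlength=N + 1) / samples.size

for k in n:
    print(f"n = {k}: scipy = {pmf_scipy[k]:.4f}, "
          f"formula = {pmf_formula[k]:.4f}, sampled = {empirical[k]:.4f}")
```

The first two columns should agree to floating-point precision, and the sampled frequencies should approach them as the number of draws grows.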