Probability Distributions and their Stories

A basic skill in any statistical work is being able to compute and plot the probability distributions commonly encountered in data science. This tutorial revisits the distributions introduced in this section, now expressed using the scipy.stats module.

Continuous distributions

Beta distribution

  • Story. Say you wait for two multistep processes to happen. The individual steps of each process happen at the same rate, but the first multistep process requires α steps and the second requires β steps. The fraction of the total waiting time taken by the first process is Beta distributed.
  • Example.
  • Parameters. There are two parameters, both strictly positive: α and β, defined in the above story.
  • Support. The Beta distribution has support on the interval [0, 1].
  • Probability density function.

    \begin{align}f(\theta;\alpha, \beta) = \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha,\beta)}\end{align}

    where

    \begin{align}B(\alpha,\beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)}\end{align}

    is the Beta function.
  • Usage

    Package   Syntax
    NumPy     np.random.beta(alpha, beta)
    SciPy     scipy.stats.beta(alpha, beta)
    Stan      beta(alpha, beta)


  • Related distributions.
    • The Uniform distribution on the interval [0, 1] is a special case of the Beta distribution with α=β=1.
  • Notes.
    • The story of the Beta distribution is difficult to parse. Most importantly for our purposes, the Beta distribution allows us to put probabilities on unknown probabilities. It is only defined on 0≤θ≤1, and θ here can be interpreted as a probability, say of success in a Bernoulli trial.
    • The case where α=β=0 is not technically a probability distribution because the PDF cannot be normalized. Nonetheless, it can be used as an improper prior, known as the Haldane prior, named after biologist J. B. S. Haldane. The case where α=β=1/2 is sometimes called a Jeffreys prior.
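
The story above can be checked numerically. The following is a minimal sketch, not a definitive implementation: it assumes each individual step is a memoryless (exponentially distributed) wait, and the values of alpha, beta, rate, and the seed are illustrative choices that do not come from the text. It simulates the two multistep waiting times and compares the resulting fraction against scipy.stats.beta.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(seed=3252)

alpha, beta = 3, 5    # steps required by the first and second process (illustrative)
rate = 1.0            # common rate of the individual steps (assumed value)
n_samples = 100_000

# Assumption: each step is an exponential wait, so the waiting time of a
# k-step process is the sum of k exponential draws.
t_first = rng.exponential(scale=1 / rate, size=(n_samples, alpha)).sum(axis=1)
t_second = rng.exponential(scale=1 / rate, size=(n_samples, beta)).sum(axis=1)

# Fraction of the total waiting time taken by the first process
frac = t_first / (t_first + t_second)

# Compare the simulated fractions with the Beta(alpha, beta) distribution
ks = st.kstest(frac, st.beta(alpha, beta).cdf)
print("simulated mean:  ", frac.mean())
print("Beta mean:       ", alpha / (alpha + beta))
print("KS statistic:    ", ks.statistic)
```

A small Kolmogorov-Smirnov statistic indicates that the simulated fractions are close to the Beta distribution with the same α and β.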

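The probability density function given above can also be evaluated directly and compared with scipy.stats. This is a sketch under stated assumptions: beta_pdf is a hypothetical helper written for this illustration, and the parameter values are arbitrary. The last lines check the special case α=β=1, which should give the Uniform density on [0, 1].

```python
import numpy as np
import scipy.special
import scipy.stats as st


def beta_pdf(theta, alpha, beta):
    """Beta PDF computed term by term from the formula (hypothetical helper)."""
    numerator = theta ** (alpha - 1) * (1 - theta) ** (beta - 1)
    return numerator / scipy.special.beta(alpha, beta)


alpha, beta = 2.5, 4.0                 # arbitrary illustrative values
theta = np.linspace(0.01, 0.99, 99)    # stay inside the open interval (0, 1)

# The Beta function written in terms of Gamma functions
B_from_gamma = (
    scipy.special.gamma(alpha) * scipy.special.gamma(beta)
    / scipy.special.gamma(alpha + beta)
)

pdf_formula = beta_pdf(theta, alpha, beta)
pdf_scipy = st.beta.pdf(theta, alpha, beta)
max_diff = np.max(np.abs(pdf_formula - pdf_scipy))
print("largest difference from scipy.stats:", max_diff)

# Special case: alpha = beta = 1 is the Uniform distribution on [0, 1]
uniform_pdf = st.beta.pdf(theta, 1, 1)
print("PDF is identically 1:", np.allclose(uniform_pdf, 1.0))
```
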
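import bokeh.io
import scipy.stats as st

# distribution_plot_app and notebook_url are assumed to be defined earlier
# in the notebook; they are not part of SciPy or Bokeh.

# Slider settings for the two shape parameters
params = [dict(name='α', start=0.01, end=10, value=1, step=0.01),
          dict(name='β', start=0.01, end=10, value=1, step=0.01)]
app = distribution_plot_app(x_min=0,
                            x_max=1,
                            scipy_dist=st.beta,
                            params=params,
                            x_axis_label='θ',
                            title='Beta')
bokeh.io.show(app, notebook_url=notebook_url)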
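
For a non-interactive look at the NumPy and SciPy syntax from the usage table, the sketch below draws samples both ways and queries a few properties of the distribution. The parameter values and seed are illustrative assumptions. It uses NumPy's Generator interface (rng.beta), which takes the same alpha and beta arguments as np.random.beta.

```python
import numpy as np
import scipy.stats as st

alpha, beta = 2.0, 5.0                    # illustrative parameter values
rng = np.random.default_rng(seed=0)

# NumPy: draw samples directly
samples_np = rng.beta(alpha, beta, size=50_000)

# SciPy: "freeze" the distribution, then sample from it or query it
dist = st.beta(alpha, beta)
samples_sp = dist.rvs(size=50_000, random_state=rng)

print("analytical mean:", dist.mean())    # alpha / (alpha + beta)
print("NumPy sample mean:", samples_np.mean())
print("SciPy sample mean:", samples_sp.mean())

# The CDF and its inverse (the percent point function) undo each other
median = dist.ppf(0.5)
print("median:", median, "CDF at median:", dist.cdf(median))
```

Freezing the distribution with st.beta(alpha, beta) avoids passing the parameters again on every call to pdf, cdf, ppf, or rvs.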