Probability Distributions and their Stories

A basic skill for any statistical work is being able to compute with, and plot, the probability distributions commonly encountered in data science. This tutorial revisits the distributions introduced in this section, now expressed using the scipy.stats module.

Continuous distributions

Cauchy distribution

  • Story. The intercept on the x-axis of a beam of light coming from the point (μ, σ) at a uniformly random angle is Cauchy distributed. This story is popular in physics (and is one of the first examples of Bayesian inference in Sivia's book), but is not particularly useful in practice. You can think of the Cauchy distribution as a peaked distribution with enormously heavy tails.
  • Parameters. The Cauchy distribution is peaked, and its peak is located at μ. The width of the peak is set by the scale parameter σ > 0, which is the half-width of the peak at half its maximum height.
  • Support. The Cauchy distribution is supported on the set of real numbers.
  • Probability density function.

    \begin{align}f(y;\mu, \sigma) = \frac{1}{\pi \sigma}\,\frac{1}{1 + (y-\mu)^2/\sigma^2}\end{align}

  • Usage

    Package   Syntax
    NumPy     mu + sigma * np.random.standard_cauchy()
    SciPy     scipy.stats.cauchy(mu, sigma)
    Stan      cauchy(mu, sigma)

  • Related distributions.
    • The Cauchy distribution is a special case of the Student-t distribution in which the degrees of freedom ν = 1.
    • The numpy.random module only has the standard Cauchy distribution (μ = 0 and σ = 1), but you can draw from a general Cauchy distribution using the transformation shown in the NumPy usage above.
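
As a quick check of the density and the usage table above, the sketch below evaluates the PDF by hand, compares it with scipy.stats.cauchy, and then draws samples with the NumPy transformation. The values mu = 0.5 and sigma = 0.2, the seed, and the helper name cauchy_pdf are illustrative choices, not part of the tutorial. It uses a NumPy Generator instead of the legacy np.random.standard_cauchy() function, but the transformation is the same.

```python
import numpy as np
import scipy.stats as st

# Illustrative parameter values (assumed, not taken from the tutorial)
mu, sigma = 0.5, 0.2


def cauchy_pdf(y, mu, sigma):
    """Cauchy PDF written directly from the formula above (hypothetical helper)."""
    return 1.0 / (np.pi * sigma * (1.0 + (y - mu) ** 2 / sigma ** 2))


y = np.linspace(-2, 2, 401)

# SciPy parametrizes the Cauchy distribution by loc (mu) and scale (sigma)
pdf_matches = np.allclose(cauchy_pdf(y, mu, sigma), st.cauchy(mu, sigma).pdf(y))

# Draw samples by shifting and scaling standard Cauchy draws
rng = np.random.default_rng(3252)
samples = mu + sigma * rng.standard_cauchy(size=100_000)

# The mean of a Cauchy distribution is undefined, so look at quantiles instead:
# the median is mu and the quartiles are mu - sigma and mu + sigma.
q25, q50, q75 = np.percentile(samples, [25, 50, 75])
print(pdf_matches, q25, q50, q75)
```

The quartiles are used here because sample means of Cauchy draws do not settle down as the sample size grows, which is one practical consequence of the heavy tails.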
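
# distribution_plot_app and notebook_url are not defined in this section;
# they are assumed to come from earlier in the tutorial.
import bokeh.io
import scipy.stats as st

# Slider ranges and starting values for the two parameters
params = [dict(name='µ', start=-0.5, end=0.5, value=0, step=0.01),
          dict(name='σ', start=0.1, end=1.0, value=0.2, step=0.01)]
# Build the interactive plot over -2 <= y <= 2
app = distribution_plot_app(x_min=-2,
                            x_max=2,
                            scipy_dist=st.cauchy,
                            params=params,
                            x_axis_label='y',
                            title='Cauchy')
bokeh.io.show(app, notebook_url=notebook_url)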
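
The relationship to the Student-t distribution noted under related distributions can be checked numerically, and the heavy tails can be made concrete by comparing tail probabilities with a Normal distribution. This is a minimal sketch: the parameter values reuse the slider defaults, and the comparison against a Normal distribution with the same location and scale is an added illustration rather than something from the tutorial.

```python
import numpy as np
import scipy.stats as st

# Slider defaults from the plot above
mu, sigma = 0.0, 0.2

y = np.linspace(-2, 2, 401)

# Student-t with nu = 1 degree of freedom, same location and scale
t_pdf = st.t(1, loc=mu, scale=sigma).pdf(y)
cauchy_pdf = st.cauchy(mu, sigma).pdf(y)
same_pdf = np.allclose(t_pdf, cauchy_pdf)

# Probability of landing more than 10 scale units above the location
cauchy_tail = st.cauchy(mu, sigma).sf(mu + 10 * sigma)
normal_tail = st.norm(mu, sigma).sf(mu + 10 * sigma)
print(same_pdf, cauchy_tail, normal_tail)
```

The Cauchy tail probability comes out at roughly 3%, while the corresponding Normal tail probability is vanishingly small.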