Probability Distributions and their Stories
In any module that deals with statistics, one basic skill you need is the ability to compute and plot the probability distributions commonly encountered in data science. This tutorial revisits the distributions introduced in this section, now expressed using the scipy.stats module.
Discrete distributions
Poisson distribution
- Story. Rare events occur with a rate $\lambda$ per unit time. There is no "memory" of previous events; i.e., that rate is independent of time. A process that generates such events is called a Poisson process. The occurrence of a rare event in this context is referred to as an arrival. The number $n$ of arrivals in unit time is Poisson distributed.
- Example. The number of mutations in a strand of DNA per unit length is Poisson distributed, since mutations are rare.
- Parameter. The single parameter $\lambda$ is the rate at which the rare events occur.
- Support. The Poisson distribution is supported on the set of nonnegative integers.
- Probability mass function. $f(n;\lambda) = \dfrac{\lambda^n}{n!}\,\mathrm{e}^{-\lambda}$
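- Usage

Package | Syntax |
---|---|
NumPy | np.random.poisson(lam) |
SciPy | scipy.stats.poisson(lam) |
Stan | poisson(lam) |

- Related distributions.
- Consider a Binomial distribution with $N$ trials and success probability $\theta$. In the limit of $N \to \infty$ and $\theta \to 0$ such that the quantity $N\theta$ is fixed, the Binomial distribution becomes a Poisson distribution with parameter $N\theta$. Thus, for large $N$ and small $\theta$, $f_\mathrm{Poisson}(n;\lambda) \approx f_\mathrm{Binomial}(n;N,\theta)$ with $\lambda = N\theta$.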
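The probability mass function and the Binomial limit can be checked numerically. The following is a minimal sketch, assuming the standard Poisson PMF $f(n;\lambda) = \lambda^n \mathrm{e}^{-\lambda}/n!$ and the usual Binomial parameters $N$ (number of trials) and $\theta$ (success probability). The values of `lam`, `N`, and the range of `n` are arbitrary choices for illustration, not values prescribed by the tutorial.

```python
import math

import numpy as np
import scipy.stats as st

lam = 5.0          # illustrative rate (assumed value)
n = np.arange(41)  # counts 0, 1, ..., 40 (assumed range)

# PMF as computed by SciPy; the shape parameter of st.poisson is the rate
pmf_scipy = st.poisson.pmf(n, lam)

# PMF written out directly from lam**k * exp(-lam) / k!
pmf_formula = np.array(
    [lam**k * math.exp(-lam) / math.factorial(k) for k in range(41)]
)
max_abs_diff = np.max(np.abs(pmf_scipy - pmf_formula))

# Binomial with many trials and a small success probability,
# chosen so that N * theta equals lam
N = 10_000
theta = lam / N
pmf_binom = st.binom.pmf(n, N, theta)
max_binom_gap = np.max(np.abs(pmf_binom - pmf_scipy))

print("largest |SciPy - formula|:", max_abs_diff)
print("largest |Binomial - Poisson|:", max_binom_gap)
```

Both printed differences should be small. Decreasing `N` while holding `N * theta` fixed should make the second one grow, which is one way to see where the approximation starts to break down.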
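import bokeh.io
import scipy.stats as st

params = [dict(name='λ', start=1, end=20, value=5, step=0.1)]  # slider settings for the rate λ
app = distribution_plot_app(x_min=0,  # distribution_plot_app and notebook_url are assumed to be defined earlier in the tutorial
                            x_max=40,
                            scipy_dist=st.poisson,
                            params=params,
                            x_axis_label='n',
                            title='Poisson')
bokeh.io.show(app, notebook_url=notebook_url)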
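As a non-interactive complement to the app above, one can draw Poisson samples and compare their empirical frequencies with the PMF from scipy.stats. This is only a sketch under stated assumptions: it uses NumPy's `default_rng` generator rather than the `np.random.poisson` call listed in the usage table, and the seed, rate, and sample size are arbitrary illustrative choices.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(seed=3252)  # arbitrary seed, for reproducibility
lam = 5.0                               # illustrative rate (assumed value)
n_samples = 100_000                     # illustrative sample size (assumed)

# Draw the number of arrivals in unit time, many times over
samples = rng.poisson(lam, size=n_samples)

# Empirical frequency of each count from 0 through 40
n = np.arange(41)
counts = np.bincount(samples, minlength=41)[:41]
empirical = counts / n_samples

# Theoretical PMF on the same support
pmf = st.poisson.pmf(n, lam)
max_gap = np.max(np.abs(empirical - pmf))

# For a Poisson distribution, both the mean and the variance equal lam
sample_mean = samples.mean()
sample_var = samples.var()

print("largest |empirical - PMF|:", max_gap)
print("sample mean:", sample_mean, "sample variance:", sample_var)
```

The sample mean and variance should both land close to `lam`, and the empirical frequencies should track the PMF more tightly as `n_samples` grows.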