Lecture 9

The Poisson Distribution

Remark

The Poisson probability distribution assigns probabilities to the counts of events that occur in a fixed interval of time or space. The events must be independent of each other: the occurrence of one event does not affect the probability of the occurrence of another event.

Example 1

  • Number of moss plants in a sampling quadrat on a hillside.
  • Number of sightings of a black squirrel in a back yard in a week in the summer.

    Formula

    Let $X$ be the number of occurrences of independent events in an interval in which we expect $\mu$ events to occur. Then the Poisson formula gives the probability of observing $x$ events as:
    $$ P(X=x) = \frac{e^{-\mu}\, \mu^x}{x!} \;;\quad x=0,\,1,\,2,\, \dots $$
    The mean of the Poisson distribution is $\mu$. The variance is also $\mu$: $\sigma^2 = \mu$.

    Example 2

    Tracking caribou herds from a helicopter, we expect to observe $3$ arctic foxes per day. What is the probability we will observe $5$ arctic foxes in a day?
    $$ P(X=5) = \frac{e^{-3}\, 3^5}{5!} = 0.1008 $$
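
The Poisson formula can be evaluated directly. The following is a minimal sketch in Python using only the standard library; the function name `poisson_pmf` is our own choice for illustration and does not come from the lecture. It reproduces the value in Example 2 and checks numerically, by truncating the infinite sum at $x = 59$, that the probabilities sum to about 1 and that the mean and variance are both about $\mu = 3$.

```python
import math

def poisson_pmf(x: int, mu: float) -> float:
    """Probability of observing exactly x events when mu events are expected."""
    return math.exp(-mu) * mu**x / math.factorial(x)

# Example 2: we expect 3 arctic foxes per day and ask for exactly 5.
p5 = poisson_pmf(5, 3)
print(round(p5, 4))  # 0.1008

# Numerical check of the mean and variance (sum truncated at x = 59;
# the remaining tail is negligible for mu = 3).
xs = range(60)
total = sum(poisson_pmf(x, 3) for x in xs)
mean = sum(x * poisson_pmf(x, 3) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, 3) for x in xs)
print(total, mean, var)  # all three should be very close to 1, 3 and 3
```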

    Example 3

    The number of potholes in $10\,\mathrm{m}$ stretches on the service road of HW15 has been recorded in a frequency table:
    $$ \begin{array}{c|ccccc} \text{Number of potholes} & 0 & 1 & 2 & 3 & 4 \\ \hline \text{Frequency} & 48 & 43 & 27 & 10 & 2 \end{array} $$

  • What is the probability that there will be $0$ potholes in the next $10\,\mathrm{m}$ of road?
  • What is the probability that there will be $2$ potholes in the next $10\,\mathrm{m}$ of road?

    Solution

    We will assume that the potholes occur independently and we will build a Poisson distribution to represent the number of potholes in a $10\,\mathrm{m}$ stretch of road. We will use the frequency table to approximate the mean of the distribution with the sample mean.
    $$ \mu = \bar{x} = \frac{0\cdot 48 + 1\cdot 43 +2\cdot 27 + 3\cdot 10 + 4\cdot 2 }{130} =1.038 $$
    In a later lecture we will learn how to check if the Poisson distribution is a good fit for the data. In this example we will simply use the Poisson distribution to make predictions.
    $$\begin{aligned} P(X=0) &=\frac{e^{-1.038}\cdot (1.038)^0}{0!}=0.354 \\ &\\ P(X=2) &=\frac{e^{-1.038}\cdot (1.038)^2}{2!}=0.191 \end{aligned} $$


    We can also organize the probabilities in a table and draw a histogram. $$ \begin{array}{c|cccccc} x & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline p(x) & 0.354 & 0.368 & 0.191 & 0.066 & 0.017 & 0.004 \end{array} $$
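
The probability table for the pothole data can be rebuilt from the frequency table in a few lines. This is a sketch under the same assumption as the solution (the sample mean stands in for $\mu$); the names `freq`, `xbar`, `poisson_pmf` and `table` are illustrative choices of ours. The code keeps the unrounded mean $135/130$, which gives the same values to three decimals as using $1.038$.

```python
import math

# Frequency table from Example 3: potholes per 10 m stretch -> number of stretches.
freq = {0: 48, 1: 43, 2: 27, 3: 10, 4: 2}

n = sum(freq.values())                          # 130 stretches in total
xbar = sum(x * f for x, f in freq.items()) / n  # sample mean, used as mu

def poisson_pmf(x, mu):
    """Poisson probability of exactly x events when mu are expected."""
    return math.exp(-mu) * mu**x / math.factorial(x)

# Probabilities for x = 0, ..., 5, as in the table above.
table = {x: poisson_pmf(x, xbar) for x in range(6)}
for x, p in table.items():
    print(x, round(p, 3))
```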

    Remark

    In a Poisson distribution, the mean and variance are equal. Thus the coefficient of dispersion is 1:
    $$CD=\frac{\sigma^2}{\mu}=1$$
    We can use this fact for a quick test of whether a sample is generated from a Poisson distribution. For a sample of data, if the coefficient of dispersion is near 1, then the Poisson distribution model is likely a good fit for the data.

    Example 4

    Consider again the data on the occurrence of potholes on a service road. We already computed the sample mean: $\bar{x}=1.038$. For the sample variance we will use the following formula, which takes into account the frequency $f$ of each value:
    $$\begin{aligned} s^2 &=\frac{\sum f(x-\bar{x})^2}{\left(\sum f\right)-1} \\ &=\frac{48(0-1.038)^2+43(1-1.038)^2+27(2-1.038)^2+10(3-1.038)^2+2(4-1.038)^2}{129} \\ &=1.030 \end{aligned} $$
    Since $s^2 \approx \bar{x}$, the coefficient of dispersion is $CD = s^2/\bar{x} \approx 0.99$, so the Poisson distribution looks like a good fit for these data.
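
The frequency-weighted variance and the coefficient of dispersion can be computed together. The sketch below is one way to do it in plain Python; the function name `dispersion` is hypothetical and chosen only for this illustration. It works from a frequency table and divides by $\left(\sum f\right) - 1$, as in the formula above.

```python
# Frequency table for the pothole data: value -> frequency.
freq = {0: 48, 1: 43, 2: 27, 3: 10, 4: 2}

def dispersion(freq):
    """Return (sample mean, sample variance, coefficient of dispersion)."""
    n = sum(freq.values())
    xbar = sum(x * f for x, f in freq.items()) / n
    # Each squared deviation is weighted by how often the value occurs.
    s2 = sum(f * (x - xbar) ** 2 for x, f in freq.items()) / (n - 1)
    return xbar, s2, s2 / xbar

xbar, s2, cd = dispersion(freq)
print(round(xbar, 3), round(s2, 3), round(cd, 3))  # 1.038 1.03 0.991
```

The same function applies to any frequency table of counts, so it can be reused for other data sets by changing `freq`.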

    Example 5

    This is a classical example. The data show the number of men killed by horse kicks in ten Prussian Army Corps over 20 years (Bortkiewicz, 1898). Ten corps observed for twenty years give $10\times 20=200$ observations, one per corps per year. Here $x$ is the number of deaths in one corps in one year:
    $$ \begin{array}{c|ccccc} x & 0 & 1 & 2 & 3 & 4 \\ \hline \text{Freq.} & 109 & 65 & 22 & 3 & 1 \end{array} $$
    The sample mean is:
    $$\bar{x}=\frac{109\cdot 0+65\cdot 1+22\cdot 2+3\cdot 3+1\cdot 4}{200}=0.61$$
    The sample variance is:
    $$\begin{aligned} s^2&=\frac{109(0-0.61)^2+65(1-0.61)^2+22(2-0.61)^2+3(3-0.61)^2+1(4-0.61)^2}{200-1} \\ &=0.611 \end{aligned} $$
    The coefficient of dispersion is $CD=\frac{s^2}{\bar{x}}=\frac{0.611}{0.61}\approx 1.0016$, which is very close to 1, so the Poisson distribution looks like a good fit.

    We can make predictions using the Poisson distribution: What is the probability that four men in a given corps will be killed by horse kicks in a given year?
    $$P(X=4)=\frac{e^{-0.61}\cdot (0.61)^4}{4!}=0.003 $$
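
As an informal illustration of such predictions, the sketch below evaluates the Poisson probabilities for the horse-kick data with $\mu = 0.61$ and multiplies each by the 200 observations to get the count the model would predict for each value of $x$. The names `observed`, `expected` and `poisson_pmf` are our own, and placing predicted counts next to observed ones is only an eyeball comparison, not the formal goodness-of-fit check deferred to a later lecture.

```python
import math

# Observed frequencies (Bortkiewicz, 1898): deaths per corps per year -> count.
observed = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}
n = sum(observed.values())  # 200 observations
mu = 0.61                   # sample mean computed above

def poisson_pmf(x, mu):
    """Poisson probability of exactly x events when mu are expected."""
    return math.exp(-mu) * mu**x / math.factorial(x)

# Probability of four deaths in one corps in one year.
print(round(poisson_pmf(4, mu), 3))  # 0.003

# Counts predicted by the model next to the observed counts.
expected = {x: n * poisson_pmf(x, mu) for x in observed}
for x in observed:
    print(x, observed[x], round(expected[x], 1))
```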