Chi-Square Goodness of Fit Test
Let us formally analyze the historical dataset on the number of men killed by horse kicks in the Prussian Army. The sample has $n=200$ observations.
$$ \begin{array}{c|ccccc} X & 0 & 1 & 2 & 3 & 4 \\ \hline \text{freq.} & 109 & 65 & 22 & 3 & 1 \end{array} $$
Previously we computed
$$ \overline{X}=0.61; \quad s^2=0.611; \quad CD=1.002 $$
The coefficient of dispersion $CD$ is close to 1, so we expect the Poisson distribution to be a good fit for this data.
To run the $\chi^2$ goodness of fit test, we need to compute the expected frequency for each category. The expected frequency is given by $E_i=nP(x_i)=200\, P(x_i)$, where $P(x_i)$ is the probability of the $i^{\text{th}}$ category. These probabilities are computed using the Poisson distribution with parameter $\mu=\overline{X} = 0.61$.
$$ \begin{array}{c|cccc} x & 0 & 1 & 2 & \geq 3 \\ \hline \text{observed freq.} & 109 & 65 & 22 & 4 \\ \hline P(x) & 0.543 & 0.331 & 0.101 & 0.024 \\ \hline \text{predicted freq.} & 108.7 & 66.3 & 20.2 & 4.8 \end{array} $$
Notice that the last two columns of the data table were merged, so the final category is $x \geq 3$ and its probability is $1-P(0)-P(1)-P(2)$. In the $\chi^2$ goodness of fit test, we need at least 3 observations in each category. Cells containing fewer than 3 counts should be merged.
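The expected frequencies in the table can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the helper name `poisson_pmf` is an illustrative choice, and the tail category $x \geq 3$ is assumed to be computed as the complement of the first three probabilities.

```python
import math

n = 200     # sample size from the text
mu = 0.61   # sample mean, used as the Poisson parameter

def poisson_pmf(k, mu):
    """P(X = k) for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

# Categories 0, 1, 2 and the merged tail ">= 3".
probs = [poisson_pmf(k, mu) for k in range(3)]
probs.append(1.0 - sum(probs))       # P(X >= 3) as the complement

expected = [n * p for p in probs]    # E_i = n * P(x_i)

for label, p, e in zip(["0", "1", "2", ">=3"], probs, expected):
    print(f"x = {label:>3}: P = {p:.3f}, expected = {e:.1f}")
```

Rounded to one decimal place, the values in `expected` should agree with the predicted frequencies in the table.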
Next we run the $\chi^2$ goodness of fit test. The null and alternative hypotheses are:
$\; H_0:$ Poisson is an appropriate model.
$\; H_1:$ Poisson is not an appropriate model.
The test statistic is given by
$$\begin{aligned} \chi^2 &=\sum \frac{\left(O_i-E_i\right)^2}{E_i} \\ &=\frac{(109-108.7)^2}{108.7}+\frac{(65-66.3)^2}{66.3}+\frac{(22-20.2)^2}{20.2}+\frac{(4-4.8)^2}{4.8} \\ &=0.331 \end{aligned}$$
where $O_i$ is the observed frequency and $E_i$ is the expected frequency of the $i^{\text{th}}$ cell. The degrees of freedom for this test are equal to $k-m-1$, where $k$ is the number of cells and $m$ is the number of parameters we estimated from the data. In this case, $m=1$ because we estimated one parameter, the population mean $\mu$, from the data. So, $df=4-1-1=2$.
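The sum and the degrees of freedom can be checked numerically. The sketch below plugs in the rounded observed and predicted frequencies from the table; the variable names are illustrative only.

```python
observed = [109, 65, 22, 4]
expected = [108.7, 66.3, 20.2, 4.8]   # rounded predicted frequencies from the table

# One term (O_i - E_i)^2 / E_i per cell.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(terms)

k = len(observed)   # number of cells after merging
m = 1               # parameters estimated from the data (the mean mu)
df = k - m - 1

print(f"chi2 = {chi2:.3f}, df = {df}")
```

Because the inputs are rounded, the printed statistic may differ slightly from the value quoted in the text; the largest contribution comes from the $x=2$ cell.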
Next we look up the $p$-value for the test statistic in Excel or in a $\chi^2$ table. The $p$-value is the probability of observing a test statistic at least as extreme as the one we observed, assuming the null hypothesis is true. In this case, Excel tells us that the $p$-value is $0.847$; a $\chi^2$ table only brackets it.
$$ p\text{-value} =0.847 \qquad 0.50 \leq p\text{-value} \leq 0.90 $$
Observing a dataset as extreme as the one we observed is not rare. Since the $p$-value is more than $0.05$, we fail to reject the null hypothesis $H_0$ that the Poisson is an appropriate distribution. There is no indication that the Poisson distribution is not an appropriate fit for this data.
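For readers without Excel, the same upper-tail probability can be obtained in Python. The sketch below relies on the closed form of the $\chi^2$ upper tail for an even number of degrees of freedom, $P(\chi^2_{df} \geq x) = e^{-x/2}\sum_{j=0}^{df/2-1} \frac{(x/2)^j}{j!}$; the function name `chi2_sf_even_df` is hypothetical, and odd $df$ is deliberately not handled.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(chi2_df >= x), valid for even positive df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    half = x / 2.0
    partial_sum = sum(half**j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * partial_sum

p_value = chi2_sf_even_df(0.331, 2)   # statistic and df from the text
print(f"p-value = {p_value:.3f}")
```

For $df=2$ the formula reduces to $e^{-x/2}$, which makes this case easy to verify by hand.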
Consider again the dataset consisting of $200$ families with $3$ children each. Let $X=$ the number of boys in a family. A standard computation yields
$$ \overline{X}=1.52; \quad s^2=0.753; \quad CD=0.5 $$
The coefficient of dispersion $CD$ is less than 1, so we expect that the Binomial distribution is a good fit for this data. Since the mean of a Binomial distribution is $\mu = np$, with $n=3$ children per family, we estimate the parameter $p = \frac{\mu}{n}$ with $p=\frac{\overline{X}}{3}=0.5067$. Next we compute the probabilities and the expected frequency for each category using the Binomial formula with parameters $p=0.5067$ and $n=3$. The expected frequency for each category is given by $E_i=200\cdot P(x_i)$.
$$ \begin{array}{c|cccc} x & 0 & 1 & 2 & 3 \\ \hline \text{Observed freq.} & 24 & 74 & 76 & 26 \\ \hline P(x) & 0.120 & 0.370 & 0.380 & 0.130 \\ \hline \text{Predicted freq.} & 24.01 & 73.99 & 75.99 & 26.01 \end{array} $$
The value of the test statistic is
$$ \chi^2 =\frac{(24-24.01)^2}{24.01}+\cdots+\frac{(26-26.01)^2}{26.01}=0.00002; \quad df=4-1-1=2 $$
The $p$-value for the test statistic is
$$ p\text{-value} =0.9999; \quad p\text{-value} > 0.95 $$
The null hypothesis is that the Binomial is an appropriate distribution. Since the $p$-value is more than $0.05$, we fail to reject the null hypothesis. There is no indication that the Binomial distribution is not appropriate.
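The whole Binomial example can be retraced in a short script. This is a sketch under the assumptions stated in the text (one estimated parameter, four cells, hence $df=2$); for $df=2$ the upper-tail probability of the $\chi^2$ distribution is $e^{-\chi^2/2}$, which is what the last step uses.

```python
from math import comb, exp

observed = [24, 74, 76, 26]   # families with 0, 1, 2, 3 boys
n_families = sum(observed)    # 200
trials = 3                    # children per family

# Estimate p from the sample mean: p = mean / trials.
xbar = sum(x * o for x, o in enumerate(observed)) / n_families
p_hat = xbar / trials

# Binomial probabilities and expected frequencies for x = 0, ..., 3.
probs = [comb(trials, k) * p_hat**k * (1 - p_hat) ** (trials - k)
         for k in range(trials + 1)]
expected = [n_families * p for p in probs]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 1    # k - m - 1 with m = 1 estimated parameter
p_value = exp(-chi2 / 2)      # chi-square upper tail, valid for df = 2 only

print(f"p_hat = {p_hat:.4f}, chi2 = {chi2:.5f}, df = {df}, p-value = {p_value:.4f}")
```

The observed and expected frequencies nearly coincide, so the statistic is tiny and the $p$-value is essentially 1.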
Consider again the number of aquatic invertebrates in a quadrat on the lake bottom. In the original table the count in the last category was less than 3, so we merged the last two categories. A standard computation yields
$$ \overline{X}=0.68; \quad s^2=0.795; \quad CD=1.17 $$
The coefficient of dispersion $CD$ is larger than 1, so we expect that the negative binomial distribution is a good fit for this data. To estimate the parameters of the negative binomial distribution, we use the formulas from the previous lecture. We estimate the parameters $p=\frac{\overline{X}}{s^2}=0.856$ and $r=\frac{\overline{X}^2}{s^2-\overline{X}}=4.04$. Next we compute the probabilities and the expected frequency for each category using Excel.
$$ \begin{array}{c|ccccc} x & 0 & 1 & 2 & 3 & 4\\ \hline \text{Observed freq.} & 213 & 128 & 37 & 18 & 4 \\ \hline P(x) & 0.533 & 0.310 & 0.113 & 0.033 & 0.008 \\ \hline \text{Predicted freq.} & 213.4 & 124.1 & 45.1 & 13.1 & 3.3 \end{array} $$
The value of the test statistic is
$$ \chi^2=\frac{(213-213.4)^2}{213.4}+\cdots+\frac{(4-3.3)^2}{3.3}=3.560 $$
The table has $k=5$ cells and we estimated $m=2$ parameters from the data, so $df=5-2-1=2$. The $p$-value for the test statistic is
$$ p\text{-value} =0.169; \quad 0.10 \leq p\text{-value} \leq 0.25 $$
The null hypothesis is that the Negative Binomial is an appropriate distribution. Since the $p$-value is more than $0.05$, we fail to reject the null hypothesis. We did not find sufficient evidence to claim that the Negative Binomial distribution is not an appropriate model for this data.
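As an alternative to Excel, the probabilities and the test statistic can be sketched in Python. The code assumes the parametrization in which $X$ counts failures before the $r^{\text{th}}$ success, so that $P(X=k)=\frac{\Gamma(k+r)}{k!\,\Gamma(r)}\,p^r(1-p)^k$, which is consistent with the moment estimates $p=\overline{X}/s^2$ and $r=\overline{X}^2/(s^2-\overline{X})$. The helper name `neg_binomial_pmf` is hypothetical, and the rounded sample statistics from the text are used as inputs, so the results may differ slightly in the last digit from the table.

```python
import math

observed = [213, 128, 37, 18, 4]   # counts for x = 0, 1, 2, 3, 4
n = sum(observed)                  # total number of quadrats
xbar, s2 = 0.68, 0.795             # rounded sample mean and variance from the text

# Method-of-moments estimates.
p_hat = xbar / s2
r_hat = xbar**2 / (s2 - xbar)

def neg_binomial_pmf(k, r, p):
    """P(X = k) failures before the r-th success; r may be non-integer."""
    log_coef = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coef + r * math.log(p) + k * math.log(1 - p))

probs = [neg_binomial_pmf(k, r_hat, p_hat) for k in range(len(observed))]
expected = [n * q for q in probs]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"p_hat = {p_hat:.3f}, r_hat = {r_hat:.2f}")
print("expected:", [round(e, 1) for e in expected])
print(f"chi2 = {chi2:.3f}")
```

Using `math.lgamma` avoids overflow and handles the non-integer estimate of $r$, which a factorial-based binomial coefficient could not.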