Lecture 22

CIs and Hypothesis Testing for the Variance

Confidence intervals for the variance and the standard deviation

Remark

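For a sample of size $n$ from a normal population with population standard deviation $\sigma$, the statistic $(n-1) s^2/ \sigma^2$, where $s^2$ is the sample variance, has a $\chi^2$-distribution with $n-1$ degrees of freedom. To build a confidence interval for the population variance, we use the fact that, with probability $1-\alpha$, this statistic falls between the two critical values that cut off an area of $\alpha/2$ in each tail of the $\chi^2$-distribution.

[Figure: density of the $\chi^2$-distribution with the critical values $\chi^2_{1-\alpha/2}$ (left) and $\chi^2_{\alpha/2}$ (right) marked.]

Here $\chi^2_p$ denotes the value with area $p$ to its right, so $\chi^2_{1-\alpha/2} < \chi^2_{\alpha/2}$ and
$$ P\left(\chi^2_{1-\alpha/2} < \frac{(n-1)\ s^2}{\sigma^2} < \chi^2_{\alpha/2}\right) = 1-\alpha $$
Cross-multiplying, we get the following formula.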
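
The cross-multiplication step can be written out as follows. As a sketch, write $a < b$ for the lower and upper critical values (these letters are shorthand introduced here, not the lecture's notation). All quantities are positive, so taking reciprocals reverses the inequalities:
$$ a < \frac{(n-1)s^2}{\sigma^2} < b \;\iff\; \frac{1}{b} < \frac{\sigma^2}{(n-1)s^2} < \frac{1}{a} \;\iff\; \frac{(n-1)s^2}{b} < \sigma^2 < \frac{(n-1)s^2}{a} $$
The larger critical value therefore gives the lower endpoint of the interval, and the smaller critical value gives the upper endpoint.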

Formula

A $100(1-\alpha)$% confidence interval for the population variance $\sigma^2$ is given by
$$ \frac{(n-1)s^2}{\chi^2_{\alpha/2}} \leq \sigma^2 \leq \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} $$
where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are the critical values of the $\chi^2$-distribution with $n-1$ degrees of freedom that have areas $\alpha/2$ and $1-\alpha/2$ to their right, respectively (so $\chi^2_{\alpha/2}$ is the larger of the two). This confidence interval is valid for samples from a normal population.

Example 1

Consider the following data of $11$ measurements of pH of rain in Michigan:
$$\begin{array}{llllll} 5.47 & 5.37 & 5.38 & 4.63 & 5.37 & 3.74 \\ 3.71 & 4.96 & 4.64 & 5.11 & 5.65 \end{array} $$
Assuming that the pH of rain in Michigan is normally distributed, find a $95\%$ confidence interval for the population standard deviation.

Start by calculating the sample variance $s^2$ and the critical values of the $\chi^2$-distribution with $n-1=10$ degrees of freedom: $s^2 = 0.4508$, $\chi^2_{0.975}=3.25$ and $\chi^2_{0.025}=20.48$ (the subscript is the area to the right of the critical value).
$$\frac{10\times 0.4508}{20.48} \leq \sigma^2 \leq \frac{10\times 0.4508}{3.25}$$
$$0.2201 \leq \sigma^2 \leq 1.3871$$
To compute the confidence interval for the standard deviation, we take the square roots:
$$0.4691 \leq \sigma \leq 1.1777 \quad \text{with} \; 95\% \; \text{confidence.}$$
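
The arithmetic in Example 1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the helper name `variance_ci` is hypothetical, and the critical values $3.25$ and $20.48$ are copied from the example rather than computed.

```python
import math
import statistics

# pH measurements from Example 1
ph = [5.47, 5.37, 5.38, 4.63, 5.37, 3.74, 3.71, 4.96, 4.64, 5.11, 5.65]

def variance_ci(s2, n, chi2_small, chi2_large):
    """Confidence interval for sigma^2 from the sample variance s2.

    chi2_small and chi2_large are the smaller and larger critical values
    of the chi-square distribution with n - 1 degrees of freedom
    (supplied by the caller, e.g. from a table).
    """
    lower = (n - 1) * s2 / chi2_large   # larger critical value -> lower endpoint
    upper = (n - 1) * s2 / chi2_small   # smaller critical value -> upper endpoint
    return lower, upper

n = len(ph)
s2 = statistics.variance(ph)            # sample variance (divides by n - 1)
var_lo, var_hi = variance_ci(s2, n, chi2_small=3.25, chi2_large=20.48)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"s^2 = {s2:.4f}")
print(f"95% CI for sigma^2: ({var_lo:.4f}, {var_hi:.4f})")
print(f"95% CI for sigma:   ({sd_lo:.4f}, {sd_hi:.4f})")
```

Passing the critical values in as arguments keeps the sketch free of third-party dependencies; a statistics library could supply them instead.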

Hypothesis tests for the variance and the standard deviation

Remark

For a sample from a normal population, if the null hypothesis $\sigma^2 = \sigma^2_0$ is true, the test statistic $\frac{(n-1)s^2}{\sigma^2_0}$ follows a $\chi^2$-distribution with $n-1$ degrees of freedom. We can use this distribution to test hypotheses about the population variance (and standard deviation). The null hypothesis is rejected if the test statistic falls within the rejection region.

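Right-tailed test: $H_0: \sigma^2 = \sigma^2_0$ vs. $H_1: \sigma^2 > \sigma^2_0$

[Figure: $\chi^2$ density with the rejection region to the right of $\chi^2_{\alpha}$.]

Left-tailed test: $H_0: \sigma^2 = \sigma^2_0$ vs. $H_1: \sigma^2 < \sigma^2_0$

[Figure: $\chi^2$ density with the rejection region to the left of $\chi^2_{1-\alpha}$.]

Two-tailed test: $H_0: \sigma^2 = \sigma^2_0$ vs. $H_1: \sigma^2 \neq \sigma^2_0$

[Figure: $\chi^2$ density with rejection regions to the left of $\chi^2_{1-\alpha/2}$ and to the right of $\chi^2_{\alpha/2}$.]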

Example 2

An automatic filling machine fills bottles with beer. A random sample of $20$ bottles has a sample variance of $2.957 \; ml^2$. If the machine fill variance is greater than $2 \; ml^2$, the machine is considered out of control. Test the hypothesis that the machine is out of control at the $5\%$ level of significance.

The null and alternative hypotheses are:
$$ H_0: \sigma^2 = 2 \; ml^2 \quad \text{vs.} \quad H_1: \sigma^2 > 2 \; ml^2 $$
The test statistic is:
$$ \frac{(n-1)s^2}{\sigma^2_0} = \frac{19 \times 2.957}{2} = 28.09, \; df = n-1 = 19 $$
The $p$-value is $P(\chi^2_{19} > 28.09) = 0.0817$, which is greater than $0.05$. Therefore, we fail to reject the null hypothesis. This sample does not provide enough evidence to conclude that the machine is out of control.
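
The same decision can be reached with the rejection region instead of the $p$-value. As a check, take the critical value with area $0.05$ to its right for $19$ degrees of freedom, which is approximately $30.14$ (a standard table value, not stated in the lecture):
$$ \frac{(n-1)s^2}{\sigma^2_0} = 28.09 < 30.14 \approx \chi^2_{0.05} $$
The test statistic does not fall in the right-tail rejection region, so we fail to reject $H_0$, in agreement with the $p$-value approach.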

Example 3

The sugar content of the syrup in canned peaches is normally distributed, and the manufacturer claims the standard deviation is $30 \; mg$. A sample of a dozen cans gives a sample standard deviation of $78 \; mg$. Test $H_0: \sigma = 30 $ versus $H_1: \sigma > 30 $ at the $5\%$ level of significance.

The test statistic is:
$$ \frac{(n-1)s^2}{\sigma^2_0} = \frac{11 \times 78^2}{30^2} = 74.36, \; df = n-1 = 11 $$
The $p$-value is $P(\chi^2_{11} > 74.36) = 1.8\times 10^{-11}$, which is far below $0.05$. Therefore, we (strongly) reject the null hypothesis. The sample provides strong evidence that the standard deviation of the sugar content in the syrup is greater than $30 \; mg$.
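
The right-tail $p$-values in Examples 2 and 3 can be reproduced numerically. The sketch below is one possible approach using only the Python standard library, not necessarily how the lecture's values were obtained. The function name `chi2_sf` is hypothetical; it assumes a positive integer number of degrees of freedom and uses the recurrence $Q(a+1, y) = Q(a, y) + y^a e^{-y}/\Gamma(a+1)$ for the regularized upper incomplete gamma function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df > x) for a positive integer df."""
    if x <= 0.0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y) = erfc(sqrt(y))
    # Step a up to df / 2 with Q(a + 1, y) = Q(a, y) + y^a exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

# Example 2: n = 20, s^2 = 2.957, sigma_0^2 = 2
stat2 = (20 - 1) * 2.957 / 2.0
p2 = chi2_sf(stat2, 19)

# Example 3: n = 12, s = 78, sigma_0 = 30
stat3 = (12 - 1) * 78.0**2 / 30.0**2
p3 = chi2_sf(stat3, 11)

print(f"Example 2: statistic = {stat2:.2f}, p-value = {p2:.4f}")
print(f"Example 3: statistic = {stat3:.2f}, p-value = {p3:.2e}")
```

With a statistics library available, its chi-square survival function would be the more usual choice; the hand-written version here only avoids the extra dependency.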

Testing for the ratio of two variances

Remark

To test a hypothesis about the ratio of two population variances, we use the $F$-distribution. The two samples are assumed to be independent and drawn from normally distributed populations. Under the null hypothesis that the two population variances are equal, the test statistic is the ratio of the sample variances $\frac{s_1^2}{s_2^2}$, which follows an $F$-distribution with $n_1-1$ and $n_2-1$ degrees of freedom. The null hypothesis is rejected if the test statistic falls within the rejection region.

Example 4

The levels of a certain hormone in two populations, one of young individuals and the other of old individuals, are normally distributed. A sample of $16$ young individuals has a sample standard deviation of $1.56$ and a sample of $19$ old individuals has a sample standard deviation of $2.83$. Test $H_0: \sigma_1^2 = \sigma_2^2 $ versus $H_1: \sigma_1^2 < \sigma_2^2 $ at the $5\%$ level of significance.

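The test statistic is:
$$ f = \frac{s_1^2}{s_2^2} = \frac{1.56^2}{2.83^2} = 0.3039, \; df = (15, 18) $$
Since the test is left-tailed, the $p$-value is the area to the left of $0.3039$ under the $F$-distribution with $(15, 18)$ degrees of freedom: $p$-value $= 0.012$, which is less than $0.05$. We reject the null hypothesis. The sample provides evidence that the variance of the hormone levels in young individuals is less than that of old individuals.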
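
The left-tail $p$-value in Example 4 can be approximated by integrating the $F$ density numerically. The sketch below is an illustration under stated assumptions, not the lecture's method: the names `f_pdf` and `f_cdf` are hypothetical, the integration uses composite Simpson's rule, and the approach assumes the first degrees of freedom exceed $2$ so that the density is zero at the origin. Both $1.56$ and $2.83$ are treated as sample standard deviations, as the test statistic above does.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F-distribution with (d1, d2) degrees of freedom."""
    if x <= 0.0:
        return 0.0  # correct at x = 0 only when d1 > 2 (assumed here)
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = (0.5 * d1 * math.log(d1 / d2)
               + (0.5 * d1 - 1.0) * math.log(x)
               - 0.5 * (d1 + d2) * math.log1p(d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def f_cdf(x, d1, d2, steps=2000):
    """P(F <= x) by composite Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2      # Simpson weights 4, 2, 4, ..., 4
        total += weight * f_pdf(i * h, d1, d2)
    return total * h / 3.0

# Example 4: n1 = 16, s1 = 1.56 (young); n2 = 19, s2 = 2.83 (old)
f_stat = 1.56**2 / 2.83**2
p_value = f_cdf(f_stat, 16 - 1, 19 - 1)   # left-tailed test

print(f"f = {f_stat:.4f}, p-value = {p_value:.3f}")
```

Numerical integration is chosen only to keep the example dependency-free; a statistics library's $F$ cumulative distribution function would normally be used instead.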