Lecture 6

Introduction to Probability and Probability Distributions

Descriptive Statistics

Remark

Histograms and measures of central tendency and variation have a role to play, but we would like to go beyond merely describing the world: we need to reason and make decisions. Reasoning requires models, so the frequency distributions built from samples will be replaced by probability distributions describing (modelling) a population.

With models we will be able to predict how likely events (or samples) are to occur. We will also be able to judge the truthfulness of statements about populations and to plan interventions. Because of the variability of measured values across individuals, the models have to be probabilistic. Probabilities are the language of uncertainty, and we will use them to quantify the levels of confidence we have in statements and conclusions about populations. Probability theory is the foundation on which the whole structure of statistics stands.

Remark

There are many probabilistic models that users of statistics can deploy. We will focus on three foundational probability distributions: the Binomial, the Poisson, and the Normal. We will start our journey with simpler examples where the probabilities are easy to calculate and interpret.

Probability

Example 1

A large apple farm in Saint-Joseph-du-Lac has $5\,000$ fruit bearing apple trees. $3\,000$ of the trees produce red apples and $2\,000$ produce green apples. $500$ of the trees produce soft red apples and $1\,500$ trees produce soft green apples. The other trees produce hard apples (red or green).

Say, you buy the fruit from one tree chosen at random. What is the probability you will get a tree producing hard green apples?

It is useful to organize the information in a table. In the body of the table we will record the given values and in the right and bottom margins we will record the totals.

$$ \begin{array}{l|cc|c} & \text { Soft } & \text { Hard } & \\ \hline \text { Red } & 500 & 2500 & 3000 \\ \text { Green } & 1500 & 500 & 2000 \\ \hline & 2000 & 3000 & 5000 \end{array}$$

Now we can answer the question. The probability of getting a hard green apple is given by the ratio of the number of trees producing hard green apples to the total number of trees.

$$P(HG)=\frac{500}{5000}=0.1 \quad \Rightarrow \quad 10\% $$

Here and in the rest of the course we will use the notation $P(A)$ to denote the probability of event $A$.
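The table lookup can be sketched in a few lines of Python; this is a minimal sketch assuming the counts from the table above (the dictionary layout is just one convenient encoding):

```python
# Two-way table of tree counts from Example 1, keyed by (colour, texture).
counts = {
    ("red", "soft"): 500, ("red", "hard"): 2500,
    ("green", "soft"): 1500, ("green", "hard"): 500,
}

total = sum(counts.values())                     # 5000 trees in all
p_hard_green = counts[("green", "hard")] / total
print(p_hard_green)                              # 0.1
```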

Example 2

Continued from above

What is the probability you will get a tree with soft apples?

$$P(S)=\frac{2000}{5000}=0.4 \quad \Rightarrow \quad 40\% $$

Example 3

Continued from above

Selecting a random tree from this farm (and we care about both color and texture, but not about the particular tree selected) could result in one of the following four outcomes:

$$\lbrace HR,\, HG,\, SR,\, SG \rbrace $$

Here, $HR$ stands for Hard Red apples, etc.

Definition 1

The set of outcomes of an experiment is called a sample space. The individual outcomes are called sample points. An event is a subset (collection) of sample points.

Example 4

Continued from above

In the apple example, at the most detailed level of probabilistic modelling, the sample points correspond to the $5\,000$ trees on the farm. However, if we care only about the color and the texture we will consider events such as $\lbrace HG\rbrace $ and $\lbrace S\rbrace $. Each of these events contains many sample points.

Example 5

Here is an example with a much smaller number of sample points. Toss a loonie and a toonie and check the side facing up on each coin. What is the sample space?

$$\lbrace HH,\, HT,\, TH,\, TT\rbrace $$

Here, $HH$ stands for the loonie showing Heads and the toonie also showing Heads, etc.
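A sample space like this can be enumerated mechanically; a small sketch using Python's standard library (the single-letter labels are just a convention):

```python
from itertools import product

# Sample space for tossing two distinguishable coins
# (loonie first, toonie second), as in the two-coin example.
sample_space = [a + b for a, b in product("HT", repeat=2)]
print(sample_space)                    # ['HH', 'HT', 'TH', 'TT']
```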

Example 6

Draw a random card from a well shuffled deck.

How many sample points are there?$\quad 52$
Describe the sample points in the event $Jack$. $ \quad Jack=\lbrace \, JD,\, JC ,\, JH, \,JS\rbrace $
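The same enumeration idea scales to the deck of cards; a sketch assuming rank-plus-suit string labels (one possible encoding):

```python
from itertools import product

# Enumerate the 52-card sample space as rank+suit strings
# (suit letters D, C, H, S as in the Jack example).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [r + s for r, s in product(ranks, "DCHS")]
print(len(deck))                       # 52

# The event "Jack" collects the four sample points whose rank is J.
jack = sorted(c for c in deck if c[0] == "J")
print(jack)                            # ['JC', 'JD', 'JH', 'JS']
```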

Boolean Algebra of Events

Definition 2

The set of sample points belonging to both event $A$ and event $B$ is the intersection of these two events, denoted by $A \cap B $.

[Venn diagram: two overlapping circles $A$ and $B$; the overlap is $A \cap B$.]

Diagrams of this type are called Venn diagrams.

Example 7

Soft: $\,S = \lbrace SR,\, SG\rbrace $
Red: $\,R =\lbrace HR,\, SR\rbrace $
Soft and Red: $S\cap R=\lbrace SR\rbrace$

Example 8

$ \,\, F = $ Face cards (Jacks, Queens, Kings), $ \,\, H = $ Hearts
$$F\cap H =\lbrace JH,\, QH,\, KH \rbrace $$

Definition 3

The set of sample points belonging to either event $A$ or event $B$ is the union of these two events, denoted by $A \cup B $.

[Venn diagram: two overlapping circles $A$ and $B$; the shaded region is $A \cup B$.]

Example 9

Soft: $\,S$, $\quad$ Red: $\,R$

$$S\cup R=\lbrace SR,\, SG, \, HR\rbrace $$

Example 10

Aces: $\,A$, $\quad$ Hearts: $\,H$

$$A\cup H=\lbrace AH,\, AS,\, AC,\, AD,\, 2H,\, 3H,\dots,\, KH\rbrace $$

Notice that the Ace of Hearts appears only once in the union.

Remark

To recap, the union of two events contains all the sample points that belong to either of the two events. On the other hand the intersection of two events contains only the sample points that belong to both events.

Definition 4

The set of sample points which do not belong to event $A$ forms the complement of this event, denoted by $A^c $ or by $A' $.

[Venn diagram: circle $A$ inside the sample space; the region outside $A$ is $A'$.]

Example 11

$S^c=H,\quad R^c= G$
$\lbrace SR\rbrace^c=\lbrace HG,\, HR,\, SG\rbrace $
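The event algebra above maps directly onto Python's built-in sets; a small sketch using the apple-farm sample points (labels as in Example 3), where set difference against the sample space gives the complement:

```python
# Sample points for the apple example: texture-colour labels HR, HG, SR, SG.
space = {"HR", "HG", "SR", "SG"}
soft = {"SR", "SG"}                # event S
red = {"HR", "SR"}                 # event R

print(sorted(soft & red))          # intersection S ∩ R: ['SR']
print(sorted(soft | red))          # union S ∪ R: ['HR', 'SG', 'SR']
print(sorted(space - soft))        # complement S' (= Hard): ['HG', 'HR']
print(sorted(space - {"SR"}))      # complement of {SR}: ['HG', 'HR', 'SG']
```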

Assigning Probabilities

Remark

Probabilities are initially assigned to the sample points by counting, empirically, or subjectively. The empirical approach is based on the relative frequency of the sample points over a large number of trials.

After that, the probabilities of events in the sample space are computed using the simple rules below. These rules are called Axioms and reflect the intuitive notion of probability.

Axioms

1. For any event $A,\quad 0\leq P(A)\leq 1$

2. $P(S)=1, \; $ where $S \,$ is the whole sample space

3. If $A \cap B =\emptyset $ then $P(A\cup B)=P(A)+P(B)$

Remark

The symbol $\emptyset$ denotes the empty event, i.e. the event that contains no sample points.

Remark

The rules below, which are useful in calculating probabilities of events, follow from the Axioms.

Theorem

1. $P(A^c) = 1-P(A)$

2. $P(\emptyset)=0 \;$

3. $P(A\cup B)=P(A)+P(B)-P(A\cap B) $

Example 12

Apple Farm

$$ \begin{array}{l|cc|c} & \text { Soft } & \text { Hard } & \\ \hline \text { Red } & 500 & 2500 & 3000 \\ \text { Green } & 1500 & 500 & 2000 \\ \hline & 2000 & 3000 & 5000 \end{array}$$

$$\begin{aligned} P(SR) &=\frac{500}{5000}=0.1=10 \% \\ P(HR) &=\frac{2500}{5000}=0.5=50\% \\ P(SG)&=\frac{1500}{5000}=0.3=30 \% \\ P(HG)&=\frac{500}{5000}=0.1=10 \% \\ P(R)&=\frac{3000}{5000}=0.6=60 \% \end{aligned} $$

$$P(R)=P(SR)+P(HR) = 0.1+0.5=0.6 $$

$$P\left(R^{\prime}\right)=1-P(R) =1-0.6 =0.4$$

$$P\left(R^{\prime}\right)=P(G)=\frac{2000}{5000}=0.4 $$

$$\begin{aligned} P(R\cup S)&=P(SR)+P(HR)+P(SG)=0.1+0.5+0.3=0.9 \\ P(R \cup S)&=P(R)+P(S)-P(SR)=0.6+0.4-0.1=0.9\end{aligned} $$
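The complement rule and inclusion-exclusion used in Example 12 can be checked numerically; a sketch assuming the tree counts from the table:

```python
# Probabilities of the four sample points from the apple-farm table.
counts = {"SR": 500, "HR": 2500, "SG": 1500, "HG": 500}
total = sum(counts.values())
P = {k: v / total for k, v in counts.items()}

p_red = P["SR"] + P["HR"]                    # P(R) = 0.6
p_soft = P["SR"] + P["SG"]                   # P(S) = 0.4

# Complement rule: P(R') = 1 - P(R)
print(round(1 - p_red, 10))                  # 0.4

# Inclusion-exclusion: P(R ∪ S) = P(R) + P(S) - P(R ∩ S)
print(round(p_red + p_soft - P["SR"], 10))   # 0.9
```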

Example 13

Drawing a card: $\, H =$ Hearts, $\, S =$ Spades, $\, F =$ Face cards

$\begin{aligned} & P(H)=\frac{13}{52}=\frac{1}{4}, \\ &P\left(H^{\prime}\right)=1-P(H)=1-\frac{1}{4}=\frac{3}{4} \\& P(H \cap S)=P(\emptyset)=0 \\& P(H \cap F)=\frac{3}{52} \\& P(H \cup F)=P(H)+P(F)-P(H \cap F)=\frac{13}{52}+\frac{12}{52}-\frac{3}{52}=\frac{22}{52}=\frac{11}{26} \end{aligned}$
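Example 13 can also be verified by direct counting over an enumerated deck; a sketch with exact fractions (the rank-and-suit tuples are just one encoding):

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(r, s) for r, s in product(ranks, "DCHS")]    # 52 cards

hearts = {c for c in deck if c[1] == "H"}             # event H
faces = {c for c in deck if c[0] in ("J", "Q", "K")}  # event F

print(Fraction(len(hearts), 52))            # P(H) = 1/4
print(Fraction(len(hearts & faces), 52))    # P(H ∩ F) = 3/52
print(Fraction(len(hearts | faces), 52))    # P(H ∪ F) = 11/26
```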