Introduction to Probability and Probability Distributions
Descriptive Statistics
Histograms and measures of central tendency and variation have a role to play, but we would like to go beyond describing the world: we need reasoning and decision making. Reasoning requires models, so the frequency distributions built from samples will be replaced by probability distributions that describe (model) a population.
With models we will be able to predict how likely events (or samples) are to occur. We will also be able to judge the truthfulness of statements about populations and plan ahead for interventions. Because measured values vary across individuals, the models have to be probabilistic. Probabilities are the language of uncertainty, and we will use them to quantify the confidence we have in statements and conclusions about populations. Probability theory is the foundation on which the whole structure of statistics stands.
People who use statistics can deploy a large number of probabilistic models. We will focus on three foundational probability distributions: Binomial, Poisson, and Normal. We will start our journey with simpler examples where the probabilities are easy to calculate and interpret.
Probability
A large apple farm in Saint-Joseph-du-Lac has $5\,000$ fruit-bearing apple trees. Of these, $3\,000$ produce red apples and $2\,000$ produce green apples. Among the red trees, $500$ produce soft apples; among the green trees, $1\,500$ produce soft apples. The other trees produce hard apples (red or green).
Suppose you buy the fruit from one tree chosen at random. What is the probability you will get a tree producing hard green apples?
It is useful to organize the information in a table. In the body of the table we will record the given values and in the right and bottom margins we will record the totals.
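$$ \begin{array}{l|cc|c} & \text { Soft } & \text { Hard } & \\ \hline \text { Red } & 500 & 2500 & 3000 \\ \text { Green } & 1500 & 500 & 2000 \\ \hline & 2000 & 3000 & 5000 \end{array}$$
The hard counts follow by subtraction: $3\,000-500=2\,500$ hard red trees and $2\,000-1\,500=500$ hard green trees.
$$P(HG)=\frac{500}{5000}=0.1 \quad \Rightarrow \quad 10\% $$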
What is the probability you will get a tree with soft apples?
$$P(S)=\frac{2000}{5000}=0.4 \quad \Rightarrow \quad 40\% $$
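The counting behind these two answers can be sketched in a few lines of Python. This is only an illustration: the dictionary `counts`, the helper `prob`, and the variable names are assumptions introduced here, not part of the original notes.

```python
from fractions import Fraction

# Tree counts from the apple farm example, keyed by (texture, color).
counts = {
    ("soft", "red"): 500,
    ("soft", "green"): 1500,
    ("hard", "red"): 3000 - 500,     # red trees that are not soft
    ("hard", "green"): 2000 - 1500,  # green trees that are not soft
}
total = sum(counts.values())  # 5000 trees in all


def prob(predicate):
    """Probability that a randomly chosen tree satisfies predicate(texture, color)."""
    favourable = sum(
        n for (texture, color), n in counts.items() if predicate(texture, color)
    )
    return Fraction(favourable, total)


p_hard_green = prob(lambda texture, color: texture == "hard" and color == "green")
p_soft = prob(lambda texture, color: texture == "soft")

print(float(p_hard_green))  # 0.1
print(float(p_soft))        # 0.4
```

Using `Fraction` keeps the ratios exact, so the results can be compared directly with the hand calculation.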
Selecting a random tree from this farm, when we care about color and texture but not about the particular tree selected, can result in one of four outcomes:
$$\lbrace HR,\, HG,\, SR,\, SG \rbrace $$
Here, $HR$ stands for Hard Red apples, etc.
The set of outcomes of an experiment is called a sample space. The individual outcomes are called sample points. An event is a subset (collection) of sample points.
In the apple example, at the most detailed level of probabilistic modelling, the sample points correspond to the $5\,000$ trees on the farm. However, if we care only about the color and the texture we will consider events such as $\lbrace HG\rbrace $ and $\lbrace S\rbrace $. Each of these events contains many sample points.
Here is an example with a much smaller number of sample points. Toss a loonie and a toonie. Check the side facing up on both coins. What is the sample space?
$$\lbrace HH,\, HT,\, TH,\, TT\rbrace $$
Here, $HH$ stands for the loonie showing Heads and the toonie also showing Heads, etc.
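The two-coin sample space can also be enumerated mechanically. The sketch below assumes Python's standard `itertools.product`. The event "at least one head" is a hypothetical example added here to show that an event is just a subset of the sample space.

```python
from itertools import product

sides = ["H", "T"]  # Heads, Tails

# First letter: loonie, second letter: toonie.
sample_space = ["".join(pair) for pair in product(sides, repeat=2)]
print(sample_space)  # ['HH', 'HT', 'TH', 'TT']

# An event is a subset of the sample space (illustrative event, not from the notes).
at_least_one_head = {point for point in sample_space if "H" in point}
print(sorted(at_least_one_head))  # ['HH', 'HT', 'TH']
```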
Draw a random card from a well shuffled deck.
How many sample points are there?$\quad 52$
Describe the sample points in the event $Jack$. $ \quad Jack=\lbrace \, JD,\, JC ,\, JH, \,JS\rbrace $
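As a quick check of the card example, the deck can be built in Python as rank and suit labels. The label scheme (`"JD"` for the Jack of Diamonds, and so on) follows the notation above; the variable names are assumptions made for this sketch.

```python
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["D", "C", "H", "S"]  # Diamonds, Clubs, Hearts, Spades

# Each sample point is one card, e.g. "JD" is the Jack of Diamonds.
deck = [rank + suit for rank, suit in product(ranks, suits)]
print(len(deck))  # 52

# The event Jack collects every sample point whose rank is J.
jack = {card for card in deck if card.startswith("J")}
print(sorted(jack))  # ['JC', 'JD', 'JH', 'JS']
```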
Boolean Algebra of Events
The set of sample points belonging to both event $A$ and event $B$ is the intersection of these two events, denoted by $A \cap B $.
Diagrams of this type are called Venn diagrams.
Soft: $\,S=\lbrace S \rbrace = \lbrace SR, SG\rbrace $
Red: $\,R=\lbrace R\rbrace =\lbrace HR, SR\rbrace $
Soft and Red: $S\cap R=\lbrace SR\rbrace$
$ \,\, F = $ Face cards (Jacks, Queens, Kings), $ \,\, H = $ Hearts
$$F\cap H =\lbrace JH,\, QH,\, KH \rbrace $$
The set of sample points belonging to event $A$, to event $B$, or to both is the union of these two events, denoted by $A \cup B $.
Soft: $\,\, S,\;$ Red: $\,\, R$
$$S\cup R=\lbrace SR,\, SG, \, HR\rbrace $$
Aces: $\,\, A,\;$ Hearts: $\,\, H$
$$A\cup H=\lbrace AH,\, AS,\, AC,\, AD,\, 2H,\, 3H,\dots,\, KH\rbrace $$
Notice that the Ace of Hearts appears only once in the union.
To recap, the union of two events contains all the sample points that belong to either of the two events. On the other hand the intersection of two events contains only the sample points that belong to both events.
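Python's built-in `set` type mirrors these operations directly: `&` is intersection and `|` is union. The sketch below rebuilds the deck with the same rank and suit labels used in the examples; the variable names are assumptions made for illustration.

```python
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["D", "C", "H", "S"]
deck = {rank + suit for rank, suit in product(ranks, suits)}

face_cards = {card for card in deck if card[:-1] in {"J", "Q", "K"}}  # event F
hearts = {card for card in deck if card.endswith("H")}                # event H
aces = {card for card in deck if card.startswith("A")}                # event A

# Intersection: sample points that belong to both events.
face_and_hearts = face_cards & hearts
print(sorted(face_and_hearts))  # ['JH', 'KH', 'QH']

# Union: sample points that belong to at least one of the events.
aces_or_hearts = aces | hearts
print(len(aces_or_hearts))  # 16, because the Ace of Hearts is counted once
```

Because sets never store duplicates, the Ace of Hearts appears only once in the union, which is why the count is $4+13-1=16$.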
The set of sample points that do not belong to event $A$ forms the complement of this event, denoted by $A^c $ or by $A' $.
$S^c=H,\; $ $R^c= G$
$(S\cap R)^c=\lbrace HG,\, HR,\, SG\rbrace $
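With the four apple outcomes as the sample space, a complement is a set difference from the whole space. This is a minimal sketch; the names `omega`, `soft`, and `red` are chosen here for illustration.

```python
# Sample space for the apple example: texture letter followed by color letter.
omega = {"HR", "HG", "SR", "SG"}

soft = {"SR", "SG"}  # event S
red = {"HR", "SR"}   # event R

# Complement of an event: every sample point of omega that is not in the event.
not_soft = omega - soft               # the hard outcomes
not_red = omega - red                 # the green outcomes
not_soft_red = omega - (soft & red)   # complement of "soft and red"

print(sorted(not_soft))      # ['HG', 'HR']
print(sorted(not_red))       # ['HG', 'SG']
print(sorted(not_soft_red))  # ['HG', 'HR', 'SG']
```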
Assigning Probabilities
Probabilities are initially assigned to the sample points by counting, empirically, or subjectively. The empirical approach is based on the relative frequency of the sample points in a large number of trials.
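The empirical approach can be illustrated with a small simulation: repeatedly pick a tree at random from the apple farm and record how often it is a soft one. This is a sketch under stated assumptions. The function name, the fixed seed, and the convention that trees numbered 0 to 1999 are the soft ones are all choices made here for illustration.

```python
import random


def relative_frequency_soft(trials, seed=0):
    """Fraction of simulated random picks that land on a soft-apple tree."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    soft_trees = 2000          # assumption: trees 0..1999 are the soft ones
    all_trees = 5000
    hits = sum(1 for _ in range(trials) if rng.randrange(all_trees) < soft_trees)
    return hits / trials


# The relative frequency should settle near the counting answer 2000/5000 = 0.4.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency_soft(n))
```

With only 100 trials the estimate can be noticeably off; with 100 000 trials it is typically within a fraction of a percentage point of $0.4$.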
Once the sample points have probabilities, the probabilities of events in the sample space are computed using the simple rules below. These rules are called Axioms and reflect the intuitive notion of probability.
1. For any event $A,\quad 0\leq P(A)\leq 1$
2. $P(S)=1, \; $ where $S \,$ is the whole sample space
3. If $A \cap B =\emptyset $ then $P(A\cup B)=P(A)+P(B)$
The rules below, which are useful in calculating probabilities of events, follow from the Axioms.
1. $P(A^c) = 1-P(A)$
2. $P(\emptyset)=0 \;$
3. $P(A\cup B)=P(A)+P(B)-P(A\cap B) $
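These three rules can be checked numerically on any finite sample space. The sketch below uses the apple farm outcomes with exact fractions; the dictionary `point_prob` and the function `P` are illustrative names, and a check on one example is of course not a proof.

```python
from fractions import Fraction

# Probabilities of the four sample points, obtained by counting trees.
point_prob = {
    "SR": Fraction(500, 5000),
    "HR": Fraction(2500, 5000),
    "SG": Fraction(1500, 5000),
    "HG": Fraction(500, 5000),
}
omega = set(point_prob)


def P(event):
    """Probability of an event: the sum of the probabilities of its sample points."""
    return sum((point_prob[point] for point in event), Fraction(0))


soft = {"SR", "SG"}
red = {"SR", "HR"}

# Rule 1: complement rule.
assert P(omega - red) == 1 - P(red)
# Rule 2: the empty event has probability zero.
assert P(set()) == 0
# Rule 3: addition rule for events that may overlap.
assert P(red | soft) == P(red) + P(soft) - P(red & soft)

print(P(red | soft))  # 9/10
```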
Apple Farm
$$ \begin{array}{l|cc|c} & \text { Soft } & \text { Hard } & \\ \hline \text { Red } & 500 & 2500 & 3000 \\ \text { Green } & 1500 & 500 & 2000 \\ \hline & 2000 & 3000 & 5000 \end{array}$$
$$\begin{aligned} P(SR) &=\frac{500}{5000}=0.1=10 \% \\ P(HR) &=\frac{2500}{5000}=0.5=50\% \\ P(SG)&=\frac{1500}{5000}=0.3=30 \% \\ P(HG)&=\frac{500}{5000}=0.1=10 \% \\ P(R)&=\frac{3000}{5000}=0.6=60 \% \end{aligned} $$
$$P(R)=P(SR)+P(HR) = 0.1+0.5=0.6 $$
$$P\left(R^{\prime}\right)=1-P(R) =1-0.6 =0.4$$
$$P\left(R^{\prime}\right)=P(G)=\frac{2000}{5000}=0.4 $$
$$\begin{aligned} P(R\cup S)&=P(SR)+P(HR)+P(SG)=0.1+0.5+0.3=0.9 \\ P(R \cup S)&=P(R)+P(S)-P(SR)=0.6+0.4-0.1=0.9\end{aligned} $$
Drawing a card
Here $H$ = Hearts, $S$ = Spades, and $F$ = Face cards.
$$\begin{aligned} & P(H)=\frac{13}{52}=\frac{1}{4} \\ &P\left(H^{\prime}\right)=1-P(H)=1-\frac{1}{4}=\frac{3}{4} \\& P(H \cap S)=P(\emptyset)=0 \\& P(H \cap F)=\frac{3}{52} \\& P(H \cup F)=P(H)+P(F)-P(H \cap F)=\frac{13}{52}+\frac{12}{52}-\frac{3}{52}=\frac{22}{52}=\frac{11}{26} \end{aligned}$$
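The card probabilities can be reproduced by counting sample points in a 52-card deck, where every card is equally likely. This is a sketch only: the deck labels follow the notation used earlier, and the helper `P` is a name introduced here.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["D", "C", "H", "S"]
deck = {rank + suit for rank, suit in product(ranks, suits)}

hearts = {card for card in deck if card.endswith("H")}
spades = {card for card in deck if card.endswith("S")}
face_cards = {card for card in deck if card[:-1] in {"J", "Q", "K"}}


def P(event):
    """Equally likely cards: probability = favourable cards / total cards."""
    return Fraction(len(event), len(deck))


print(P(hearts))               # 1/4
print(P(deck - hearts))        # 3/4
print(P(hearts & spades))      # 0
print(P(hearts & face_cards))  # 3/52
print(P(hearts | face_cards))  # 11/26
```

The last value agrees with the addition rule: $\frac{13}{52}+\frac{12}{52}-\frac{3}{52}=\frac{22}{52}$, which `Fraction` reduces to $\frac{11}{26}$.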