Lecture 1

Data in Biology, Variables

Samples and populations

Definition 1

Individual observations are measurements taken on an individual.

Example 1

  • Record the weight, length, and age of a polar bear (these are individual observations).
  • Record the weight, length, and age of a hundred bears (this is a sample).
    Definition 2

    A sample of observations is a collection of individual observations selected by a specific procedure.

    Definition 3

    An actual property measured by an individual observation is called a variable. More than one variable could be measured in an individual.

    Example 2

    For a sample of 25 deer observe blood pH and erythrocyte count; this is a measurement of 2 variables on a sample of 25 individuals.

    Remark

    In biology, the term population refers to all individuals of a given species. In statistics, the term has a different meaning.

    Definition 4

    In statistics, a population is the totality of individual observations about which inferences are to be made.

    Example 3

  • Sample: 25 deer in the Laurentians.
  • Population: all deer in the Laurentians.
    Example 4

      Consider a sample of leucocyte counts for 12 male urban coyotes in the greater Montreal area. What would be the population intended for study?

  • Coyotes in Quebec
  • Male coyotes
  • Urban coyotes in Quebec
  • Male urban coyotes in the greater Montreal area
  • Eastern coyotes

    Answer: Male urban coyotes in the greater Montreal area. All the other options are too broad. For example, previous studies have shown that male coyotes have higher leucocyte counts than female coyotes, so the sample tells us nothing about females.

    Remark

    If we can get measurements of the relevant variables for all members of a population of interest, we have a total census.

    Example 5

    Can we get the weight of all deer in the Laurentians? That would be a census, but it is unlikely to be feasible.

    Measuring the weight of all whooping cranes in North America might be within reach. Currently there are approximately 800 individuals; in 1941 there were only 21.
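The distinction between a census and a sample can be sketched in a few lines of Python. This is only an illustration: the ID numbers, the sample size of 25, and the choice of simple random sampling as the "specific procedure" are all assumptions, not part of the lecture.

```python
import random

# Hypothetical sampling frame: ID numbers for roughly 800 whooping cranes
# (the approximate count quoted above). No measurements are simulated.
population_ids = list(range(1, 801))

# A census records every individual in the population.
census_ids = population_ids

# A sample is selected by a specific procedure; here, simple random sampling
# without replacement (one possible procedure, chosen for illustration).
rng = random.Random(1)  # fixed seed so the selection is reproducible
sample_ids = rng.sample(population_ids, k=25)

print(f"census size: {len(census_ids)}")
print(f"sample size: {len(sample_ids)}")
print(f"sampling fraction: {len(sample_ids) / len(census_ids):.3f}")
```

Any inference about the population would then be based on measurements taken only on the individuals whose IDs appear in `sample_ids`.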

    Variables in Biology

    Remark

    In practice, variables considered in statistics are additionally required to capture a property with respect to which the individuals in the population under study differ in some assessable way. If a property does not differ among individuals in the population then it is not of statistical interest (and it is not considered a variable).

    Example 6

    All deer in Lapland have red noses (just like Rudolph). The color of the nose is not a variable in this population.

    Example 7

    In a population of mammals warm-bloodedness is not a variable, but body temperature is.

    Variables are classified as being qualitative or quantitative.


    A qualitative variable is a variable that is not numerical. Examples include gender (M or F), color, survival (dead or alive). Measurements of qualitative variables are usually combined into frequency tables.

    Example 8

    Frequency table for color in squirrels:

    $$ \begin{array}{c|c} \text { Color } & \text { Frequency } \\ \hline \text { Black } & 14 \\ \text { Grey } & 66 \\ \hline \text { Total } & 80\end{array} $$
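A frequency table like the one above can be built by counting category labels. The sketch below assumes the raw data are a list of color labels, one per squirrel. The list is reconstructed from the table's counts rather than taken from real field records.

```python
from collections import Counter

# Hypothetical raw observations, constructed to match the table above:
# 14 black squirrels and 66 grey squirrels.
observations = ["Black"] * 14 + ["Grey"] * 66

# Count how often each category occurs.
frequencies = Counter(observations)
total = sum(frequencies.values())

# Print the counts as a simple two-column table.
print(f"{'Color':<8}{'Frequency':>10}")
for color, count in frequencies.items():
    print(f"{color:<8}{count:>10}")
print(f"{'Total':<8}{total:>10}")
```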

    Quantitative variables are numerical, and they could be continuous or discrete.


    Continuous variables can, in principle, take any value in an interval. In practice, the recorded values are limited by the precision of the measurement.


    Discrete variables only have discrete values with no intermediate values possible.

    Example 9

    The weight of a polar bear is a continuous variable. Its values lie in the interval $[0.25, 1010]\,\text{kg}$: the smallest newborn cubs weigh about $250\,\text{g}$, and the largest polar bear ever recorded weighed $1010\,\text{kg}$. Any value in between is possible.

    Example 10

    Number of offspring of a female deer (doe) is a discrete variable:

    $$ 0,\ 1,\ 2,\ \dots $$

    This variable cannot have the value 6.35.

    Variables can also be described in terms of the scales of measurement. Different statistical methods are needed for different types of variables.

  • Nominal scale
  • Ordinal scale
  • Interval scale
  • Ratio scale
    Definition 5

    Variables are recorded on a nominal scale if they fall into discrete categories without any implied ordering.

    Example 11

  • Color: White, Grey, Black
  • Leaf Shape: Lanceolate, Ovate, Palmate, etc.

    Definition 6

    Variables are recorded on an ordinal scale if they have discrete (categorical) values that are ranked.

    Example 12

  • Parity: First born, Second born, ...
  • Structure: Not developed, poorly developed, well developed, hyper developed
    Remark

    For variables on an ordinal scale, expressing the values as a series of ranks such as 1, 2, 3, 4, ... does not imply that the difference between 1 and 2 is the same as the difference between 3 and 4.
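As a minimal sketch of what ranks do and do not allow, the Python below orders the "Structure" categories from the example and uses the ranks only for sorting and for picking a middle value. The list of observations is hypothetical and serves purely as an illustration.

```python
# Ordered levels from the "Structure" example, lowest to highest.
LEVELS = ["not developed", "poorly developed", "well developed", "hyper developed"]

# Rank of each level (1 = lowest). Ranks encode order only, not distance.
RANK = {level: i + 1 for i, level in enumerate(LEVELS)}

# Hypothetical observations, for illustration only.
observed = [
    "well developed",
    "not developed",
    "hyper developed",
    "poorly developed",
    "well developed",
]

# Sorting and comparing by rank is meaningful on an ordinal scale.
ordered = sorted(observed, key=RANK.get)

# So is the middle observation (the median level) for an odd-sized sample.
median_level = ordered[len(ordered) // 2]

print(ordered)
print("median level:", median_level)
```

The sketch deliberately avoids averaging the ranks or subtracting them, since those operations would treat the gaps between neighbouring levels as equal.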

    Definition 7

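    An interval scale is a numeric scale on which we know both the order of values and the exact difference between them. The zero point of an interval scale is arbitrary, so ratios of values are not meaningful.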

    Example 13

    Temperature: The difference between $55^{\circ}C$ and $60^{\circ}C$ is the same as the difference between $162^{\circ}C$ and $167^{\circ}C$.

    IQ scores: The difference between 92 and 96 is the same as the difference between 114 and 118.
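One way to see why differences are meaningful on an interval scale while ratios are not is to change units. The sketch below takes the temperature pairs from the example and converts them to Fahrenheit. The conversion to Fahrenheit is an added illustration, not part of the lecture.

```python
import math


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# The two pairs of temperatures from the example, in degrees Celsius.
low_a, high_a = 55.0, 60.0
low_b, high_b = 162.0, 167.0

# Differences: equal in Celsius, and still equal after changing units.
diff_a_f = celsius_to_fahrenheit(high_a) - celsius_to_fahrenheit(low_a)
diff_b_f = celsius_to_fahrenheit(high_b) - celsius_to_fahrenheit(low_b)
differences_agree = math.isclose(diff_a_f, diff_b_f)

# Ratios: the value depends on the unit, because 0 degrees is an arbitrary point.
ratio_c = high_a / low_a
ratio_f = celsius_to_fahrenheit(high_a) / celsius_to_fahrenheit(low_a)
ratios_agree = math.isclose(ratio_c, ratio_f)

print("equal differences preserved:", differences_agree)
print("ratio preserved:", ratios_agree)
```

The two differences stay equal to each other in either unit, whereas the ratio of the same two temperatures changes with the unit, so a statement such as "60 degrees is about 9% hotter than 55 degrees" has no unit-independent meaning.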

    Definition 8

    The ratio scale is the ultimate nirvana of measurement: it tells us about order and the exact difference between values, it has a meaningful zero, and ratios of values are meaningful.

    Example 14

  • Age: it makes sense to say the silverback is twice the age of the young pretender
  • Weight: it makes sense to say that a mama bear is five times heavier than her cub
    Example 15

    Non-example

    A pH of 3 is not twice as acidic as a pH of 6, because pH is a logarithmic measure of hydrogen ion concentration. pH is therefore not a ratio variable.
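The non-example can be made concrete with a short calculation. It assumes the standard definition of pH as the negative base-10 logarithm of the hydrogen ion concentration in mol/L, which these notes do not state:

$$ \text{pH} = -\log_{10}[\mathrm{H^+}] \quad\Longrightarrow\quad \frac{[\mathrm{H^+}]_{\text{pH}=3}}{[\mathrm{H^+}]_{\text{pH}=6}} = \frac{10^{-3}}{10^{-6}} = 10^{3} = 1000 $$

Under that definition, the ratio of the two pH values ($6/3 = 2$) says nothing about the ratio of the underlying concentrations, which differ by a factor of 1000.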

    Example 16

    Your bank account balance is an example of ratio-scale data. Both differences and ratios of balances have meaning. And a balance of 0 is meaningful. Although you can have a negative or positive account balance, there is a definite and nonarbitrary meaning of an account balance of 0.