Basic Concepts of Probability

Read this section about basic concepts of probability, including sample spaces and events. This section discusses set operations using Venn diagrams, including complements, intersections, and unions. Finally, it introduces conditional probability and discusses independent events.


LEARNING OBJECTIVES

  1. To learn the concept of a conditional probability and how to compute it.
  2. To learn the concept of independence of events, and how to apply it.


Conditional Probability

Suppose a fair die has been rolled and you are asked to give the probability that it was a five. There are six equally likely outcomes, so your answer is 1 / 6. But suppose that before you give your answer you are given the extra information that the number rolled was odd. Since there are only three odd numbers that are possible, one of which is five, you would certainly revise your estimate of the likelihood that a five was rolled from 1 / 6 to 1 / 3. In general, the revised probability that an event A has occurred, taking into account the additional information that another event B has definitely occurred on this trial of the experiment, is called the conditional probability of A given B and is denoted by P(A \mid B). The reasoning employed in this example can be generalized to yield the computational formula in the following definition.


Definition

The conditional probability of A given B, denoted P(A \mid B), is the probability that event A has occurred in a trial of a random experiment for which it is known that event B has definitely occurred. It may be computed by means of the following formula:

Rule for Conditional Probability

\begin{align*}
P(A \mid B)=\frac{P(A \cap B)}{P(B)}
\end{align*}


EXAMPLE 20

A fair die is rolled.

a. Find the probability that the number rolled is a five, given that it is odd.

b. Find the probability that the number rolled is odd, given that it is a five.

Solution:

The sample space for this experiment is the set S=\{1,2,3,4,5,6\} consisting of six equally likely outcomes. Let F denote the event "a five is rolled" and let O denote the event "an odd number is rolled," so that

\begin{align*}
F=\{5\} \quad \text { and } \quad O=\{1,3,5\}
\end{align*}

a. This is the introductory example, so we already know that the answer is 1 / 3. To use the formula in the definition to confirm this we must replace A in the formula (the event whose likelihood we seek to estimate) by F and replace B (the event we know for certain has occurred) by O:

\begin{align*}
P(F \mid O)=\frac{P(F \cap O)}{P(O)}
\end{align*}

Since F \cap O=\{5\} \cap\{1,3,5\}=\{5\}, P(F \cap O)=1 / 6.

Since O=\{1,3,5\}, P(O)=3 / 6.

Thus

\begin{align*}
P(F \mid O)=\frac{P(F \cap O)}{P(O)}=\frac{1 / 6}{3 / 6}=\frac{1}{3}
\end{align*}

b. This is the same problem, but with the roles of F and O reversed. Since we are given that the number that was rolled is five, which is odd, the probability in question must be 1. To apply the formula to this case we must now replace A (the event whose likelihood we seek to estimate) by O and B (the event we know for certain has occurred) by F:

\begin{align*}
P(O \mid F)=\frac{P(O \cap F)}{P(F)}
\end{align*}

Obviously P(F)=1 / 6. In part (a) we found that P(F \cap O)=1 / 6. Thus

\begin{align*}
P(O \mid F)=\frac{P(O \cap F)}{P(F)}=\frac{1 / 6}{1 / 6}=1
\end{align*}
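
The computation in both parts can be checked with a short Python sketch. It assumes a finite sample space with equally likely outcomes, represented as a set; the helper names `prob` and `cond_prob` are illustrative and not part of the text.

```python
from fractions import Fraction

def prob(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def cond_prob(a, b, sample_space):
    """P(A | B) = P(A and B) / P(B); only defined when P(B) > 0."""
    p_b = prob(b, sample_space)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b, sample_space) / p_b

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a fair die
F = {5}                  # event "a five is rolled"
O = {1, 3, 5}            # event "an odd number is rolled"

p_f_given_o = cond_prob(F, O, S)   # part (a)
p_o_given_f = cond_prob(O, F, S)   # part (b)
print(p_f_given_o, p_o_given_f)
```

Exact fractions are used instead of floating-point numbers so that the results can be compared directly with 1 / 3 and 1.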

Just as we did not need the computational formula in this example, we do not need it when the information is presented in a two-way classification table, as in the next example.


EXAMPLE 21

In a sample of 902 individuals under 40 who were or had previously been married, each person was classified according to gender and age at first marriage. The results are summarized in the following two-way classification table, where the meaning of the labels is:

  • M: male
  • F: female
  • E: a teenager when first married
  • W: in one’s twenties when first married
  • H: in one’s thirties when first married

Gender    E     W     H   Total
M        43   293   114     450
F        82   299    71     452
Total   125   592   185     902

The numbers in the first row mean that 43 people in the sample were men who were first married in their teens, 293 were men who were first married in their twenties, 114 were men who were first married in their thirties, and a total of 450 people in the sample were men. The numbers in the second row have the same meaning for women. The numbers in the last row mean that, irrespective of gender, 125 people in the sample were first married in their teens, 592 in their twenties, 185 in their thirties, and that there were 902 people in the sample in all. Suppose that the proportions in the sample accurately reflect those in the population of all individuals who are under 40 and who are or have previously been married. Suppose such a person is selected at random.

a. Find the probability that the individual selected was a teenager at first marriage.

b. Find the probability that the individual selected was a teenager at first marriage, given that the person is male.

Solution:

It is natural to let E also denote the event that the person selected was a teenager at first marriage and to let M denote the event that the person selected is male.

a. According to the table the proportion of individuals in the sample who were in their teens at their first marriage is 125 / 902. This is the relative frequency of such people in the population, hence P(E)=125 / 902 \approx 0.139 or about 14%.

b. Since it is known that the person selected is male, all the females may be removed from consideration, so that only the row in the table corresponding to men in the sample applies:

Gender    E     W     H   Total
M        43   293   114     450

The proportion of males in the sample who were in their teens at their first marriage is 43 / 450. This is the relative frequency of such people in the population of males, hence P(E \mid M)=43 / 450 \approx 0.096 or about 10%.
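
As a cross-check, the following Python sketch recomputes both answers from the table counts and confirms that restricting attention to the row for men gives the same value as the formula P(E \cap M) / P(M). The dictionary layout and variable names are illustrative choices, not part of the text.

```python
from fractions import Fraction

# Counts from the two-way classification table (rows: gender, columns: age group).
counts = {
    "M": {"E": 43, "W": 293, "H": 114},
    "F": {"E": 82, "W": 299, "H": 71},
}

total = sum(sum(row.values()) for row in counts.values())   # everyone in the sample
male_total = sum(counts["M"].values())                       # row total for men
teen_total = sum(row["E"] for row in counts.values())        # column total for E

# Part (a): P(E), using the whole table.
p_e = Fraction(teen_total, total)

# Part (b), method 1: restrict attention to the row for men.
p_e_given_m_row = Fraction(counts["M"]["E"], male_total)

# Part (b), method 2: the formula P(E | M) = P(E and M) / P(M).
p_e_and_m = Fraction(counts["M"]["E"], total)
p_m = Fraction(male_total, total)
p_e_given_m_formula = p_e_and_m / p_m

print(float(p_e), float(p_e_given_m_row), float(p_e_given_m_formula))
```

The two methods agree because the common denominator 902 cancels in the quotient, which is why the formula is not needed when the full table is available.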

In the next example, the computational formula in the definition must be used.


EXAMPLE 22

Suppose that in an adult population the proportion of people who are both overweight and suffer hypertension is 0.09; the proportion of people who are not overweight but suffer hypertension is 0.11; the proportion of people who are overweight but do not suffer hypertension is 0.02; and the proportion of people who are neither overweight nor suffer hypertension is 0.78. An adult is randomly selected from this population.

a. Find the probability that the person selected suffers hypertension given that he is overweight.

b. Find the probability that the selected person suffers hypertension given that he is not overweight.

c. Compare the two probabilities just found to give an answer to the question as to whether overweight people tend to suffer from hypertension.

Solution:

Let H denote the event "the person selected suffers hypertension". Let O denote the event "the person selected is overweight". The probability information given in the problem may be organized into the following contingency table:

          O     O^c
H      0.09    0.11
H^c    0.02    0.78

a. Using the formula in the definition of conditional probability,

\begin{align*}
P(H \mid O)=\frac{P(H \cap O)}{P(O)}=\frac{0.09}{0.09+0.02} \approx 0.8182
\end{align*}

b. Using the formula in the definition of conditional probability,

\begin{align*}
P\left(H \mid O^{c}\right)=\frac{P\left(H \cap O^{c}\right)}{P\left(O^{c}\right)}=\frac{0.11}{0.11+0.78} \approx 0.1236
\end{align*}

c. P(H \mid O) \approx 0.8182 is over six times as large as P\left(H \mid O^{c}\right) \approx 0.1236, which indicates a much higher rate of hypertension among people who are overweight than among people who are not overweight. Note that a direct comparison of P(H \cap O)=0.09 and P\left(H \cap O^{c}\right)=0.11 does not answer the same question, because these joint probabilities also depend on the sizes of the two groups: P(O)=0.09+0.02=0.11, while P\left(O^{c}\right)=0.11+0.78=0.89.
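
The three parts of this example can be reproduced with a short Python sketch. It stores the four joint probabilities from the contingency table as exact fractions; the keys "H", "Hc", "O", and "Oc" are illustrative labels for H, H^c, O, and O^c.

```python
from fractions import Fraction

# Joint probabilities from the contingency table, keyed by (row, column).
joint = {
    ("H", "O"): Fraction("0.09"), ("H", "Oc"): Fraction("0.11"),
    ("Hc", "O"): Fraction("0.02"), ("Hc", "Oc"): Fraction("0.78"),
}

# Marginal probabilities of the conditioning events (column sums).
p_o = joint[("H", "O")] + joint[("Hc", "O")]       # P(O)
p_oc = joint[("H", "Oc")] + joint[("Hc", "Oc")]    # P(O^c)

# Parts (a) and (b): conditional probability = joint / marginal.
p_h_given_o = joint[("H", "O")] / p_o
p_h_given_oc = joint[("H", "Oc")] / p_oc

# Part (c): how many times larger the first conditional probability is.
ratio = p_h_given_o / p_h_given_oc

print(round(float(p_h_given_o), 4), round(float(p_h_given_oc), 4), float(ratio))
```

Working with fractions shows that the exact values are 9 / 11 and 11 / 89, so the four-decimal figures in the solution are rounded.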