Conditional Probability

Use in inference

In statistical inference, the conditional probability is an update of the probability of an event based on new information. The new information can be incorporated as follows:

  • Let A, the event of interest, be in the sample space, say (X, P).
  • The occurrence of the event A, knowing that event B has or will have occurred, means the occurrence of A restricted to B, i.e. A \cap B.
  • Without the knowledge of the occurrence of B, the information about the occurrence of A would simply be P(A).
  • The probability of A knowing that event B has or will have occurred will be the probability of A \cap B relative to P(B), the probability that B has occurred.
  • This results in P(A \mid B) = P(A \cap B)/P(B) whenever P(B) > 0, and 0 otherwise (see the sketch after this list).
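
As a concrete illustration, the following Python sketch applies this definition on a finite sample space. The fair-die example, the event sets, and the function name are invented for this illustration and are not part of the article:

    from fractions import Fraction

    def conditional_probability(p, event_a, event_b):
        # P(A | B) = P(A ∩ B) / P(B); returns 0 when P(B) = 0,
        # matching the convention stated above.
        p_b = sum((p[w] for w in event_b), Fraction(0))
        if p_b == 0:
            return Fraction(0)
        p_ab = sum((p[w] for w in event_a & event_b), Fraction(0))
        return p_ab / p_b

    # Fair six-sided die: A = "roll is even", B = "roll is at least 4".
    die = {w: Fraction(1, 6) for w in range(1, 7)}
    print(conditional_probability(die, {2, 4, 6}, {4, 5, 6}))  # 2/3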

This approach results in a probability measure that is consistent with the original probability measure and satisfies all the Kolmogorov axioms. The same conditional probability measure could also have been obtained by assuming that the relative magnitude of the probability of A with respect to X is preserved with respect to B (cf. the Formal Derivation below).
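
In outline, that derivation runs as follows (a condensed sketch, not the article's full argument): preserving relative magnitudes means the conditional measure is proportional to the original measure on subsets of B, and normalization fixes the constant:

    P(A \mid B) = \alpha \, P(A \cap B), \qquad
    1 = P(X \mid B) = \alpha \, P(X \cap B) = \alpha \, P(B)
    \;\Rightarrow\; \alpha = \frac{1}{P(B)}.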

The wording "evidence" or "information" is generally used in the Bayesian interpretation of probability. The conditioning event is interpreted as evidence for the conditioned event. That is, P(A) is the probability of A before accounting for evidence E, and P(A \mid E) is the probability of A after having accounted for evidence E, or after having updated P(A). This is consistent with the frequentist interpretation, which is the first definition given above.


Example

When Morse code is transmitted, there is a certain probability that the "dot" or "dash" that was received is erroneous. This is often caused by interference in the transmission of a message. Therefore, it is important to consider, when a "dot" is received, the probability that a "dot" was sent. This is represented by: P(\text{dot sent} \mid \text{dot received}) = P(\text{dot received} \mid \text{dot sent}) \frac{P(\text{dot sent})}{P(\text{dot received})}. In Morse code, the ratio of dots to dashes is 3:4 at the point of sending, so the probabilities of a "dot" and a "dash" are P(\text{dot sent}) = \frac{3}{7} and P(\text{dash sent}) = \frac{4}{7}. If it is assumed that the probability that a dot is received as a dash is 1/10, and that the probability that a dash is received as a dot is likewise 1/10, then the law of total probability can be used to calculate P(\text{dot received}).

P(\text{dot received}) = P(\text{dot received} \cap \text{dot sent}) + P(\text{dot received} \cap \text{dash sent})

P(\text{dot received}) = P(\text{dot received} \mid \text{dot sent}) P(\text{dot sent}) + P(\text{dot received} \mid \text{dash sent}) P(\text{dash sent})

P(\text{dot received}) = \frac{9}{10} \times \frac{3}{7} + \frac{1}{10} \times \frac{4}{7} = \frac{31}{70}

Now, P(\text{dot sent} \mid \text{dot received}) can be calculated:

P(\text{dot sent} \mid \text{dot received}) = P(\text{dot received} \mid \text{dot sent}) \frac{P(\text{dot sent})}{P(\text{dot received})} = \frac{9}{10} \times \frac{3/7}{31/70} = \frac{27}{31}
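
The arithmetic above can be checked with exact rational arithmetic. The following Python sketch reproduces the calculation; the variable names are chosen for this illustration:

    from fractions import Fraction

    p_dot_sent = Fraction(3, 7)             # dots to dashes sent in a 3:4 ratio
    p_dash_sent = Fraction(4, 7)
    p_dot_rx_given_dot = Fraction(9, 10)    # a dot arrives as a dash with probability 1/10
    p_dot_rx_given_dash = Fraction(1, 10)   # a dash arrives as a dot with probability 1/10

    # Law of total probability: P(dot received)
    p_dot_rx = p_dot_rx_given_dot * p_dot_sent + p_dot_rx_given_dash * p_dash_sent
    print(p_dot_rx)  # 31/70

    # Bayes' rule: P(dot sent | dot received)
    print(p_dot_rx_given_dot * p_dot_sent / p_dot_rx)  # 27/31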