Conditional Probability

Formal derivation

Formally, P(A | B) is defined as the probability of A under a new probability function on the sample space, one that assigns probability 0 to outcomes not in B and preserves the relative probabilities that the original measure assigns to outcomes in B.

Let Ω be a discrete sample space with elementary events {ω}, and let P be the probability measure with respect to the σ-algebra of Ω. Suppose we are told that the event B ⊆ Ω, with P(B) > 0, has occurred. A new probability distribution (denoted by the conditional notation) is to be assigned on {ω} to reflect this. All outcomes that are not in B will have null probability in the new distribution. For outcomes in B, two conditions must be met: the probability of B is one, and the relative magnitudes of the probabilities are preserved. The former is required by the axioms of probability, since the outcomes outside B carry no probability. The latter stems from the fact that the new probability measure has to be the analog of P in which the probability of B is one. Hence, for some scale factor α, the new distribution must satisfy:

1. \omega \in B:P(\omega \mid B)=\alpha P(\omega )

2. \omega \notin B:P(\omega \mid B)=0

3. \sum _{\omega \in \Omega }{P(\omega \mid B)}=1.

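Substituting 1 and 2 into 3 to select α:

1=\sum _{\omega \in \Omega }{P(\omega \mid B)}=\sum _{\omega \in B}{P(\omega \mid B)}+\sum _{\omega \notin B}{P(\omega \mid B)}=\alpha \sum _{\omega \in B}{P(\omega )}+0=\alpha P(B)\quad \Rightarrow \quad \alpha ={\frac {1}{P(B)}}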

So the new probability distribution is

\omega \in B:P(\omega \mid B)={\frac {P(\omega )}{P(B)}}

\omega \notin B:P(\omega \mid B)=0
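
The renormalization can be checked numerically. The following is a minimal Python sketch, not a reference implementation: the function name condition, the dictionary representation of P, and the fair-die example are all assumptions made for illustration. Exact fractions are used so that the sum to one holds without rounding error.

```python
from fractions import Fraction


def condition(p, B):
    """Return the distribution p conditioned on the event B.

    p is a dict mapping each outcome to its probability (hypothetical
    representation); B is a set of outcomes, all of them keys of p.
    """
    p_B = sum((p[w] for w in B), Fraction(0))
    if p_B == 0:
        raise ValueError("P(B) = 0: the conditional distribution is undefined")
    alpha = 1 / p_B  # scale factor from the derivation: alpha = 1 / P(B)
    # Outcomes in B are rescaled by alpha; outcomes outside B get probability 0.
    return {w: (alpha * p[w] if w in B else Fraction(0)) for w in p}


# Illustrative example (assumed): a fair six-sided die, B = "the roll is even".
die = {w: Fraction(1, 6) for w in range(1, 7)}
even = {2, 4, 6}
die_given_even = condition(die, even)

for outcome, probability in sorted(die_given_even.items()):
    print(outcome, probability)
```

In this example each even outcome ends up with probability 1/3 and each odd outcome with probability 0, so the new distribution sums to one while the even outcomes remain equally likely relative to each other.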

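Now for a general event A,

P(A\mid B)=\sum _{\omega \in A\cap B}{P(\omega \mid B)}+\sum _{\omega \in A\cap B^{c}}{P(\omega \mid B)}=\sum _{\omega \in A\cap B}{\frac {P(\omega )}{P(B)}}+0={\frac {P(A\cap B)}{P(B)}}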
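
For a general event A, summing the conditional distribution over the outcomes of A should agree with the ratio P(A ∩ B) / P(B). The Python sketch below compares the two on a non-uniform distribution; the helper names prob and cond_prob and the loaded-die probabilities are hypothetical, chosen only to illustrate the identity.

```python
from fractions import Fraction


def prob(p, event):
    """P(event): sum of the probabilities of the outcomes in the event."""
    return sum((p[w] for w in event), Fraction(0))


def cond_prob(p, A, B):
    """P(A | B) computed as P(A and B) / P(B); requires P(B) > 0."""
    p_B = prob(p, B)
    if p_B == 0:
        raise ValueError("P(B) = 0: P(A | B) is undefined")
    return prob(p, A & B) / p_B


# Illustrative loaded die (assumed values): face 6 has probability 1/2,
# every other face has probability 1/10.
loaded = {w: Fraction(1, 10) for w in range(1, 6)}
loaded[6] = Fraction(1, 2)

A = {5, 6}     # "the roll is at least 5"
B = {2, 4, 6}  # "the roll is even"

# Route 1: the ratio formula.
by_ratio = cond_prob(loaded, A, B)

# Route 2: sum P(w | B) = P(w) / P(B) over the outcomes of A that lie in B;
# outcomes of A outside B contribute 0.
by_sum = sum((loaded[w] / prob(loaded, B) for w in A & B), Fraction(0))

print(by_ratio, by_sum)
```

Here P(B) = 7/10 and P(A ∩ B) = 1/2, so both routes give 5/7.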