Conditional Probability: Common fallacies | Saylor Academy

Common fallacies

These fallacies should not be confused with Robert K. Shope's 1978 "conditional fallacy", which deals with counterfactual examples that beg the question.

Assuming conditional probability is of similar size to its inverse

A geometric visualization of Bayes' theorem. In the table, the values 2, 3, 6 and 9 give the relative weights of each corresponding condition and case. The figures denote the cells of the table involved in each metric, the probability being the fraction of each figure that is shaded. This shows that P(A|B) P(B) = P(B|A) P(A), i.e. P(A|B) = P(B|A) P(A)/P(B). Similar reasoning can be used to show that P(Ā|B) = P(B|Ā) P(Ā)/P(B), etc.

In general, it cannot be assumed that P(A|B) ≈ P(B|A). This can be an insidious error, even for those who are highly conversant with statistics. The relationship between P(A|B) and P(B|A) is given by Bayes' theorem:

\({\begin{aligned}P(B\mid A)&={\frac {P(A\mid B)P(B)}{P(A)}}\\\Leftrightarrow {\frac {P(B\mid A)}{P(A\mid B)}}&={\frac {P(B)}{P(A)}}\end{aligned}}\)

That is, P(A|B) ≈ P(B|A) only if P(B)/P(A) ≈ 1, or equivalently, P(A) ≈ P(B).
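As a small numerical sketch of this relationship, the following uses made-up joint counts for A and B (these numbers are an assumption for illustration and are not taken from the text or the figure). It computes both conditional probabilities exactly and checks that their ratio equals P(B)/P(A).

```python
from fractions import Fraction

# Hypothetical joint counts (illustrative assumption, not data from the text).
n_a_b = 9         # A and B
n_a_notb = 1      # A and not B
n_nota_b = 30     # not A and B
n_nota_notb = 60  # neither A nor B
total = n_a_b + n_a_notb + n_nota_b + n_nota_notb

# Marginal probabilities.
p_a = Fraction(n_a_b + n_a_notb, total)
p_b = Fraction(n_a_b + n_nota_b, total)

# Conditional probabilities, read directly off the counts.
p_a_given_b = Fraction(n_a_b, n_a_b + n_nota_b)
p_b_given_a = Fraction(n_a_b, n_a_b + n_a_notb)

# Bayes' theorem: the ratio of the conditionals equals the ratio of the marginals.
assert p_b_given_a / p_a_given_b == p_b / p_a

print(f"P(B|A) = {p_b_given_a}, P(A|B) = {p_a_given_b}")
print(f"P(B)/P(A) = {p_b / p_a}")
```

With these counts P(B|A) = 9/10 while P(A|B) = 3/13, so the two conditionals differ by the factor P(B)/P(A) = 39/10.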

Assuming marginal and conditional probabilities are of similar size

In general, it cannot be assumed that P(A) ≈ P(A|B). These probabilities are linked through the law of total probability:

\(P(A)=\sum _{n}P(A\cap B_{n})=\sum _{n}P(A\mid B_{n})P(B_{n})\),

where the events \((B_{n})\) form a countable partition of \(\Omega\).
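The sum can be illustrated with a three-event partition. The probabilities below are hypothetical values chosen only for the example; the sketch shows that the marginal P(A) is a weighted average of the conditionals and need not be close to any single P(A|B_n).

```python
from fractions import Fraction

# Hypothetical partition B_1, B_2, B_3 of the sample space (assumed values).
p_b = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
# Hypothetical conditional probabilities P(A | B_n) (assumed values).
p_a_given_b = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]

# A partition must have probabilities summing to 1.
assert sum(p_b) == 1

# Law of total probability: P(A) = sum_n P(A | B_n) P(B_n).
p_a = sum(pa * pb for pa, pb in zip(p_a_given_b, p_b))

print(f"P(A) = {p_a}")
for n, pa in enumerate(p_a_given_b, start=1):
    print(f"P(A|B_{n}) = {pa}, difference from P(A) = {pa - p_a}")
```

Here P(A) = 19/50, whereas the conditionals range from 1/10 to 9/10.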

This fallacy may arise through selection bias. For example, in the context of a medical claim, let S_C be the event that a sequela (chronic disease) S occurs as a consequence of circumstance (acute condition) C. Let H be the event that an individual seeks medical help. Suppose that in most cases C does not cause S, so that P(S_C) is low. Suppose also that medical attention is sought only if S has occurred due to C. Because the doctor sees only patients who sought help, the probability the doctor actually observes is P(S_C|H), not P(S_C). From experience of patients, the doctor may therefore erroneously conclude that P(S_C) is high.
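A rough numerical sketch of the doctor's situation follows. All rates are hypothetical, and the helper name `p_sc_given_h` is introduced here only for illustration. The scenario is also relaxed slightly: instead of help being sought only when S_C has occurred, a small fraction of people without S_C are assumed to seek help too. Setting that fraction to zero recovers the strict case described above.

```python
from fractions import Fraction

def p_sc_given_h(p_sc, p_h_given_sc, p_h_given_not_sc):
    """Return P(S_C | H) via Bayes' theorem and the law of total probability."""
    # P(H) = P(H | S_C) P(S_C) + P(H | not S_C) P(not S_C)
    p_h = p_h_given_sc * p_sc + p_h_given_not_sc * (1 - p_sc)
    return p_h_given_sc * p_sc / p_h

# Hypothetical rates (assumptions for illustration only).
p_sc = Fraction(1, 20)               # C rarely leads to S
p_h_given_sc = Fraction(9, 10)       # most people with S_C seek help
p_h_given_not_sc = Fraction(1, 100)  # few people without S_C seek help

observed = p_sc_given_h(p_sc, p_h_given_sc, p_h_given_not_sc)
print(f"P(S_C) = {p_sc}, P(S_C|H) = {observed}")

# Strict version of the scenario: nobody without S_C seeks help.
strict = p_sc_given_h(p_sc, p_h_given_sc, Fraction(0))
print(f"Strict case: P(S_C|H) = {strict}")
```

Under these assumptions the true rate is 1/20, yet 90 out of every 109 patients the doctor sees have the sequela; in the strict case every patient does.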

Over- or under-weighting priors

Failing to take the prior probability into account, whether partially or completely, is called base rate neglect. The reverse error, insufficient adjustment from the prior probability, is called conservatism.
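Base rate neglect can be made concrete with a diagnostic-test calculation. The prevalence, sensitivity, and false-positive rate below are hypothetical values chosen for the example, and the function name `posterior` is likewise only illustrative. The sketch compares the correct posterior with the answer obtained by ignoring the prior.

```python
from fractions import Fraction

def posterior(prior, p_pos_given_d, p_pos_given_not_d):
    """Return P(D | positive test) using Bayes' theorem."""
    p_pos = p_pos_given_d * prior + p_pos_given_not_d * (1 - prior)
    return p_pos_given_d * prior / p_pos

# Hypothetical test characteristics (assumed for illustration).
prior = Fraction(1, 100)              # base rate of the disease D
sensitivity = Fraction(99, 100)       # P(positive | D)
false_positive = Fraction(5, 100)     # P(positive | not D)

correct = posterior(prior, sensitivity, false_positive)

# Base rate neglect: treating P(D | positive) as if it were P(positive | D).
neglecting_base_rate = sensitivity

print(f"Correct P(D|positive) = {correct}")
print(f"Answer that ignores the prior = {neglecting_base_rate}")
```

With a 1% base rate the correct posterior is 1/6, far below the 99/100 suggested by the test's sensitivity alone. Conservatism would be the opposite mistake: reporting a value closer to the 1/100 prior than the evidence justifies.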
