Classification
| Site: | Saylor University |
| Course: | CS207: Fundamentals of Machine Learning |
| Book: | Classification |
| Printed by: | Guest user |
| Date: | Monday, April 6, 2026, 4:56 AM |
Classification
Classification in machine learning assigns data points to predefined categories, unlike regression, which predicts continuous values. When working with classification algorithms, you create decision boundaries that separate different classes, whether in binary classification (spam/not spam emails) or multi-class problems (categorizing customer feedback into positive/neutral/negative sentiments).
In real-world applications like fraud detection, many classification models output probabilities rather than fixed labels. You must choose a threshold to convert these probabilities into decisions. Adjusting the standard 0.5 threshold affects your results based on business priorities.
When might you prefer a low or high threshold? When screening loan applications, a lower threshold catches more potential defaults but might reject valid applications. Conversely, a higher threshold approves more loans but might miss some risky cases. You must weigh which errors are more costly, false alarms or missed detections, since that trade-off drives your threshold selection.
In the Logistic regression module, you learned how to use the sigmoid function to convert raw model output to a value between 0 and 1 to make probabilistic predictions - for example, predicting that a given email has a 75% chance of being spam. But what if your goal is not to output probability but a category - for example, predicting whether a given email is "spam" or "not spam"?
Classification is the task of predicting which of a set of classes (categories) an example belongs to. In this module, you'll learn how to convert a logistic regression model that predicts a probability into a binary classification model that predicts one of two classes. You'll also learn how to choose and calculate appropriate metrics to evaluate the quality of a classification model's predictions. Finally, you'll get a brief introduction to multi-class classification problems, which are discussed in more depth later in the course.
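As a quick refresher on the sigmoid step mentioned above, here is a minimal sketch in plain Python (the helper name `sigmoid` and the sample logit are illustrative):

```python
import math

def sigmoid(z):
    """Squash a raw model output (logit) into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))  # 0.5: a logit of zero is maximally uncertain
# A logit of ln(3), about 1.0986, corresponds to roughly a 75% spam probability.
print(round(sigmoid(1.0986), 2))  # 0.75
```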
Source: Google for Developers, https://developers.google.com/machine-learning/crash-course/classification
This work is licensed under a Creative Commons Attribution 4.0 License.
Thresholds and the Confusion Matrix
Let's say you have a logistic regression model for spam-email detection that predicts a value between 0 and 1, representing the probability that a given email is spam. A prediction of 0.50 signifies a 50% likelihood that the email is spam, a prediction of 0.75 signifies a 75% likelihood that the email is spam, and so on.
You'd like to deploy this model in an email application to filter spam into a separate mail folder. But to do so, you need to convert the model's raw numerical output (e.g., 0.75) into one of two categories: "spam" or "not spam".
To make this conversion, you choose a threshold probability, called a classification threshold. Examples with a probability above the threshold value are then assigned to the positive class, the class you are testing for (here, spam). Examples with a lower probability are assigned to the negative class, the alternative class (here, not spam).
You may be wondering: what happens if the predicted score is equal to the classification threshold (for instance, a score of 0.5 where the classification threshold is also 0.5)? Handling for this case depends on the particular implementation chosen for the classification model. The Keras library predicts the negative class if the score and threshold are equal, but other tools/frameworks may handle this case differently.
Suppose the model scores one email as 0.99, predicting that email has a 99% chance of being spam, and another email as 0.51, predicting it has a 51% chance of being spam. If you set the classification threshold to 0.5, the model will classify both emails as spam. If you set the threshold to 0.95, only the email scoring 0.99 will be classified as spam.
While 0.5 might seem like an intuitive threshold, it's not a good idea if the cost of one type of wrong classification is greater than the other, or if the classes are imbalanced. If only 0.01% of emails are spam, or if misfiling legitimate emails is worse than letting spam into the inbox, labeling anything the model considers at least 50% likely to be spam as spam produces undesirable results.
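To make the conversion concrete, here is a minimal sketch, assuming the Keras-style convention described above in which a score exactly equal to the threshold goes to the negative class (the function name `classify` is illustrative):

```python
def classify(score, threshold):
    """Assign the positive class ("spam") when the score exceeds the threshold.

    A score equal to the threshold is assigned to the negative class,
    matching the Keras convention mentioned above.
    """
    return "spam" if score > threshold else "not spam"

scores = [0.99, 0.51]
print([classify(s, 0.5) for s in scores])   # both classified as spam
print([classify(s, 0.95) for s in scores])  # only the 0.99 email is spam
```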
Confusion matrix
The probability score is not reality, or ground truth. There are four possible outcomes for each output from a binary classifier. For the spam classifier example, if you lay out the ground truth as columns and the model's prediction as rows, the following table, called a confusion matrix, is the result:
| | Actual positive | Actual negative |
|---|---|---|
| Predicted positive | True positive (TP): A spam email correctly classified as a spam email. These are the spam messages automatically sent to the spam folder. | False positive (FP): A not-spam email misclassified as spam. These are the legitimate emails that wind up in the spam folder. |
| Predicted negative | False negative (FN): A spam email misclassified as not-spam. These are spam emails that aren't caught by the spam filter and make their way into the inbox. | True negative (TN): A not-spam email correctly classified as not-spam. These are the legitimate emails that are sent directly to the inbox. |
Notice that the total in each row gives all predicted positives (TP + FP) and all predicted negatives (FN + TN), regardless of validity. The total in each column, meanwhile, gives all real positives (TP + FN) and all real negatives (FP + TN) regardless of model classification.
When the total of actual positives is not close to the total of actual negatives, the dataset is imbalanced. An instance of an imbalanced dataset might be a set of thousands of photos of clouds, where the rare cloud type you are interested in, say, volutus clouds, only appears a few times.
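The four cells of the confusion matrix can be tallied directly from ground-truth labels and model predictions. A minimal sketch, assuming labels are encoded as 1 for the positive class and 0 for the negative class (the helper name `confusion_counts` is illustrative):

```python
def confusion_counts(y_true, y_pred):
    """Tally the four outcomes of a binary classifier (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

# Made-up illustrative data: 3 actual spam emails, 2 legitimate ones.
print(confusion_counts([1, 0, 1, 0, 1], [1, 1, 0, 0, 1]))
```

Note that `TP + FP` gives all predicted positives and `TP + FN` gives all actual positives, matching the row and column totals described above.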
Effect of threshold on true and false positives and negatives
Different thresholds usually result in different numbers of true and false positives and true and false negatives.
Check your understanding
1. Imagine a phishing or malware classification model where phishing and malware websites are in the class labeled 1 (true) and harmless websites are in the class labeled 0 (false). This model mistakenly classifies a legitimate website as malware. What is this called?
   - A true positive
   - A false negative
   - A false positive (correct)
   - A true negative

   A negative example (legitimate site) has been wrongly classified as a positive example (malware site), so this is a false positive.
2. In general, what happens to the number of false positives when the classification threshold increases? What about true positives?
   - Both true and false positives decrease. (correct)
   - True positives increase. False positives decrease.
   - Both true and false positives increase.

   As the threshold increases, the model will likely predict fewer positives overall, both true and false. A spam classifier with a threshold of 0.9999 will only label an email as spam if it considers the classification to be at least 99.99% likely, which means it is highly unlikely to mislabel a legitimate email, but also likely to miss actual spam email.
3. In general, what happens to the number of false negatives when the classification threshold increases? What about true negatives?
   - True negatives increase. False negatives decrease.
   - Both true and false negatives decrease.
   - Both true and false negatives increase. (correct)

   As the threshold increases, the model will likely predict more negatives overall, both true and false. At a very high threshold, almost all emails, both spam and not-spam, will be classified as not-spam.