Evaluating classification models
Classification is one of the most traditional classes of problems that you might face, either during the exam or during your journey as a data scientist. A very important artifact that you might want to generate during classification model evaluation is known as a confusion matrix.
A confusion matrix compares your model's predictions against the actual values of each class under evaluation. Figure 8.1 shows what a confusion matrix looks like in a binary classification problem:
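As a minimal sketch of how those cells are counted, the snippet below tallies each cell of a binary confusion matrix by hand. The fraud-detection labels are illustrative, not taken from the book: 1 marks the positive class (fraud) and 0 the negative class (legitimate).

```python
# Illustrative labels: 1 = fraud (positive class), 0 = legitimate (negative class)
y_true = [1, 0, 1, 1, 0, 0, 0, 1]  # actual values
y_pred = [1, 0, 0, 1, 0, 1, 0, 1]  # model predictions

# Count each cell of the confusion matrix by comparing prediction to truth
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

print(tp, tn, fp, fn)  # → 3 3 1 1
```

In practice you would typically call `sklearn.metrics.confusion_matrix` rather than counting by hand, but the explicit loop makes each cell's definition concrete.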
We find the following components in a confusion matrix:
- TP: This is the number of True Positive cases. Here, we count the cases that were predicted as true and are, indeed, true. For example, in a fraud detection system, this would be the number of fraudulent transactions that were correctly predicted as fraud.
- TN: This is the number of True Negative cases. Here, we have...