Consider you are given the following hypothetical dataset containing data of patients: the size of the tumor in their body, their age, and a class that justifies whether they are affected by cancer or not, 1 being positive (affected by cancer) and 0 being negative (not affected by cancer):
Age |
Tumor size |
Class |
---|---|---|
22 |
135 |
0 |
37 |
121 |
0 |
18 |
156 |
1 |
55 |
162 |
1 |
67 |
107 |
0 |
73 |
157 |
1 |
36 |
123 |
0 |
42 |
189 |
1 |
29 |
148 |
0 |
Here, the patients are classified as cancer-affected or not. A new patient comes in at the age 17 and is diagnosed of having a tumor the of size 149. Now, you need to predict the classification of this new patient based on the previous data. That's classification for you as you need to predict the class of the dependent variable; here it is 0 or 1—you may also think of it as true or false.
For a regression problem, you predict a number, for example, the housing price or a numerical value. In a classification problem, you predict a categorical value...