We have the following data about the shopping preferences of our friend, Jane:
Temperature | Rain | Shopping |
Cold | None | Yes |
Warm | None | No |
Cold | Strong | Yes |
Cold | None | No |
Warm | Strong | No |
Warm | None | Yes |
Cold | None | ? |
We would like to find out, using a decision tree, whether Jane would go shopping if the outside temperature was cold with no rain.
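Before building the tree, it can help to tally the class labels for each combination of attribute values in the table. The following is a minimal sketch rather than the program referred to in this text; the names `records`, `class_counts`, and `conflicting_groups` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Jane's known records from the table: (temperature, rain, shopping).
# The final row with the unknown class "?" is left out on purpose.
records = [
    ("Cold", "None", "Yes"),
    ("Warm", "None", "No"),
    ("Cold", "Strong", "Yes"),
    ("Cold", "None", "No"),
    ("Warm", "Strong", "No"),
    ("Warm", "None", "Yes"),
]


def class_counts(rows):
    """Tally the class labels for each distinct (temperature, rain) pair."""
    counts = defaultdict(Counter)
    for temperature, rain, shopping in rows:
        counts[(temperature, rain)][shopping] += 1
    return counts


def conflicting_groups(rows):
    """Return the attribute combinations that occur with more than one class."""
    return {
        key: dict(tally)
        for key, tally in class_counts(rows).items()
        if len(tally) > 1
    }


conflicts = conflicting_groups(records)
for key, tally in conflicts.items():
    print(key, tally)
```

On this table, the sketch reports two attribute combinations with mixed classes: (Cold, None) and (Warm, None), each with one Yes and one No. A tree that splits only on these two attributes cannot separate such rows, so any leaf covering them has to pick a class by some other rule.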
Here, we should be careful, as some instances of the data have the same values for the same attributes but different classes; for example, (cold,none,yes) and (cold,none,no). The program we made would form the following decision tree:
Root
├── [Temperature=Cold]
│   ├── [Rain=None]
│   │   └── [Shopping=Yes]
│   └── [Rain=Strong]
│       └── [Shopping=Yes]
└── [Temperature=Warm]
    ├── [Rain=None]
    │   └── [Shopping=No]
    └── [Rain=Strong]
        └── [Shopping=No]
But at the leaf node [Rain=None] with the parent [Temperature=Cold], there are two data samples with...