The Decision Tree algorithm is a classification and regression algorithm that supports predictions of both discrete and continuous attributes. The model works by building a tree made up of splits, or nodes: a new split is added every time an input column is found to be significantly related to a predictable column.
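For concreteness, the following is a minimal DMX sketch of such a model, assuming a SQL Server Analysis Services context; the model name and the CustomerKey, Age, CommuteDistance, and BikeBuyer columns are hypothetical placeholders:

```
// Hypothetical DMX mining model using the Microsoft Decision Trees algorithm.
// Age is a continuous input, CommuteDistance is a discrete input, and
// BikeBuyer is the discrete predictable column the tree splits on.
CREATE MINING MODEL [BikeBuyerPrediction]
(
    [CustomerKey]     LONG KEY,
    [Age]             LONG CONTINUOUS,
    [CommuteDistance] TEXT DISCRETE,
    [BikeBuyer]       LONG DISCRETE PREDICT
)
USING Microsoft_Decision_Trees
```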
The feature selection process, as explained in the previous section, is used to select the most useful attributes, because including too many attributes can degrade performance while the algorithm processes the data and can eventually exhaust the available memory. When performance is paramount, the following methods can help:
- Increase the value of the COMPLEXITY_PENALTY parameter to limit the tree growth (see the sketch after this list)
- Limit the number of items in the association models to limit the number of trees that are built
- Increase the value of the MINIMUM_SUPPORT parameter to avoid overfitting
- Restrict the number of discrete values for any attribute to...
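As a sketch of the first and third methods, the same hypothetical model can be created with explicit algorithm parameters in the USING clause; the values shown (0.99 and 25) are illustrative, not recommendations:

```
// Hypothetical DMX: raising COMPLEXITY_PENALTY penalizes each additional
// split (limiting tree growth), and raising MINIMUM_SUPPORT requires more
// leaf cases before a split is made (guarding against overfitting).
CREATE MINING MODEL [BikeBuyerPredictionTuned]
(
    [CustomerKey]     LONG KEY,
    [Age]             LONG CONTINUOUS,
    [CommuteDistance] TEXT DISCRETE,
    [BikeBuyer]       LONG DISCRETE PREDICT
)
USING Microsoft_Decision_Trees (COMPLEXITY_PENALTY = 0.99, MINIMUM_SUPPORT = 25)
```

COMPLEXITY_PENALTY accepts values between 0 and 1, with higher values making further splits less likely, and MINIMUM_SUPPORT sets the minimum number of leaf cases required to generate a split; suitable values depend on the size and shape of the training data.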