We will now introduce classification and regression trees (CART) and random forest. CART uses different statistical criteria to decide on tree splits. Random forest uses ensemble learning (a combination of CART trees) to improve classification using a voting principle.
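The voting principle can be illustrated with a minimal sketch (not the full random forest algorithm, which also involves bootstrap sampling and random feature selection): each tree predicts a class for a sample, and the ensemble returns the majority class.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the class predictions of several trees by majority vote."""
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical trees vote on one sample; "spam" wins 2 to 1.
print(majority_vote(["spam", "ham", "spam"]))  # spam
```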
There are a number of differences between CART used for classification and the family of algorithms we just discussed. Here we discuss only the partitioning criterion and pruning, and only superficially.
In CART, the attribute to partition on is selected using the Gini index as the decision criterion. In classification trees, the Gini index of a node is simply computed as one minus the sum of the squared class probabilities in that node. In formula notation:

Gini = 1 − ∑ᵢ pᵢ²

where pᵢ is the proportion of examples of class i in the node. The quality of a candidate split is then the weighted average of the Gini indices of the resulting partitions.
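As a short sketch, the formula above translates directly into code (the class labels here are illustrative):

```python
from collections import Counter

def gini(labels):
    """Gini index: 1 minus the sum of squared class probabilities."""
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())

# A pure node has Gini 0; a perfectly mixed binary node has Gini 0.5.
print(gini(["yes", "yes", "yes"]))       # 0.0
print(gini(["yes", "no", "yes", "no"]))  # 0.5
```

A lower Gini index therefore means a purer node, which is why CART prefers splits that minimize it.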
This is cheaper to compute than information gain and gain ratio. Note that CART does not necessarily create one branch per modality of the attribute: it can also merge modalities together into groups, producing a binary partition.
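This merging behavior can be sketched as follows, assuming a small categorical attribute: enumerate every way of grouping the modalities into two sets and keep the grouping with the lowest weighted Gini index (the data and function names are illustrative; real implementations use shortcuts to avoid the exponential enumeration).

```python
from collections import Counter
from itertools import combinations

def gini(labels):
    """Gini index: 1 minus the sum of squared class probabilities."""
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_binary_split(values, labels):
    """Try every grouping of modalities into two sets (a CART-style
    binary partition) and return the lowest weighted Gini with its group."""
    modalities = sorted(set(values))
    n = len(labels)
    best = None
    for r in range(1, len(modalities)):
        for left in combinations(modalities, r):
            # Fix the first modality on the left side to skip mirror splits.
            if modalities[0] not in left:
                continue
            left = set(left)
            l_lab = [y for v, y in zip(values, labels) if v in left]
            r_lab = [y for v, y in zip(values, labels) if v not in left]
            score = (len(l_lab) / n) * gini(l_lab) + (len(r_lab) / n) * gini(r_lab)
            if best is None or score < best[0]:
                best = (score, left)
    return best

# "blue" and "green" carry the same class, so merging them is optimal.
values = ["red", "red", "blue", "green", "green", "blue"]
labels = ["yes", "yes", "no", "no", "no", "no"]
print(best_binary_split(values, labels))
```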