EXERCISES
CLARIFYING THE CONCEPTS
- What is a decision tree?
- What is the difference between a decision node and a leaf node?
- In a decision tree, where is the most powerful of all possible splits made?
- When do decision trees stop growing?
- How do decision trees work?
- Would CART be a good algorithm to use if we are interested in a ternary (three-category) categorical predictor?
- Which criterion is used by CART to assess which split is optimal?
- Which concept does the C5.0 algorithm use to select the optimal split?
- What are random forests?
- How do random forests work?
- Are all the predictor variables candidates for the “best” split at each node of a tree built by random forests?
- Are the data sets used to build each tree in random forests the same?
- How does the random forests algorithm assign each record in the training data set its final classification?
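The ideas behind these questions can be explored hands-on. Below is a minimal sketch using scikit-learn (an assumption; the chapter may use other tools), whose `DecisionTreeClassifier` implements a CART-style algorithm with binary splits and the Gini impurity criterion, and whose `RandomForestClassifier` grows many such trees on bootstrap resamples with a random subset of predictors competing at each node. The iris data set here is only a stand-in for illustration.

```python
# Sketch: CART-style decision tree and random forest in scikit-learn.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# CART-style tree: the most powerful split is made at the root node,
# chosen by the Gini impurity criterion; growth stops when nodes are
# pure or a stopping rule (here, max_depth) is reached.
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
tree.fit(X, y)

# Random forest: each tree is built on a bootstrap resample of the data,
# and at each node only a random subset of predictors (max_features)
# competes to be the best split. The forest classifies each record by
# majority vote across its trees.
forest = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", bootstrap=True, random_state=0
)
forest.fit(X, y)

print(tree.score(X, y), forest.score(X, y))
```

Note that because each tree in the forest sees a different bootstrap sample and a different candidate-predictor subset at each node, the individual trees differ from one another, which is what makes their majority vote useful.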
WORKING WITH THE DATA
For Exercises 14–20, work with the adult_ch6_training and adult_ch6_test data sets. Use either Python or R to solve each problem...
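In Python, the workflow for these exercises typically looks like the sketch below. The column names and values here are hypothetical stand-ins (the real exercises would read the adult_ch6_training and adult_ch6_test files, e.g. with `pd.read_csv`); a tiny inline DataFrame keeps the example self-contained.

```python
# Sketch of the fit-a-tree workflow for the adult data exercises.
# The DataFrame below is a hypothetical stand-in for adult_ch6_training.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

adult_tr = pd.DataFrame({
    "marital_status": ["Married", "Never-married", "Married", "Divorced"],
    "cap_gains_losses": [0.02, 0.0, 0.15, 0.0],
    "income": ["<=50K", "<=50K", ">50K", "<=50K"],
})

# Dummy-encode the categorical predictor, then join the numeric one.
X = pd.get_dummies(adult_tr[["marital_status"]]).join(
    adult_tr["cap_gains_losses"]
)
y = adult_tr["income"]

# Fit a CART-style tree and classify the training records.
cart = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
print(cart.predict(X))
```

The same pattern applies to the test set: encode its predictors identically, then call `cart.predict` on the encoded test data.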