In Chapter 9, Linear Regression, we discussed regression. In the previous chapter, we looked at classification using k-NN and Naïve Bayes. In this chapter, we continue with classification and discuss it in the context of decision trees. Decision trees predict the class (group membership) of previously unseen observations (testing or prediction datasets) using statistical criteria applied to already seen data (the training set).
Here, we will briefly examine the statistical criteria used by six algorithms:
ID3
C4.5
C5.0
Classification and regression trees (CART)
Random forest
Conditional inference trees
We will also examine how to use decision trees in R, in particular how to measure the reliability of the classifications using training and test sets.
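As a first illustration of the training and test set idea, here is a minimal sketch. It makes several assumptions that are not part of the chapter's own walkthrough: it uses the built-in iris dataset, an arbitrary 70/30 split, an arbitrary seed, and the rpart package (a CART-style implementation) as a stand-in for whichever algorithm is of interest.

```r
# Minimal sketch: hold-out evaluation of a classification tree.
# Assumptions (illustrative only): iris data, 70/30 split, rpart as the tree learner.
library(rpart)

set.seed(42)  # arbitrary seed so the random split is reproducible

# Randomly assign 70% of the rows to the training set, the rest to the test set
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))
train <- iris[train_idx, ]
test <- iris[-train_idx, ]

# Fit the tree on the training set only
fit <- rpart(Species ~ ., data = train, method = "class")

# Predict the class of the unseen (test) observations
pred <- predict(fit, newdata = test, type = "class")

# Cross-tabulate predicted against actual classes (confusion matrix)
conf_mat <- table(predicted = pred, actual = test$Species)

# Proportion of test observations classified correctly
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

print(conf_mat)
print(accuracy)
```

Because the tree never sees the test rows during fitting, the accuracy computed here is an estimate of how well the classifications would hold up on new data, rather than a measure of how well the tree memorized the training set.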