Random forests belong to the family of ensemble models. Ensemble models work on the premise that two heads are better than one: they combine the predictions of many weaker models (here, decision trees) and return the prediction that is the mode, that is, the most common answer, among these weaker models. For more, check https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm.
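The voting idea can be illustrated with a minimal sketch. The majority_vote function and the list of votes are hypothetical names introduced only for this illustration; they are not part of the recipe's code, and library implementations may combine the trees differently (for example, by averaging predicted class probabilities).

```python
from collections import Counter


def majority_vote(predictions):
    '''
        Return the most common label (the mode) among
        the predictions of the individual weak models
    '''
    return Counter(predictions).most_common(1)[0][0]


# hypothetical votes cast by five weak models for one observation
votes = [1, 0, 1, 1, 0]

ensemble_prediction = majority_vote(votes)
print(ensemble_prediction)  # 1
```

Three of the five weak models vote for class 1, so the ensemble returns 1 even though two of the models disagree.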
To execute this recipe, you will need pandas and scikit-learn. No other prerequisites are required.
As in previous examples, scikit-learn provides an easy way of building a random forest classifier (the classification_randomForest.py file):
import sklearn.ensemble as en

@hlp.timeit
def fitRandomForest(data):
    '''
        Build a random forest classifier
    '''
    # create the classifier object
    # ("balanced" replaces the "auto" value that older
    # scikit-learn releases accepted for class_weight)
    forest = en.RandomForestClassifier(n_jobs=-1,
        min_samples_split=100, n_estimators=10,
        class_weight="balanced")

    # fit the data
    return forest.fit(data[0],data...
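For readers who want to try the classifier without the recipe's helper module, here is a self-contained sketch with the same hyperparameters. Everything outside the RandomForestClassifier call is an assumption made for illustration: the synthetic data, the fit_random_forest name, the random_state value, and the use of class_weight="balanced" (the value current scikit-learn releases accept). The timing decorator from the recipe is left out.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def fit_random_forest(features, labels):
    '''
        Build a random forest classifier
    '''
    # create the classifier object; random_state is an
    # assumption added so that repeated runs match
    forest = RandomForestClassifier(n_jobs=-1,
        min_samples_split=100, n_estimators=10,
        class_weight="balanced", random_state=0)

    # fit the data
    return forest.fit(features, labels)


# synthetic stand-in for the recipe's dataset (an assumption):
# 1,000 observations, 4 features, binary label
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

classifier = fit_random_forest(X, y)

# predict the class of the first five observations
predicted = classifier.predict(X[:5])
print(predicted.shape)              # (5,)
print(len(classifier.estimators_))  # 10
```

The fitted object exposes the individual trees through its estimators_ attribute; with n_estimators=10 there are ten of them.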