Clojure for Data Science

By: Henry Garner

Ensemble learning and random forests


Ensemble learning combines the output of multiple models to obtain a better prediction than any of the models could produce individually. The principle is that many weak learners, combined, can be more accurate than any one of them taken alone.
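One simple way to combine classifiers is a majority vote. The following is a minimal sketch of that idea, not the method used by any particular library: the names majority-vote and ensemble-predict, and the three threshold "learners", are hypothetical and exist only for illustration.

```clojure
(defn majority-vote
  "Returns the prediction that occurs most often in predictions."
  [predictions]
  (key (apply max-key val (frequencies predictions))))

(defn ensemble-predict
  "Asks every model (a function of one input) for a prediction
  and returns the majority answer."
  [models x]
  (majority-vote (map #(% x) models)))

;; Three hypothetical weak learners, each a crude threshold rule.
(def models
  [(fn [x] (if (> x 3) :yes :no))
   (fn [x] (if (> x 5) :yes :no))
   (fn [x] (if (> x 7) :yes :no))])

(ensemble-predict models 6) ;; two of the three learners answer :yes
```

Using an odd number of two-class learners avoids ties, which this sketch does not otherwise handle.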

Random forests is an ensemble learning algorithm devised and trademarked by Leo Breiman and Adele Cutler. It combines multiple decision trees into one large forest learner. Each tree is trained on the data using a subset of the available features, so each tree has a slightly different view of the data and may generate a prediction that differs from those of its peers.
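To make the idea of "a subset of the available features" concrete, here is a small sketch in plain Clojure. It only illustrates the description above and is not how clj-ml or its underlying library builds a forest; the function names, feature names, and rows are all assumptions made up for the example.

```clojure
(defn random-feature-subset
  "Picks n distinct features at random from features."
  [features n]
  (vec (take n (shuffle features))))

(defn project-rows
  "Keeps only the chosen features, plus the class label, in each row."
  [rows subset label]
  (map #(select-keys % (conj subset label)) rows))

;; Hypothetical feature names and rows, for illustration only.
(def features [:sex :age :pclass :fare])

(def rows
  [{:sex "female" :age 29 :pclass 1 :fare 211.3 :survived true}
   {:sex "male"   :age 30 :pclass 3 :fare 8.05  :survived false}])

;; One feature subset per tree: here, three trees with two features each.
(def subsets
  (vec (repeatedly 3 #(random-feature-subset features 2))))

;; The view of the data that the first tree would be trained on.
(project-rows rows (first subsets) :survived)
```

Because each subset is drawn at random, the trees trained on these projections would see different columns and so could disagree with one another.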

To create a random forest in clj-ml, we only need to change the arguments passed to cl/make-classifier to :decision-tree and :random-forest.
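As a sketch, the call might look like the following. It assumes that clj-ml is on the classpath and that cl is an alias for the clj-ml.classifiers namespace; the example.forest namespace and the make-forest wrapper are hypothetical names used only here.

```clojure
(ns example.forest
  (:require [clj-ml.classifiers :as cl]))

;; Assumption: clj-ml is available as a project dependency.
(defn make-forest
  "Builds an untrained random forest classifier."
  []
  (cl/make-classifier :decision-tree :random-forest))
```

The returned classifier is untrained; it would still need to be trained on a dataset before it can make predictions.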

Bagging and boosting

Bagging and boosting are two opposing techniques for creating ensemble models. Boosting is the name for a general technique of building an ensemble...