Predictive Analytics Using Rattle and Qlik Sense

By: Ferran Garcia Pagans, Fernando G Pagans
Underfitting and overfitting


Underfitting and overfitting are problems not just for classifiers but for all supervised methods.

Imagine you have a classifier with just one rule that tries to distinguish between healthy and not healthy patients. The rule is as follows:

If Temperature < 37 then Healthy

This classifier labels every patient whose temperature is below 37 degrees as healthy, so it will have a very high error rate. The tree that represents this rule has only a root node and two branches, with a leaf on each branch.
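As a minimal sketch, the one-rule classifier can be written as a single function and scored against labelled records. The patient records below are invented purely for illustration, and the helper names `one_rule_classifier` and `error_rate` are hypothetical. The "Not healthy" branch is assumed to be the other leaf of the tree.

```python
# Hypothetical toy data: (temperature in degrees Celsius, true label).
# These records are invented for illustration only.
patients = [
    (36.5, "Healthy"),
    (36.8, "Not healthy"),  # normal temperature, but ill for another reason
    (38.2, "Not healthy"),
    (37.4, "Healthy"),      # slightly warm, but otherwise fine
    (36.9, "Not healthy"),
    (39.0, "Not healthy"),
]


def one_rule_classifier(temperature):
    """The single rule from the text: a root node with two leaves."""
    return "Healthy" if temperature < 37 else "Not healthy"


def error_rate(classifier, records):
    """Fraction of records whose predicted label differs from the true label."""
    wrong = sum(1 for temp, label in records if classifier(temp) != label)
    return wrong / len(records)


rate = error_rate(one_rule_classifier, patients)
print(f"Error rate of the one-rule classifier: {rate:.2f}")
```

On this made-up sample the rule misclassifies every patient whose condition is not explained by temperature alone, which is the kind of error a too-general rule produces.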

Underfitting occurs when the tree is too shallow to classify new observations correctly: its rules are too general.

On the other hand, if we have a dataset with many attributes and we generate a very deep Decision Tree, we risk building a tree that fits the training dataset well but is not able to predict new examples. Extending our previous example, we could have a rule such as this:

If Temperature < 27 and Sintom_A = V … and Sintom_B = Y …
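The gap between training and new-data performance can be sketched with a deliberately extreme stand-in for a very deep tree: a lookup table that memorizes every training record. This is not a Decision Tree algorithm; it only mimics a tree with one highly specific rule per training example. For an unseen attribute combination it falls back to the most common training label, which is an assumption of this sketch. All records, attribute names (`symptom_a`, `symptom_b`), and helper names are hypothetical.

```python
from collections import Counter

# Hypothetical records: ((temperature, symptom_a, symptom_b), label).
# The data is invented; two training labels are deliberately noisy.
train = [
    ((36.5, "yes", "yes"), "Healthy"),
    ((36.5, "yes", "no"), "Healthy"),
    ((36.5, "no", "yes"), "Not healthy"),  # noisy label
    ((38.5, "yes", "yes"), "Not healthy"),
    ((38.5, "no", "no"), "Not healthy"),
    ((38.5, "no", "yes"), "Healthy"),      # noisy label
    ((39.0, "yes", "no"), "Not healthy"),
]
test = [
    ((36.5, "no", "yes"), "Healthy"),
    ((38.5, "no", "yes"), "Not healthy"),
    ((36.5, "no", "no"), "Healthy"),       # combination not seen in training
    ((38.5, "yes", "no"), "Not healthy"),  # combination not seen in training
]


def fit_memorizer(records):
    """Memorize each training record as its own, fully specific rule."""
    table = {features: label for features, label in records}
    # Assumed fallback for unseen combinations: the majority training label.
    default = Counter(label for _, label in records).most_common(1)[0][0]

    def predict(features):
        return table.get(features, default)

    return predict


def error_rate(predict, records):
    """Fraction of records whose predicted label differs from the true label."""
    wrong = sum(1 for features, label in records if predict(features) != label)
    return wrong / len(records)


predict = fit_memorizer(train)
train_error = error_rate(predict, train)
test_error = error_rate(predict, test)
print(f"Training error: {train_error:.2f}")
print(f"Test error:     {test_error:.2f}")
```

The memorizer reproduces the training labels exactly, including the noisy ones, so its training error is zero while its error on the new records is much higher.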