Mastering Python for Data Science

By: Samir Madhavan

Chapter 7: Estimating the Likelihood of Events

Decision trees


To understand decision tree-based models, imagine that Google wants to recruit people for a software development job. From the employees it has already hired and the applicants it has previously rejected, we know two things about each applicant: whether they attended an Ivy League college, and what their Grade Point Average (GPA) was.

The decision tree first splits the applicants into Ivy League and non-Ivy League groups. The Ivy League group is then split into high-GPA and low-GPA groups, so that applicants with a high GPA are tagged as highly likely to be recruited and those with a low GPA as likely to be recruited.

The non-Ivy League group is split in the same way: applicants with a high GPA have a slightly better chance of being recruited than those with a low GPA.
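The two splits described above can be written out as nested rules. The following is only a minimal sketch: the function name `recruitment_likelihood`, the GPA cut-off of 3.5, and the labels used for the non-Ivy League branches are assumptions made for illustration and do not come from the example itself.

```python
def recruitment_likelihood(ivy_league, gpa, gpa_threshold=3.5):
    """Walk the two-level tree from the recruiting example.

    gpa_threshold is an assumed cut-off between "high" and "low" GPA.
    """
    if ivy_league:
        # First split: Ivy League branch
        if gpa >= gpa_threshold:
            return "highly likely"
        return "likely"
    # First split: non-Ivy League branch (labels here are assumed)
    if gpa >= gpa_threshold:
        return "possible"
    return "unlikely"


# Hypothetical applicants, one per leaf of the tree
print(recruitment_likelihood(True, 3.9))
print(recruitment_likelihood(True, 3.0))
print(recruitment_likelihood(False, 3.9))
print(recruitment_likelihood(False, 3.0))
```

Each applicant follows exactly one path from the root to a leaf, and the leaf supplies the prediction. A model learned from data differs from this sketch in that it chooses the attributes and thresholds itself.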

In simple terms, this is what a decision tree does: it repeatedly splits the data on one attribute at a time until each resulting group can be given a prediction.
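To make the idea of a likelihood per group concrete, the sketch below estimates the chance of recruitment in each leaf as the share of past applicants in that group who were recruited. The records in `HISTORY`, the `GPA_THRESHOLD` value, and the helper names are all invented for illustration. The splits are also fixed by hand here, whereas a real decision tree learns them from the data.

```python
from collections import defaultdict

# Hypothetical past applicants: (ivy_league, gpa, recruited).
# These records are made up for illustration only.
HISTORY = [
    (True, 3.9, True), (True, 3.8, True), (True, 3.7, True), (True, 3.6, True),
    (True, 3.2, True), (True, 3.1, True), (True, 3.0, True), (True, 2.9, False),
    (False, 3.9, True), (False, 3.8, True), (False, 3.7, False), (False, 3.6, False),
    (False, 3.2, True), (False, 3.1, False), (False, 3.0, False), (False, 2.9, False),
]

GPA_THRESHOLD = 3.5  # assumed cut-off between "high" and "low" GPA


def leaf_for(ivy_league, gpa):
    """Return the group an applicant falls into after both splits."""
    college = "Ivy League" if ivy_league else "non-Ivy League"
    gpa_band = "high GPA" if gpa >= GPA_THRESHOLD else "low GPA"
    return (college, gpa_band)


def leaf_likelihoods(history):
    """Estimate the recruitment likelihood per leaf as recruited / total."""
    counts = defaultdict(lambda: [0, 0])  # leaf -> [recruited, total]
    for ivy_league, gpa, recruited in history:
        leaf = leaf_for(ivy_league, gpa)
        counts[leaf][0] += int(recruited)
        counts[leaf][1] += 1
    return {leaf: hired / total for leaf, (hired, total) in counts.items()}


likelihoods = leaf_likelihoods(HISTORY)
for leaf, share in sorted(likelihoods.items()):
    print(leaf, share)
```

With this made-up history, the Ivy League, high-GPA group has the highest recruited share, and the non-Ivy League, high-GPA group sits slightly above the non-Ivy League, low-GPA group, mirroring the ordering described in the text.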

Let's create a decision tree on the basis of our data to predict what the likelihood of a...